ISBN: 9781424499199 (print)
Successive interference cancellation (SIC) is an effective multipacket reception (MPR) technique for combating interference in wireless networks. To understand the potential advantages of MPR, we study link scheduling in an ad hoc network with SIC at the physical layer. The fact that the links detected sequentially by SIC are correlated at the receiver poses key technical challenges. We characterize this link dependence and propose the simultaneity graph (SG) to capture the effect of SIC. An interference number is then defined to measure the interference of a link. We show that scheduling over the SG is NP-hard and that the maximum interference number bounds the performance of maximal greedy schemes. An independent-set-based greedy scheme is explored to efficiently construct a maximal feasible schedule. Moreover, with careful selection of the link ordering, we present a scheduling scheme that improves the bound. Performance is evaluated through both simulations and testbed measurements. The throughput gain over IEEE 802.11 is 40% on average and up to 120%. The complexity of the SG is comparable to that of a conflict graph, especially when the network is not large.
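To make the idea of an independent-set-based greedy schedule concrete, here is a minimal Python sketch. It is not the paper's algorithm: the abstract does not say how the simultaneity graph is built, so the sketch assumes a plain symmetric conflict graph is already given, and the function name `greedy_schedule` and the fewest-conflicts-first ordering are illustrative assumptions.

```python
from typing import Dict, List, Set


def greedy_schedule(conflicts: Dict[str, Set[str]]) -> List[List[str]]:
    """Split links into time slots, each a maximal independent set.

    `conflicts` maps a link to the links it cannot share a slot with.
    The graph is assumed to be symmetric (hypothetical input format).
    """
    remaining = set(conflicts)
    slots: List[List[str]] = []
    while remaining:
        slot: List[str] = []
        blocked: Set[str] = set()
        # Heuristic ordering: links with the fewest remaining conflicts first,
        # with the link name as a deterministic tie-breaker.
        order = sorted(remaining, key=lambda l: (len(conflicts[l] & remaining), l))
        for link in order:
            if link not in blocked:
                slot.append(link)
                blocked |= conflicts[link]  # neighbours cannot join this slot
        remaining -= set(slot)
        slots.append(slot)
    return slots
```

Every slot is maximal in the sense that no unscheduled link could be added to it without creating a conflict. Changing the sort key changes the link ordering, which is the knob the abstract says affects the achievable bound.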
Strongly promoted by leading industrial companies, cloud computing has become increasingly popular in recent years. Its growth rate surpasses even the most optimistic predictions. A cloud application is a large-scale distributed system that consists of many distributed cloud nodes. How to deploy cloud applications optimally is a challenging research problem. When deploying a cloud application to a cloud environment, cloud node ranking is one of the most important approaches for selecting optimal cloud nodes for the application. Traditional ranking methods usually rank cloud nodes by their QoS values, without considering the communication performance between nodes. However, this node relationship is very important for communication-intensive cloud applications (e.g., Message Passing Interface (MPI) programs), which involve heavy communication between the selected cloud nodes. In this paper, we propose a novel clustering-based method for selecting optimal cloud nodes for deploying communication-intensive applications to the cloud environment. Our method takes into account not only the quality of the cloud nodes but also the communication performance between different nodes. We deploy several well-known MPI programs on a real-world cloud and compare our method with other methods. The experimental results show the effectiveness of our clustering-based method.
The graph isomorphism problem has long been a concern of the mathematics and engineering communities, for two main reasons. First, in theory, the problem is widely regarded as hard: it is in NP, but it is not known to be either NP-complete or solvable in polynomial time. Second, the problem has good application prospects in chemistry, operations research, computer science, electronics, network theory, and many other fields, but the exponential complexity of existing algorithms and their inherent limitations make them difficult to apply to objects with complex graph structures. In this paper, we propose an exact isomorphism algorithm for tree-like graphs based on node deletion, which decides isomorphism quickly. Theoretical analysis and experiments show that the algorithm decides the isomorphism problem for tree-like graphs in polynomial time.
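As an illustration of how node deletion can yield a polynomial-time isomorphism test, the sketch below handles ordinary trees only. It is the classic approach (strip leaves layer by layer to find the center, then compare canonical encodings in the style of Aho, Hopcroft, and Ullman), not the paper's algorithm for tree-like graphs, whose details the abstract does not give. The adjacency-list input format and all function names are assumptions, and the trees are assumed to be non-empty and small enough for Python's recursion limit.

```python
from typing import Dict, Hashable, List

Tree = Dict[Hashable, List[Hashable]]  # undirected adjacency lists


def tree_centers(adj: Tree) -> List[Hashable]:
    """Find the one or two center nodes by repeatedly deleting leaves."""
    degree = {v: len(ns) for v, ns in adj.items()}
    leaves = [v for v, d in degree.items() if d <= 1]
    remaining = len(adj)
    while remaining > 2:
        remaining -= len(leaves)
        next_leaves = []
        for leaf in leaves:
            for n in adj[leaf]:
                degree[n] -= 1
                if degree[n] == 1:  # n has just become a leaf
                    next_leaves.append(n)
        leaves = next_leaves
    return leaves


def encode(adj: Tree, root: Hashable, parent: Hashable = None) -> str:
    """Canonical parenthesis string of the subtree rooted at `root`."""
    codes = sorted(encode(adj, c, root) for c in adj[root] if c != parent)
    return "(" + "".join(codes) + ")"


def trees_isomorphic(a: Tree, b: Tree) -> bool:
    """True if the two unlabeled trees have the same shape."""
    if len(a) != len(b):
        return False
    codes_a = {encode(a, c) for c in tree_centers(a)}
    codes_b = {encode(b, c) for c in tree_centers(b)}
    return bool(codes_a & codes_b)
```

Rooting at the center matters because any isomorphism must map centers to centers, so comparing the center-rooted encodings is enough. Sorting the child encodings makes the result independent of node labels and child order.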
Many toolboxes for accelerating MATLAB with GPUs are now available [1], but users are often unsure which toolbox is best suited to a particular task. Three toolboxes are selected: Jacket, GPUmat, and MATLAB's Parallel Computing Toolbox. For each toolbox, its advantages and pitfalls are reviewed, with the aim of allowing the reader to identify which toolbox is appropriate for a given task. Strategies for deciding whether a function should execute on the GPU are derived from a formula-based analysis. The analysis also serves as a framework for programs that automatically decide which functions are cost-efficient to execute on the GPU. A series of benchmarks covering different types of computation, including data transfer between GPU and CPU, matrix generation, matrix operations, and GPU functions, was run on all three toolboxes. The results show that Jacket performs best. Some advice on improving the performance of the toolboxes is given at the end.
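The abstract does not reproduce the paper's formula, so the following Python sketch only illustrates the general shape such an offload decision can take: run a function on the GPU when the estimated transfer time plus GPU compute time is lower than the estimated CPU compute time. The cost model, the parameter names, and the function names are all assumptions made for illustration.

```python
def cpu_time(work_flops: float, cpu_flops_per_s: float) -> float:
    """Estimated CPU run time in seconds (assumed linear cost model)."""
    return work_flops / cpu_flops_per_s


def gpu_time(work_flops: float, gpu_flops_per_s: float, bytes_moved: float,
             bandwidth_bytes_per_s: float, latency_s: float) -> float:
    """Estimated GPU run time: fixed latency + data transfer + compute."""
    transfer = latency_s + bytes_moved / bandwidth_bytes_per_s
    return transfer + work_flops / gpu_flops_per_s


def should_offload(work_flops: float, bytes_moved: float,
                   cpu_flops_per_s: float, gpu_flops_per_s: float,
                   bandwidth_bytes_per_s: float, latency_s: float) -> bool:
    """Offload only when the GPU estimate beats the CPU estimate."""
    t_cpu = cpu_time(work_flops, cpu_flops_per_s)
    t_gpu = gpu_time(work_flops, gpu_flops_per_s, bytes_moved,
                     bandwidth_bytes_per_s, latency_s)
    return t_gpu < t_cpu
```

Under this model a large matrix operation amortizes the transfer cost and is worth offloading, while a small one is dominated by latency and transfer and should stay on the CPU. In practice the throughput and bandwidth figures would have to be measured on the target machine.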
Multi-core architectures, which have multiple processing units on a single chip, are widely viewed as a way to achieve higher processor performance. Good scheduling of the threads running on these processors leads to higher performance. Modern multi-core systems are designed to allow clusters of cores to share various hardware structures, such as last-level caches, memory controllers, interconnects, and prefetching hardware. Scheduling threads without considering these shared resources can seriously degrade overall system performance. In this paper, we propose a novel thread-scheduling algorithm that takes these potential sources of contention into account and avoids them. Simulation results show that the proposed scheduler avoids much of the contention between threads on various resources, especially shared caches.
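One simple way to account for shared-cache contention when placing threads is sketched below in Python. It is not the algorithm proposed in the paper, which the abstract does not describe. It assumes each thread has a measured cache miss rate as a proxy for memory intensity and spreads the most intensive threads across shared-cache domains with a greedy balancing rule. The function name `place_threads` and its parameters are hypothetical.

```python
from typing import Dict, List


def place_threads(miss_rates: Dict[str, float], num_domains: int,
                  cores_per_domain: int) -> List[List[str]]:
    """Assign threads to shared-cache domains, balancing total miss rate.

    Threads are taken from most to least memory-intensive, and each goes
    to the non-full domain with the lowest accumulated miss rate.
    """
    if len(miss_rates) > num_domains * cores_per_domain:
        raise ValueError("more threads than cores")
    domains: List[List[str]] = [[] for _ in range(num_domains)]
    load = [0.0] * num_domains
    # Sort by descending miss rate; the name breaks ties deterministically.
    for thread in sorted(miss_rates, key=lambda t: (-miss_rates[t], t)):
        free = [d for d in range(num_domains)
                if len(domains[d]) < cores_per_domain]
        target = min(free, key=lambda d: (load[d], d))
        domains[target].append(thread)
        load[target] += miss_rates[thread]
    return domains
```

With two domains of two cores each, two cache-hungry threads end up on different last-level caches and the light threads fill the remaining cores, which is the kind of placement that reduces contention on shared caches.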
ISBN: 9781424472611; 9780769540597 (print)
The networked application environment has motivated the development of multitasking operating systems for sensor networks and other low-power electronic devices, but their multitasking capability is severely limited because traditional stack management techniques perform poorly on small-memory systems. In this paper, we show that combining binary translation and a new kernel runtime can lead to efficient OS designs on resource-constrained platforms. We introduce SenSmart, a multitasking OS for sensor networks, and present new OS design techniques for supporting preemptive multi-task scheduling, memory isolation, and versatile stack management. We have implemented SenSmart on MICA2/MICAz motes. Evaluation shows that SenSmart performs efficient binary translation and demonstrates a significantly better capability in managing concurrent tasks than other sensornet operating systems.
ISBN: 9781605587981 (print)
Recent micro-architectural research has proposed various schemes to enhance processors with additional tags to track various properties of a program. Such a technique, which is usually referred to as information flow tracking, has been widely applied to secure software execution (e.g., taint tracking), protect software privacy and improve performance (e.g., control speculation). In this paper, we propose a novel use of information flow tracking to obfuscate the whole control flow of a program with only modest performance degradation, to defeat malicious code injection, discourage software piracy and impede malware analysis. Specifically, we exploit two common features in information flow tracking: the architectural support for automatic propagation of tags and violation handling of tag misuses. Unlike other schemes that use tags as oracles to catch attacks (e.g., taint tracking) or speculation failures, we use the tags as flow-sensitive predicates to hide normal control flow transfers: the tags are used as predicates for control flow transfers to the violation handler, where the real control flow transfer happens. We have implemented a working prototype based on Itanium processors, by leveraging the hardware support for control speculation. Experimental results show that BOSH can obfuscate the whole control flow with only a mean of 26.7% (ranging from 4% to 59%) overhead on SPECINT2006. The increase in code size and compilation time is also modest. Copyright 2009 ACM.
It is found that stable proton acceleration from a thin foil irradiated by a linearly polarized ultraintense laser can be realized for appropriate foil thickness and laser intensity. A dual-peaked electrostatic field, originating from the oscillating and nonoscillating components of the laser ponderomotive force, is formed around the foil surfaces. This field combines radiation-pressure acceleration and target normal sheath acceleration to produce a single quasimonoenergetic ion bunch. A criterion for this mechanism to be operative is obtained and verified by two-dimensional particle-in-cell simulation. At a laser intensity of ∼5.5×10²² W/cm², quasimonoenergetic GeV proton bunches are obtained with ∼100 MeV energy spread, less than 4° spatial divergence, and ∼50% energy conversion efficiency from the laser.