This paper addresses the issue of fault recovery in transactional memory and proposes a fault recovery method based on parallel recomputing in transactional memory systems. This method utilizes the data versioning m...
ISBN: (print) 9781612842080
Although general-purpose GPUs offer relatively high computing capacity, they also consume more power than general-purpose CPUs. Low-power techniques targeting GPUs will therefore be one of the hottest topics in the future. On the other hand, in several application domains users are unwilling to sacrifice performance to save power. In this paper, we propose an effective kernel fusion method that reduces GPU power consumption without performance loss. Instead of executing multiple kernels serially, the proposed method fuses several kernels into one larger kernel. Because most consecutive kernels in an application have data dependencies and cannot be fused directly, we split each large kernel into multiple slices using strip-mining and then fuse independent sliced kernels into one kernel. Based on the CUDA programming model, we propose three kernel fusion implementations, each targeting a particular case. Based on different strip-mining methods, we also propose two fusion mechanisms, called invariant-slice fusion and variant-slice fusion; the latter adapts better to the requirements of the kernels being fused. The experimental results confirm that the proposed kernel fusion method effectively reduces GPU power consumption.
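The slicing-and-fusing idea can be illustrated with a small host-side simulation. This is only a sketch under stated assumptions, not the authors' CUDA implementation: `kernel_a`, `kernel_b`, `strip_mine`, and `fused_schedule` are hypothetical names, the two kernels are assumed to be element-wise with B consuming A's output, and the slice size is fixed (loosely in the spirit of invariant-slice fusion). Each schedule step groups slice k of A with slice k-1 of B, which touch disjoint data and so could in principle share one fused launch.

```python
def strip_mine(n, slice_size):
    """Split the index range [0, n) into consecutive (start, stop) slices."""
    return [(s, min(s + slice_size, n)) for s in range(0, n, slice_size)]


def kernel_a(x, out, start, stop):
    """Hypothetical producer kernel: out[i] = 2 * x[i] on one slice."""
    for i in range(start, stop):
        out[i] = x[i] * 2.0


def kernel_b(y, out, start, stop):
    """Hypothetical consumer kernel: out[i] = y[i] + 1 on one slice."""
    for i in range(start, stop):
        out[i] = y[i] + 1.0


def fused_schedule(n, slice_size):
    """Group independent slices: step k runs A on slice k and B on slice k-1."""
    slices = strip_mine(n, slice_size)
    steps = []
    for k in range(len(slices) + 1):
        step = []
        if k < len(slices):
            step.append(("A", slices[k]))
        if k >= 1:
            step.append(("B", slices[k - 1]))
        steps.append(step)
    return steps


def run_fused(x, slice_size):
    """Execute the schedule; each step stands in for one fused kernel launch."""
    n = len(x)
    y = [0.0] * n  # intermediate buffer written by A, read by B
    z = [0.0] * n  # final output written by B
    for step in fused_schedule(n, slice_size):
        for name, (start, stop) in step:
            if name == "A":
                kernel_a(x, y, start, stop)
            else:
                kernel_b(y, z, start, stop)
    return z
```

Within a step the two slices have no data dependency, because B only reads the part of the intermediate buffer that A finished in the previous step.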
The development of multi-core processors makes the parallelization of traditional sequential algorithms increasingly important. Meanwhile, transactional memory serves as a good parallel programming model. This paper takes advantage of software transactional memory to parallelize the Multi-Exit Asymmetric Adaboost algorithm for face detection. The parallel version is evaluated on three different implementations of software transactional memory. The experimental results show that the transactional-memory-based parallelization outperforms the traditional lock-based approach. A speedup of nearly seven is achieved on an eight-core machine.
A Cloud may be seen as a type of flexible computing infrastructure consisting of many compute nodes, where resizable computing capacity can be provided to different customers. To fully harness the power of the Cloud, efficient data management is needed to handle huge volumes of data and support a large number of concurrent end users. To achieve that, a scalable and high-throughput indexing scheme is generally required. Such an indexing scheme must support parallel search to improve scalability. In this paper, we present a bitmap-based indexing scheme for efficient data processing in the Cloud. Our approach can be summarized as follows. First, we build a local bitmap index on each compute node that indexes only the data residing on that node. Second, we organize the compute nodes as a structured overlay, and each node maintains a portion of the global index over the entire data set. The global index is also a bitmap index, which indicates the node on which each data item resides. Third, all bitmaps are compressed with run-length coding to reduce storage requirements. We conduct extensive experiments on a LAN, and the results demonstrate that our indexing scheme is dynamic, efficient and scalable.
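The three steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the paper's system: all function names are hypothetical, the structured overlay is replaced by a single in-memory dictionary, and the global index is rebuilt on every search for brevity.

```python
from itertools import groupby


def build_bitmap_index(values):
    """Local index: map each distinct value to a 0/1 bitmap over row positions."""
    index = {}
    for row, value in enumerate(values):
        index.setdefault(value, [0] * len(values))[row] = 1
    return index


def rle_encode(bits):
    """Run-length encode a bitmap as a list of (bit, run_length) pairs."""
    return [(bit, len(list(run))) for bit, run in groupby(bits)]


def rle_decode(runs):
    """Expand (bit, run_length) pairs back into the original bitmap."""
    return [bit for bit, length in runs for _ in range(length)]


def build_global_index(node_data):
    """Global index: map each value to a bitmap over nodes that hold it."""
    nodes = sorted(node_data)
    global_index = {}
    for pos, node in enumerate(nodes):
        for value in set(node_data[node]):
            global_index.setdefault(value, [0] * len(nodes))[pos] = 1
    return nodes, global_index


def search(value, node_data):
    """Use the global bitmap to pick nodes, then each local bitmap to pick rows."""
    nodes, global_index = build_global_index(node_data)
    hits = {}
    for pos, bit in enumerate(global_index.get(value, [])):
        if bit:
            local = build_bitmap_index(node_data[nodes[pos]])
            hits[nodes[pos]] = [row for row, b in enumerate(local[value]) if b]
    return hits
```

Because only nodes whose global bit is set are consulted, the per-node lookups are independent of one another and could run in parallel.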
As the energy consumption of embedded multiprocessor systems becomes increasingly prominent, the real-time energy-efficient scheduling in multiprocessor systems becomes an urgent problem to reduce the system energy co...
Trust systems provide a promising way to build trust relationships among users in distributed and open systems. However, it is difficult to compare different trust systems quantitatively because of their different application settings and the lack of effective measures. This paper constructs a framework for trust systems in terms of linear algebra, which helps us model and implement different systems in a uniform way. In addition, we propose an ordering-based approach to evaluating trust systems and give two relevant ordering-based measures. The experimental results suggest that our method provides an effective way to analyze and evaluate trust systems.
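To make the linear-algebra view concrete, the sketch below treats a trust system as a row-normalized matrix of local trust values and derives global scores by repeated matrix-vector multiplication. The abstract does not specify the framework or its two measures, so everything here is an assumption: the aggregation is a generic power iteration in the style of eigenvector-based reputation schemes, and `pairwise_order_agreement` is a hypothetical ordering-based measure (the fraction of node pairs ranked in the same order as a reference ranking).

```python
def normalize_rows(matrix):
    """Scale each row to sum to 1; an all-zero row becomes uniform."""
    result = []
    for row in matrix:
        total = sum(row)
        if total > 0:
            result.append([v / total for v in row])
        else:
            result.append([1.0 / len(row)] * len(row))
    return result


def global_trust(local, iterations=50):
    """Power iteration t <- C^T t on the row-normalized local trust matrix C."""
    c = normalize_rows(local)
    n = len(c)
    t = [1.0 / n] * n
    for _ in range(iterations):
        t = [sum(c[i][j] * t[i] for i in range(n)) for j in range(n)]
    return t


def pairwise_order_agreement(scores, reference):
    """Fraction of node pairs that scores orders the same way as reference."""
    agree = 0
    total = 0
    n = len(scores)
    for i in range(n):
        for j in range(i + 1, n):
            if reference[i] == reference[j]:
                continue  # ties in the reference carry no ordering information
            total += 1
            if (scores[i] - scores[j]) * (reference[i] - reference[j]) > 0:
                agree += 1
    return agree / total if total else 1.0
```

A measure of this kind compares only the orderings that two systems induce, so it remains usable when their raw score scales differ.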
The Quiet DDoS attack has become one of the most severe threats to network security, because this kind of attack uses entirely legitimate TCP flows while distributing its destination IP addresses to evade the various countermeasures deployed in the network. However, the highly distributed destination IPs are themselves a characteristic of the attack, and we believe this characteristic makes part of the attack flow deviate from the behavior habits of network users. Inspired by this viewpoint, we propose a novel method to counter the Quiet DDoS attack based on the NBHU (network behavior habit of users). Furthermore, we simulate our method on the NS2 platform, and the results show that the method can reduce the effectiveness of the attack.
Cloud Services Delivery Networks (CSDN) construct a layer of distributed server overlay over the Internet, which provides services to end users from the nearest server and on demand. Faced with the scale and diversity of the resource demands of Internet cloud services, CSDN forms a different logical sub-server overlay for each kind of cloud service. However, most server and bandwidth resources of CSDN are used to deliver streaming and downloading cloud services, and the dynamic allocation of their delivery resources is the main focus of this paper. We first model the problem as a multi-dimensional facility location problem, based on two characteristics: memory and bandwidth are the bottleneck resources of this kind of application; and the hot content of this kind of application can be delivered using peer-to-peer mechanisms. After analyzing the model and proving that the problem is NP-complete, we propose a heuristic algorithm. Finally, the effectiveness of the algorithm is comprehensively assessed, using service delivery cost savings as the performance metric and the operation trace of a real system as the input.
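The abstract does not describe the heuristic itself, so the following is only a generic greedy sketch of a two-dimensional (memory and bandwidth) capacitated facility location problem, not the authors' algorithm. The names `greedy_place`, `servers`, and `demands` are hypothetical, and ordering items by the sum of their memory and bandwidth demands is a crude assumption that presumes comparable units.

```python
def greedy_place(servers, demands):
    """Greedily assign content items to servers under memory and bandwidth limits.

    servers: dict name -> {"cost": opening cost, "mem": capacity, "bw": capacity}
    demands: dict item -> (mem_demand, bw_demand)
    Returns (placement dict item -> server, total opening cost).
    """
    remaining = {s: [spec["mem"], spec["bw"]] for s, spec in servers.items()}
    opened = set()
    placement = {}
    # Place the largest items first; ties are broken by item name.
    order = sorted(demands.items(), key=lambda kv: (-(kv[1][0] + kv[1][1]), kv[0]))
    for item, (mem, bw) in order:
        candidates = [
            s for s in servers
            if remaining[s][0] >= mem and remaining[s][1] >= bw
        ]
        if not candidates:
            raise ValueError("no server can host item %r" % item)
        # An already-open server adds no cost; otherwise pay its opening cost.
        best = min(
            candidates,
            key=lambda s: (0.0 if s in opened else servers[s]["cost"], s),
        )
        opened.add(best)
        remaining[best][0] -= mem
        remaining[best][1] -= bw
        placement[item] = best
    total_cost = sum(servers[s]["cost"] for s in opened)
    return placement, total_cost
```

A cost-savings metric like the one mentioned in the abstract could then be computed by comparing `total_cost` against a baseline placement on the same trace.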
This paper presents a method that adapts planning descriptions to exploit semantic information for service composition through the action language C. It shows how service descriptions can be expressed by preconditions and effects, and how the action language C provides a richer syntax and semantics for complex service descriptions. We also present an algorithm for translating semantic Web services described in OWL-S into the action language C. Thanks to the structured description and the expressive power of C, we need only consider the initial situation and the desired goal, ignoring the details of transition and planning. Finally, we use satisfiability planning to solve the planning problem by translating the action language into a disjunctive logic program.
Strongly promoted by leading industrial companies, cloud computing has become increasingly popular in recent years. The growth rate of cloud computing surpasses even the most optimistic predictions. A cloud application is a large-scale distributed system that consists of many distributed cloud nodes. How to deploy cloud applications optimally is a challenging research problem. When deploying a cloud application to the cloud environment, cloud node ranking is one of the most important approaches for selecting optimal cloud nodes for the application. Traditional ranking methods usually rank the cloud nodes based on their QoS values, without considering the communication performance between cloud nodes. However, this kind of node relationship is very important for communication-intensive cloud applications (e.g., Message Passing Interface (MPI) programs), which involve a lot of communication between the selected cloud nodes. In this paper, we propose a novel clustering-based method for selecting optimal cloud nodes for deploying communication-intensive applications to the cloud environment. Our method takes into account not only the quality of the cloud nodes but also the communication performance between different nodes. We deploy several well-known MPI programs on a real-world cloud and compare our method with other methods. The experimental results show the effectiveness of our clustering-based method.
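The contrast between pure QoS ranking and communication-aware selection can be shown with a small sketch. The abstract does not give the clustering algorithm, so this is an assumed stand-in rather than the authors' method: nodes are grouped by single-linkage clustering on a pairwise latency matrix with a hypothetical `threshold`, and the k best-QoS nodes are then taken from the best cluster that is large enough.

```python
def cluster_by_latency(latency, threshold):
    """Single-linkage clusters: nodes joined by any link with latency <= threshold."""
    n = len(latency)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if latency[i][j] <= threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


def select_nodes(latency, qos, k, threshold):
    """Pick k nodes from one well-connected cluster, preferring higher QoS."""
    best = None
    for cluster in cluster_by_latency(latency, threshold):
        if len(cluster) < k:
            continue  # this cluster cannot host the whole application
        chosen = sorted(cluster, key=lambda i: (-qos[i], i))[:k]
        score = sum(qos[i] for i in chosen)
        if best is None or score > best[0]:
            best = (score, sorted(chosen))
    return best[1] if best else None
```

With three required nodes, a QoS-only ranking of the example in the test below would pick nodes spread across two distant groups, whereas the cluster-based choice keeps all three within the low-latency group.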