With the explosive growth in the creation of and demand for information and data, cloud storage systems require efficient storage management. As an important component of storage management, a storage allocation scheme aims to keep redundancy low while achieving high reliability. However, these two goals are hard to reconcile. Considering the practical conditions of cloud systems, we propose a systematic storage allocation scheme that addresses both. We also study the impact of several factors on data reliability.
The development of multi-core processors makes the parallelization of traditional sequential algorithms increasingly important. Meanwhile, transactional memory serves as a good parallel programming model. This paper uses software transactional memory to parallelize the Multi-Exit Asymmetric AdaBoost algorithm for face detection. The parallel version is evaluated on three different implementations of software transactional memory. The experimental results show that the transactional-memory-based parallelization outperforms the traditional lock-based approach. A speedup of nearly seven is achieved on an eight-core machine.
The application of memristors in building hardware neural networks has attracted widespread interest and may bring novel opportunities to neural computing. However, because programming precision is limited, the conductance of a memristor, which represents the stored information, may deviate from its theoretical value and thus introduce errors into the neural computing results. In this paper, we analyze the impact of imprecise programming on hardware neural networks through Monte Carlo simulation of a feedback layer model. The results show that the fault tolerance of the neural network adapts well to these errors, which further demonstrates the potential of building neural networks with memristors.
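The kind of Monte Carlo experiment described above can be sketched as follows. This is a minimal illustration, not the paper's model: it assumes a Hopfield-style feedback network as a stand-in for the "feedback layer model", and it assumes that programming error acts as Gaussian relative noise on each weight. The names `train_hopfield`, `perturb`, `recall`, and `recall_rate`, and the parameters `sigma`, `flips`, and `trials`, are all hypothetical.

```python
import random


def train_hopfield(patterns):
    """Hebbian weights for a list of +/-1 patterns (no self-connections)."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w


def perturb(w, sigma, rng):
    """Assumed error model: each weight gets Gaussian relative noise."""
    return [[wij * (1.0 + rng.gauss(0.0, sigma)) for wij in row] for row in w]


def recall(w, state, steps=5):
    """Asynchronous sign updates, repeated for a fixed number of sweeps."""
    s = list(state)
    n = len(s)
    for _ in range(steps):
        for i in range(n):
            h = sum(w[i][j] * s[j] for j in range(n))
            s[i] = 1 if h >= 0 else -1
    return s


def recall_rate(sigma, trials=200, n=32, flips=3, seed=0):
    """Fraction of trials in which a corrupted probe is restored exactly."""
    rng = random.Random(seed)
    pattern = [rng.choice((-1, 1)) for _ in range(n)]
    w = train_hopfield([pattern])
    successes = 0
    for _ in range(trials):
        noisy_w = perturb(w, sigma, rng)  # a fresh imprecise "programming"
        probe = list(pattern)
        for i in rng.sample(range(n), flips):
            probe[i] = -probe[i]
        if recall(noisy_w, probe) == pattern:
            successes += 1
    return successes / trials
```

Sweeping `sigma` over a range of values and plotting `recall_rate(sigma)` would give a rough picture of how much weight error such a network tolerates under these assumptions.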
Network coding offers a new solution for IP congestion control, since multiple buffered packets can be encoded together and removed from the buffer as a single coded packet. This may significantly decrease packet loss during congestion, but at the cost of building redundant paths. However, minimizing the overhead of redundant paths turns out to be an NP-hard problem. In this paper, we propose a novel approximation algorithm called FlowGrouping, which transforms the problem of building redundant paths into a limited clique partition problem by increasing edge weights, and which can find a good approximate solution within O(n^3) computation time.
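The basic coding step can be illustrated with a short sketch. It assumes bytewise XOR as the coding operation, which is a common choice in network coding but is not confirmed by the abstract, and it does not attempt to reproduce FlowGrouping itself. The function name `xor_encode` is hypothetical.

```python
def xor_encode(a: bytes, b: bytes) -> bytes:
    """Combine two packets into one coded packet by bytewise XOR.

    The shorter packet is zero-padded so both have the same length.
    """
    n = max(len(a), len(b))
    a = a.ljust(n, b"\0")
    b = b.ljust(n, b"\0")
    return bytes(x ^ y for x, y in zip(a, b))


# A congested router sends one coded packet instead of two buffered ones.
p1 = b"flow-1 payload"
p2 = b"flow-2 payload"
coded = xor_encode(p1, p2)

# A receiver that obtains p1 over a redundant path can recover p2,
# because XOR is its own inverse.
recovered = xor_encode(coded, p1)
```

The sketch shows why redundant paths are needed: the coded packet alone is not enough, and the receiver must obtain one of the original packets by another route.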
The Internet is fundamentally changing the model of software development, the demands on software quality, and the process of sharing software resources. Internet-based environments for trustworthy software production are recognized as a key topic of software engineering in both academia and the software industry. This paper introduces the concepts and models of trustworthy software that guide the design of the Trustie environment. Trustie supports the sharing of trustworthy software components through an evolving software repository, and supports collaborative software development on a customizable development platform powered by a software production line framework. Finally, layered research and application practices based on Trustie provide a preliminary demonstration of the effectiveness and promise of this environment.
This paper addresses error detection in transactional memory and proposes a new error detection method based on redundant transactions (EDRT). The method creates a copy of every transaction, executes both the original transactions and their copies on appropriate processor cores, and detects errors by comparing the execution results. EDRT uses the data-versioning mechanism of transactional memory to obtain an approximately minimal set of data to compare for error detection, and it does so transparently and online. Finally, this paper validates EDRT with 5 test programs, including 4 SPLASH-2 benchmarks. The experimental results show that the average error detection cost is about 3.68% relative to the whole program and about 12.07% relative to the transactional parts of the program.
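The compare-before-commit idea can be sketched in a few lines. This is a simplified, single-threaded illustration and not the EDRT implementation: it assumes that a transaction's buffered write set serves as the comparison data, loosely mirroring the role the abstract assigns to data versioning. The names `run_transaction`, `execute_with_detection`, and `redundant_txn` are hypothetical.

```python
def run_transaction(txn, memory):
    """Run txn against memory, buffering its writes instead of applying them."""
    writes = {}

    def read(key):
        # A transaction sees its own buffered writes first.
        return writes[key] if key in writes else memory[key]

    def write(key, value):
        writes[key] = value

    txn(read, write)
    return writes


def execute_with_detection(txn, memory, redundant_txn=None):
    """Execute a transaction and its copy; commit only if write sets match.

    redundant_txn lets a test stand in for a copy hit by a fault;
    by default the copy is the same function as the original.
    """
    original = run_transaction(txn, memory)
    copy = run_transaction(redundant_txn or txn, memory)
    if original != copy:
        return False  # mismatch: error detected, nothing is committed
    memory.update(original)
    return True


def transfer(read, write):
    """Example transaction: move 10 units from account a to account b."""
    write("a", read("a") - 10)
    write("b", read("b") + 10)


def faulty_transfer(read, write):
    """Same transaction with a simulated fault in one computed value."""
    write("a", read("a") - 10)
    write("b", read("b") + 11)
```

In this sketch, only the buffered writes are compared, so values that a transaction reads but never modifies do not add to the comparison cost.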
Many challenges in multi-agent coordination can be modeled as Distributed Constraint Optimization Problems (DCOPs). Targeting DCOPs with low constraint density, this paper proposes a distributed algorithm based on the...
Non-uniform distribution of memory accesses across cache sets has been recognized as one source of inefficiency in cache architectures on single-core platforms, and several schemes target this problem to boost performance. As chip multiprocessors (CMPs) become the mainstream processor design choice, how the non-uniform distribution of memory accesses across cache sets affects cache management on CMPs is an open question. We address this question by presenting several cache management schemes for CMP platforms that aim to balance the memory access distribution across cache sets in shared or private caches. We show that on CMP platforms with multiprogrammed workloads: (a) for shared caches, the non-uniform memory access distribution across cache sets is altered by the fact that multiple applications run concurrently and share the cache capacity, and the scheme we put forward to exploit the non-uniformity on shared caches yields little to no benefit or even degrades performance; (b) for caches organized as private caches, directly adapting a scheme that targets this kind of non-uniformity outperforms the baseline private cache design by 2% on average; (c) however, for a private-cache-based cache management scheme that we propose, further effort to exploit this kind of non-uniformity on top of the proposed scheme also yields little to no benefit. We therefore conclude that on CMP platforms with multiprogrammed workloads, the non-uniform distribution of memory accesses across cache sets is partially circumvented by the interactions between multiple applications. Efforts to exploit the non-uniformity for further benefit may prove futile on CMPs.
This paper presents a method that adapts planning descriptions, through the action language C, to bring semantic information into play for service composition. It shows how service descriptions can be expressed by preconditions and effects, and how the action language C provides a richer syntax and semantics for complex service descriptions. We also present an algorithm for translating semantic Web services described in OWL-S into the action language C. Thanks to the structured descriptions and the expressive power of C, we need to consider only the initial situation and the desired goal, ignoring the details of transitions and planning. Finally, we use satisfiability planning to solve the planning problem by translating the action language into a disjunctive logic program.
ISBN: (print) 9781612842080
Although general-purpose GPUs have relatively high computing capacity, they also consume more power than general-purpose CPUs. Low-power techniques for GPUs will therefore be one of the hottest topics in the future. On the other hand, in several application domains, users are unwilling to sacrifice performance to save power. In this paper, we propose an effective kernel fusion method that reduces GPU power consumption without performance loss. Instead of executing multiple kernels serially, the proposed method fuses several kernels into one larger kernel. Because most consecutive kernels in an application have data dependencies and cannot be fused directly, we split a large kernel into multiple slices with the strip-mining method and then fuse independent sliced kernels into one kernel. Based on the CUDA programming model, we propose three different kernel fusion implementations, each targeting a special case. Based on different strip-mining methods, we also propose two fusion mechanisms, called invariant-slice fusion and variant-slice fusion. The latter adapts better to the requirements of the kernels to be fused. The experimental results confirm that the proposed kernel fusion method can effectively reduce GPU power consumption.
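The slicing-then-fusing idea can be illustrated with a small CPU-side sketch in Python rather than CUDA. It rests on two assumptions that go beyond the abstract: the second kernel depends on the first only element-wise, and "independent sliced kernels" means slice k of the second kernel paired with slice k+1 of the first. The names `kernel_a`, `kernel_b`, `strip_mine`, `run_serial`, and `run_fused` are hypothetical, and the sketch models only the execution order, not power consumption.

```python
def kernel_a(x):
    """Stand-in for a first kernel: element-wise doubling."""
    return [v * 2 for v in x]


def kernel_b(y):
    """Stand-in for a dependent second kernel: element-wise increment."""
    return [v + 1 for v in y]


def strip_mine(n, slice_size):
    """Split the index range [0, n) into consecutive (start, end) slices."""
    return [(s, min(s + slice_size, n)) for s in range(0, n, slice_size)]


def run_serial(data):
    """Baseline: kernel B runs only after kernel A has finished entirely."""
    return kernel_b(kernel_a(data))


def run_fused(data, slice_size):
    """Each fused launch runs B on slice k-1 and A on slice k.

    The two halves of a launch touch different slices, so they do not
    depend on each other and could share a single kernel.
    """
    slices = strip_mine(len(data), slice_size)
    mid = [None] * len(data)  # output of A, input of B
    out = [None] * len(data)
    launches = 0
    for step in range(len(slices) + 1):
        if step >= 1:
            lo, hi = slices[step - 1]
            out[lo:hi] = kernel_b(mid[lo:hi])
        if step < len(slices):
            lo, hi = slices[step]
            mid[lo:hi] = kernel_a(data[lo:hi])
        launches += 1
    return out, launches
```

With 10 elements and a slice size of 4, the fused schedule uses one launch per slice plus one to drain the last slice, and produces the same result as the serial baseline.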