The application of memristors to building hardware neural networks has attracted widespread interest and may bring novel opportunities to neural computing. However, because programming precision is limited, the conductance of a memristor, which represents the stored information, may deviate from its theoretical value and thus introduce errors into the neural computing results. In this paper, we analyze the impact of imprecise programming on hardware neural networks through Monte Carlo simulation of a feedback layer model. The results show that the fault tolerance of the neural network adapts well to these errors, which further demonstrates the potential of building neural networks with memristors.
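The kind of Monte Carlo experiment described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's setup: it assumes the feedback layer behaves like a Hopfield-style network with Hebbian weights, and that each programmed conductance deviates by an independent Gaussian relative error of standard deviation `sigma`. The function names, the 16-neuron pattern, and the trial counts are all hypothetical.

```python
import random

def hebbian_weights(patterns):
    """Ideal (error-free) weights storing the given +1/-1 patterns."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w

def perturb(w, sigma, rng):
    # Assumption: every conductance gets an independent Gaussian
    # relative programming error with standard deviation sigma.
    return [[x * (1.0 + rng.gauss(0.0, sigma)) for x in row] for row in w]

def recall(w, state):
    """One synchronous update of the feedback layer."""
    return [1 if sum(wij * s for wij, s in zip(row, state)) >= 0 else -1
            for row in w]

def recall_rate(pattern, sigma, flips=2, trials=200, seed=0):
    """Fraction of trials in which a corrupted probe is restored exactly."""
    rng = random.Random(seed)
    ideal = hebbian_weights([pattern])
    n = len(pattern)
    hits = 0
    for _ in range(trials):
        noisy_w = perturb(ideal, sigma, rng)   # one imprecisely programmed chip
        probe = list(pattern)
        for i in rng.sample(range(n), flips):  # corrupt a few input bits
            probe[i] = -probe[i]
        hits += recall(noisy_w, probe) == list(pattern)
    return hits / trials

PATTERN = [1, -1] * 8  # hypothetical 16-neuron stored pattern
```

Sweeping `sigma` (for example `recall_rate(PATTERN, s)` for several values of `s`) gives a rough picture of how recall accuracy degrades as programming precision worsens.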
Network coding offers a new solution for IP congestion control, since several buffered packets can be encoded together and removed as a single coded packet. This may significantly decrease packet loss during congestion, but at the cost of building redundant paths. Minimizing the overhead of these redundant paths, however, turns out to be an NP-hard problem. In this paper, we propose a novel approximation algorithm called FlowGrouping, which transforms the redundant-path building problem into a limited clique partition problem by increasing edge weights and finds a good approximate solution within O(n^3) computation time.
The Internet has fundamentally changed the model of software development, the demands on software quality, and the process of software resource sharing. An Internet-based environment for trustworthy software production is recognized as a key topic of software engineering in both academia and the software industry. This paper introduces the concepts and models of trustworthy software that guide the design of the Trustie environment. Trustie supports the sharing of trustworthy software components through an evolving software repository and supports collaborative software development on a customizable development platform powered by a software production line framework. Finally, layered practices of research and application based on Trustie preliminarily demonstrate the effectiveness and the promise of this environment.
This paper addresses error detection in transactional memory and proposes a new method of error detection based on redundant transactions (EDRT). The method creates a copy of every transaction, executes both the original transactions and the copies on suitable processor cores, and detects errors by comparing the execution results. EDRT uses the data-versioning mechanism of transactional memory to obtain an approximately minimal set of data to compare for error detection, and it does so transparently and online. Finally, the paper validates EDRT with 5 test programs, including 4 SPLASH-2 benchmarks. The experimental results show that the average error-detection cost is about 3.68% relative to the whole program and about 12.07% relative to the transactional parts of the program.
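The core idea (run each transaction twice and compare only the data the versioning mechanism already tracks) can be sketched in a few lines. This is a simplified, single-threaded illustration under stated assumptions, not the EDRT implementation: the buffered write set stands in for the data-versioning mechanism, and `execute`, `run_redundant`, `inject_fault`, and `transfer` are hypothetical names.

```python
def execute(txn, snapshot):
    """Run a transaction against a snapshot, buffering writes (lazy versioning)."""
    writes = {}

    def read(key):
        # Reads see the transaction's own buffered writes first.
        return writes.get(key, snapshot[key])

    def write(key, value):
        writes[key] = value

    txn(read, write)
    return writes

def run_redundant(txn, memory, inject_fault=None):
    """Execute txn and a redundant copy; commit only if their write sets agree."""
    original = execute(txn, memory)
    replica = execute(txn, memory)  # would run on another core in a real system
    if inject_fault is not None:
        inject_fault(replica)       # test hook: simulate a transient hardware error
    if original != replica:
        return False                # mismatch detected, nothing is committed
    memory.update(original)         # commit the buffered writes
    return True

def transfer(read, write):
    """Example transaction: move 10 units from account a to account b."""
    write("a", read("a") - 10)
    write("b", read("b") + 10)
```

Comparing only the write sets, rather than all of memory, is what keeps the comparison data set small in this sketch; how EDRT derives its approximately minimal comparison set is not specified in the abstract.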
Many challenges in multi-agent coordination can be modeled as distributed Constraint Optimization Problems (DCOPs). Aiming at DCOPs with low constraint density, this paper proposes a distributed algorithm based on the...
Non-uniform distribution of memory accesses across cache sets has been recognized as one source of inefficiency in cache architectures on single-core platforms, and several schemes target the problem to boost performance. As chip multiprocessors (CMPs) become the mainstream processor design choice, how this non-uniform distribution affects cache management on CMPs is an open question. We address the question by presenting several cache management schemes for CMP platforms that aim to balance the memory access distribution across cache sets in shared or private caches. We show that on CMP platforms with multiprogrammed workloads: (a) for shared caches, the non-uniform memory access distribution across cache sets is biased by the fact that multiple applications run concurrently and share the cache capacity, and the scheme we put forward to exploit the non-uniformity on shared caches yields little to no benefit or even degrades performance; (b) for caches organized as private caches, direct adaptation of a scheme that targets this kind of non-uniformity outperforms the baseline private cache design by 2% on average; (c) however, for a private-cache-based cache management scheme that we propose, further effort to exploit this non-uniformity on top of our scheme also yields little to no benefit. We therefore conclude that on CMP platforms with multiprogrammed workloads, the non-uniform distribution of memory accesses across cache sets is partially circumvented by the interactions between multiple applications. Efforts to exploit the non-uniformity for further benefit may be in vain on CMPs.
This paper presents a method that adapts planning descriptions to bring semantic information into play for service composition through the action language C. It shows how service descriptions can be expressed by preconditions and effects, and how the action language C provides a richer syntax and semantics for complex service descriptions. We also present an algorithm for translating semantic Web services described in OWL-S into the action language C. Thanks to the structured description and expressive power of C, we need to consider only the initial situation and the desired goal, ignoring the details of transitions and planning. Finally, we use satisfiability planning to solve the planning problem by translating the action language into a disjunctive logic program.
ISBN: (print) 9781612842080
Although general-purpose GPUs have relatively high computing capacity, they also consume more power than general-purpose CPUs. Low-power techniques targeted at GPUs will therefore be one of the hottest topics in the future. On the other hand, in several application domains users are unwilling to sacrifice performance to save power. In this paper, we propose an effective kernel fusion method that reduces GPU power consumption without performance loss. Instead of executing multiple kernels serially, the proposed method fuses several kernels into one larger kernel. Because most consecutive kernels in an application have data dependencies and cannot be fused directly, we split a large kernel into multiple slices with the strip-mining method and then fuse independent sliced kernels into one kernel. Based on the CUDA programming model, we propose three kernel fusion implementations, each targeting a special case. Based on different strip-mining methods, we also propose two fusion mechanisms, called invariant-slice fusion and variant-slice fusion. The latter adapts better to the requirements of the kernels to be fused. The experimental results confirm that the proposed kernel fusion method effectively reduces GPU power consumption.
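The interplay of strip-mining and fusion can be illustrated with a small host-side model. This is a sketch of the general idea only, written in Python rather than CUDA, and it is not one of the paper's three implementations: it assumes two elementwise kernels where `kernel_b` reads what `kernel_a` wrote at the same index, and it pairs slice `s` of A with slice `s - 1` of B so that each fused step contains only independent work. All names and the fixed slice size are hypothetical.

```python
def kernel_a(x, y, lo, hi):
    """Producer kernel restricted to the index slice [lo, hi)."""
    for i in range(lo, hi):
        y[i] = x[i] * 2

def kernel_b(y, z, lo, hi):
    """Consumer kernel; z[i] depends on y[i] written by kernel_a."""
    for i in range(lo, hi):
        z[i] = y[i] + 1

def run_serial(x):
    """Baseline: two full kernel launches, one after the other."""
    n = len(x)
    y, z = [0] * n, [0] * n
    kernel_a(x, y, 0, n)
    kernel_b(y, z, 0, n)
    return z

def fused_step(x, y, z, a_slice, b_slice):
    # Stands in for ONE fused kernel launch. The two slices touch disjoint
    # indices, so on a GPU they could run concurrently within that launch.
    if a_slice is not None:
        kernel_a(x, y, *a_slice)
    if b_slice is not None:
        kernel_b(y, z, *b_slice)

def run_fused(x, slice_size):
    """Strip-mine both kernels and fuse slice s of A with slice s-1 of B."""
    n = len(x)
    y, z = [0] * n, [0] * n
    slices = [(lo, min(lo + slice_size, n)) for lo in range(0, n, slice_size)]
    for step in range(len(slices) + 1):
        a_slice = slices[step] if step < len(slices) else None
        b_slice = slices[step - 1] if step >= 1 else None
        fused_step(x, y, z, a_slice, b_slice)
    return z
```

For any input, `run_fused(x, k)` should produce the same result as `run_serial(x)`; only the grouping of work into launches changes.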
This paper addresses the issue of fault recovery in transactional memory, and proposes a method of fault recovery based on parallel recomputing in transactional memory system. This method utilizes the dataversioning m...
ISBN: (print) 9781612842325
We consider greedy scheduling based on the physical model in wireless networks with successive interference cancellation (SIC). A scheduling scheme has two major stages: link selection (deciding which link is scheduled next) and time slot selection (deciding which slot is allocated to a given link). Most available schemes adopt a first-fit policy in the latter stage and strive for good performance by carefully ordering links with respect to interference. Because of the accumulation effect and the sequential detection nature of SIC, however, it is difficult to evaluate the interference of a link, so many existing scheduling schemes become less efficient. In this paper, we take a new look at the problem and focus on the time slot selection stage. We define the tolerance margin to measure the saturation of a link set and present two heuristic policies: one schedules a link to the slot for which the resulting set of links has the maximum tolerance margin; the other chooses the slot for which the increase in tolerance margin is minimum. Simulation results show that the proposed schemes outperform the first-fit policy and come close to the optimal solution.
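The difference between first-fit and margin-based slot selection can be sketched as below. The abstract does not define the tolerance margin, so this sketch takes it as a caller-supplied function and uses a toy stand-in (a fixed per-slot budget minus the summed link loads) purely for illustration; it is not the SIC physical model. Only first-fit and the maximum-margin heuristic are shown, and every name here is hypothetical.

```python
def schedule(links, margin, policy="first_fit"):
    """Greedily assign links (in the given order) to time slots.

    margin(slot_links) must return a number that is negative when the
    link set is infeasible. A new slot is opened only when no existing
    slot can accept the link.
    """
    slots = []
    for link in links:
        candidates = []
        for idx, slot in enumerate(slots):
            m = margin(slot + [link])
            if m >= 0:
                candidates.append((m, idx))
        if not candidates:
            slots.append([link])
        elif policy == "first_fit":
            slots[candidates[0][1]].append(link)  # earliest feasible slot
        elif policy == "max_margin":
            # Largest resulting margin; ties go to the earlier slot.
            best = max(candidates, key=lambda c: (c[0], -c[1]))
            slots[best[1]].append(link)
        else:
            raise ValueError(f"unknown policy: {policy}")
    return slots

# Toy stand-in for the tolerance margin (an assumption, not the paper's
# definition): each slot tolerates a total load of 10.
LOAD = {"a": 6, "b": 5, "c": 4, "d": 3}

def toy_margin(slot_links):
    return 10 - sum(LOAD[link] for link in slot_links)
```

With the toy margin and the link order `a, b, c, d`, first-fit places `c` in the first slot that still fits it, while the maximum-margin policy places it in the slot left with the most slack, so the two policies group the links differently.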