This paper addresses fault recovery in transactional memory and proposes a recovery method based on parallel recomputing. The method uses the data-versioning mechanism of the transactional memory system to avoid the extra cost of state saving, rolls back only a single transaction to avoid wasting the computing time of fault-free transactions, and adopts parallel recomputing to reduce the cost of fault recovery. The paper applies this method to OpenTM programs and proposes an implementation of parallel recomputing in OpenTM. Finally, the paper evaluates the performance of the method with a test program. The experimental results show that, compared with fault recovery by rolling back a single transaction, the parallel recomputing method performs fault recovery quickly and accurately and scales well.
Preserving a consistent message delivery order at each site is an important issue in Distributed Virtual Environments (DVEs). Currently, violations of message delivery order tend to happen in th...
Due to the large message transmission latency in Distributed Virtual Environments (DVEs) on Wide Area Networks (WANs), the effectiveness of causality consistency control of message ordering is determined by not only caus...
The application of memristors in building hardware neural networks has attracted widespread interest and may bring novel opportunities to neural computing. However, because programming precision is limited, the conductance of a memristor, which represents the stored information, may deviate from its theoretical value and thus introduce errors into the neural computing results. In this paper, we analyze the impact of imprecise programming on hardware neural networks through Monte Carlo simulation of a feedback layer model. The results show that the fault tolerance of the neural network adapts well to these errors, which further supports the potential of building neural networks with memristors.
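The kind of Monte Carlo experiment described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the feedback layer model, so a small Hopfield-style network with Hebbian weights stands in for it, and the multiplicative Gaussian error model, the function names, and the example patterns are all hypothetical.

```python
import random


def hebbian_weights(patterns):
    """Ideal weight matrix (Hebbian rule, zero diagonal) for +1/-1 patterns."""
    n = len(patterns[0])
    return [
        [0.0 if i == j else float(sum(p[i] * p[j] for p in patterns)) for j in range(n)]
        for i in range(n)
    ]


def program_with_error(weights, sigma, rng):
    """Assumed error model: each programmed weight becomes w * (1 + N(0, sigma))."""
    return [[w * (1.0 + rng.gauss(0.0, sigma)) for w in row] for row in weights]


def recall(weights, state):
    """One synchronous update of the feedback layer."""
    return [1 if sum(w * s for w, s in zip(row, state)) >= 0 else -1 for row in weights]


def recall_rate(patterns, sigma, trials, seed=0):
    """Fraction of trials in which every stored pattern is still a fixed point."""
    rng = random.Random(seed)
    ideal = hebbian_weights(patterns)
    successes = 0
    for _ in range(trials):
        noisy = program_with_error(ideal, sigma, rng)
        if all(recall(noisy, list(p)) == list(p) for p in patterns):
            successes += 1
    return successes / trials


# Two orthogonal 16-element example patterns (hypothetical test data).
PATTERNS = [[1] * 8 + [-1] * 8, [1, -1] * 8]

if __name__ == "__main__":
    for sigma in (0.0, 0.1, 0.5, 1.0):
        print(sigma, recall_rate(PATTERNS, sigma, trials=200))
```

Sweeping `sigma` gives a rough picture of how much programming error such a network tolerates before stored patterns stop being recalled.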
This paper addresses error detection in transactional memory and proposes a new method of error detection based on redundant transactions (EDRT). The method creates a copy of every transaction, executes both the original transactions and the copies on suitable processor cores, and detects errors by comparing the execution results. EDRT uses the data-versioning mechanism of transactional memory to obtain an approximately minimal set of data to compare for error detection, and it obtains this set transparently and online. Finally, the paper validates EDRT with five test programs, including four SPLASH-2 benchmarks. The experimental results show that the average error detection cost is about 3.68% relative to the whole program and about 12.07% relative to the transactional parts of the program.
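The idea of comparing an original transaction against a redundant copy can be illustrated with a minimal sketch. It is not the EDRT implementation: the two executions run sequentially here rather than on separate cores, the write buffer is a plain dictionary standing in for the data-versioning mechanism, and every name (`run_transaction`, `execute_with_detection`, `ErrorDetected`) is hypothetical.

```python
class ErrorDetected(RuntimeError):
    """Raised when the original and redundant executions disagree."""


def run_transaction(body, memory):
    """Run `body` against a private write buffer and return its write set.

    The buffer plays the role of lazy data versioning: shared memory is not
    modified until the caller decides to commit.
    """
    writes = {}

    def read(key):
        return writes.get(key, memory[key])

    def write(key, value):
        writes[key] = value

    body(read, write)
    return writes


def execute_with_detection(body, memory, copy_body=None):
    """Execute a transaction and a redundant copy, compare write sets, commit.

    `copy_body` exists only so a fault can be injected in tests; normally the
    copy is the same code as the original.
    """
    original = run_transaction(body, memory)
    duplicate = run_transaction(copy_body or body, memory)
    if original != duplicate:
        raise ErrorDetected(f"write sets differ: {original} vs {duplicate}")
    memory.update(original)  # commit only after the comparison succeeds
    return original


if __name__ == "__main__":
    accounts = {"a": 100, "b": 0}

    def transfer(read, write):
        write("a", read("a") - 30)
        write("b", read("b") + 30)

    execute_with_detection(transfer, accounts)
    print(accounts)
```

Only the write set is compared, which mirrors the abstract's point that the versioning mechanism already identifies a small set of data worth checking.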
The Internet fundamentally changes the model of software development, the demands on software quality, and the process of sharing software resources. An Internet-based environment for trustworthy software production is recognized as a key software engineering topic in both academia and the software industry. This paper introduces the concepts and models of trustworthy software that guide the design of the Trustie environment. Trustie supports the sharing of trustworthy software components through an evolving software repository, and supports collaborative software development on a customizable development platform powered by a software production line framework. Finally, layered practices of research and application based on Trustie give a preliminary demonstration of the environment's effectiveness and promise.
Non-uniform distribution of memory accesses across cache sets has been recognized as one source of inefficiency in cache architectures on single-core platforms, and several schemes target the problem to boost performance. As chip multiprocessors (CMPs) become the mainstream processor design choice, how this non-uniformity affects cache management on CMPs is an open question. We address the question by presenting several cache management schemes for CMP platforms that aim to balance the memory access distribution across cache sets in shared or private caches. We show that on CMP platforms with multiprogrammed workloads: (a) for shared caches, the non-uniform memory access distribution across cache sets is biased by the fact that multiple applications run concurrently and share the cache capacity, and the scheme we put forward to exploit the non-uniformity on shared caches yields little to no benefit or even degrades performance; (b) for caches organized as private caches, directly adapting a scheme that targets this kind of non-uniformity outperforms the baseline private cache design by 2% on average; (c) however, for a private-cache-based management scheme that we proposed, further effort to exploit this non-uniformity on top of that scheme also yields little to no benefit. We therefore conclude that on CMP platforms with multiprogrammed workloads, the non-uniform distribution of memory accesses across cache sets is partially circumvented by the interactions between multiple applications. Efforts to exploit the non-uniformity for further benefit may prove futile on CMPs.
Reliability is an extremely serious issue for Exascale systems. Traditional passive fault-tolerance methods, such as rollback-recovery, can no longer fully guarantee system reliability because of their large execution overhead and long recovery time. Active fault tolerance is expected to become another important fault-tolerance approach for Exascale systems. Focusing on system failure prediction, a key step of active fault tolerance, we construct an online failure prediction model and study effective methods for preprocessing system status data. To improve on the accuracy and timeliness of current methods, the proposed Improved Adaptive Semantic Filter (IASF) method processes the latest system logs at regular intervals and filters out useless information according to its semantics. Adopting the main idea of the Vector Space Model (VSM), IASF creates an Event Vector for each log record. It evaluates the semantic correlation between log records by calculating the cosine of the angle between their vectors, and then applies temporal and spatial redundancy filtering that takes the bursty nature of log records into account. IASF is insensitive to the type of system log and does not rely on any expert system or domain knowledge. The experimental results show that the system can eliminate about 99.6% of useless log records after applying IASF.
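The cosine-based comparison of log records can be sketched as below. This is a simplified illustration, not IASF itself: the whitespace tokenization, the term-frequency vectors, the 0.8 similarity threshold, the 5-second window, and all function names are assumptions, and only the temporal half of the filtering is shown.

```python
import math
from collections import Counter


def event_vector(message):
    """Term-frequency vector for one log message (assumed: whitespace tokens)."""
    return Counter(message.lower().split())


def cosine(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def temporal_filter(records, window=5.0, threshold=0.8):
    """Drop records that repeat a recently kept record.

    `records` is a list of (timestamp_seconds, message) sorted by time.
    A record is treated as redundant when a kept record within `window`
    seconds has cosine similarity of at least `threshold`.
    """
    kept = []  # (timestamp, message, vector)
    for ts, msg in records:
        vec = event_vector(msg)
        redundant = any(
            ts - kept_ts <= window and cosine(vec, kept_vec) >= threshold
            for kept_ts, _, kept_vec in kept
        )
        if not redundant:
            kept.append((ts, msg, vec))
    return [(ts, msg) for ts, msg, _ in kept]


if __name__ == "__main__":
    log = [
        (0.0, "node 3 memory error detected"),
        (1.0, "node 3 memory error detected"),
        (2.0, "fan speed low on rack 7"),
        (20.0, "node 3 memory error detected"),
    ]
    for record in temporal_filter(log):
        print(record)
```

A spatial filter would apply the same similarity test across records from different nodes instead of across time.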
Clouds need a rapid and elastic resource supply capability because end-user resource demand fluctuates. Multi-scale elastic resource binding is an important way to give cloud services such capability. The most challenging problem in multi-scale elastic resource binding is predicting the dynamic resource demand of end users and then deciding, based on the prediction, when and to what extent multi-scale resources need elastic binding. In this paper, we present a prediction model based on a Radial Basis Function (RBF) network that predicts end-user resource demand in advance. Compared with current prediction methods, it offers faster prediction and higher accuracy. We use trace data (the bandwidth demand of Web-type cloud services) collected from a real-world cloud provider, ChinaCache, as the training and testing data sets to validate the method. Finally, we evaluate the predicted results using common prediction accuracy metrics. The results show that the RBF-network-based prediction model can resolve the decision problem in multi-scale elastic resource binding.
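A minimal sketch of RBF-network prediction on a demand series follows. It makes several assumptions that the abstract does not state: inputs are sliding windows of past demand, every training window serves as a Gaussian center, the output weights come from solving the resulting linear system directly, and the width, ridge term, and sample series are invented for illustration.

```python
import math


def gaussian(x, center, width):
    """Gaussian radial basis function of the distance between x and center."""
    dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-dist_sq / (2.0 * width ** 2))


def solve(matrix, rhs):
    """Solve matrix * w = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * w[k] for k in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return w


def windows(series, lag):
    """Turn a series into (past `lag` values, next value) training pairs."""
    inputs = [series[i:i + lag] for i in range(len(series) - lag)]
    targets = [series[i + lag] for i in range(len(series) - lag)]
    return inputs, targets


def fit_rbf(inputs, targets, width=1.0, ridge=1e-8):
    """Fit output weights with one Gaussian unit per training input.

    `ridge` is a small diagonal term that keeps the system well conditioned.
    Returns a function that maps an input window to a predicted value.
    """
    phi = [
        [gaussian(x, c, width) + (ridge if i == j else 0.0) for j, c in enumerate(inputs)]
        for i, x in enumerate(inputs)
    ]
    weights = solve(phi, targets)

    def predict(x):
        return sum(w * gaussian(x, c, width) for w, c in zip(weights, inputs))

    return predict


if __name__ == "__main__":
    # Hypothetical bandwidth-demand samples, not data from the paper.
    demand = [10.0, 12.0, 15.0, 13.0, 11.0, 14.0, 18.0, 16.0, 12.0, 15.0]
    xs, ys = windows(demand, lag=3)
    model = fit_rbf(xs, ys, width=3.0)
    print(model(demand[-3:]))
```

Using every training window as a center makes the network interpolate its training data; a practical model would more likely pick fewer centers (for example by clustering) and validate on held-out traces.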