We present a demand-driven memory leak detection algorithm based on flow- and context-sensitive pointer analysis. The detection algorithm first assumes the presence of a memory leak at some program point and then runs a backward analysis to see whether this assumption can be disproved. Our algorithm computes the memory abstraction of programs from the points-to graph produced by flow- and context-sensitive pointer analysis. We have implemented the algorithm in the SUIF2 compiler infrastructure and used the implementation to analyze a set of C benchmark programs. The experimental results show that the approach achieves better precision with satisfactory scalability, as expected.
The Internet-based virtual computing environment (iVCE) has been proposed to combine data centers and other kinds of computing resources on the Internet to provide efficient and economical services. Virtual machines (VMs) have been widely used in iVCE to isolate different users and jobs and to ensure trustworthiness, but VMs traditionally take a long time to boot, which cannot meet the requirements of iVCE's large-scale and highly dynamic applications. To address this problem, in this paper we design and implement VirtMan, a fast booting system for a large number of virtual machines in iVCE. VirtMan uses the Linux Small Computer System Interface (SCSI) target to remotely mount the source image in a scalable hierarchy, and leverages the homogeneity of a set of VMs to transfer only the necessary image data at runtime. We have implemented VirtMan both as a standalone system and for OpenStack. In our 100-server testbed, VirtMan boots 1000 VMs (with a 15 GB image of Windows Server 2008) on 100 physical servers in less than 120 s, which is three orders of magnitude lower than in current public clouds.
The widening gap between processor and memory speeds makes the cache an important issue in computer system design. Compared with the working sets of programs, cache resources are often scarce, so it is very important for a computer system to use the cache efficiently. For DOOC (Data-Object Oriented Cache), a recently proposed dynamically reconfigurable cache, this paper proposes a quantitative framework for analyzing the cache requirements of data-objects, which include cache capacity, block size, associativity, and coherence protocol. A graph coloring algorithm that handles the competition between data-objects in the DOOC is proposed as well. Finally, we apply our approaches to the compiler management of DOOC and test them on both a single-core platform and a four-core platform. Compared with traditional caches, the DOOC achieves average miss rate reductions of 44.98% and 49.69% on the two platforms, respectively, and its performance is very close to that of the ideal optimal cache.
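The graph coloring idea mentioned above can be illustrated with a generic greedy coloring. This is only a minimal sketch and not the algorithm from the paper: the conflict graph, the object names, and the interpretation of a color as a cache partition are all assumptions made for illustration.

```python
def greedy_coloring(conflicts):
    """Assign each data-object a color (hypothetical cache partition index).

    conflicts: dict mapping an object name to the set of objects it competes
    with. Objects that compete never receive the same color.
    """
    colors = {}
    # Visit objects with the most conflicts first (a common heuristic);
    # ties are broken by name so the result is deterministic.
    for node in sorted(conflicts, key=lambda n: (-len(conflicts[n]), n)):
        used = {colors[m] for m in conflicts[node] if m in colors}
        color = 0
        while color in used:
            color += 1
        colors[node] = color
    return colors


# Hypothetical conflict graph: A, B and C compete with each other,
# while D only competes with C.
conflicts = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
partition = greedy_coloring(conflicts)
```

Greedy coloring does not guarantee the minimum number of colors in general; it is used here only because it is short and easy to follow.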
Developing parallel applications on heterogeneous processors faces the 'memory wall' challenge, caused by the limited capacity of local storage, limited bandwidth, and long memory access latency. To address this problem, a parallelization approach with six memory optimization schemes for CG is proposed; four of the schemes target all kinds of sparse matrix-vector multiplication (SPMV) operations. On an IBM QS20, the approach reaches speedups of up to 21 and 133 times for sizes A and B, respectively, compared with a single Power Processor Element. The conclusions are that the peak memory access bandwidth of the Cell BE can be reached in SPMV, that simple computation is more efficient on heterogeneous processors, and that loop unrolling can hide local storage access latency when executing scalar operations on SIMD cores.
The pull-based development model, widely used by distributed software teams in open source communities, can efficiently gather the wisdom of crowds. Instead of sharing access to a central repository, contributors create a fork, update it locally, and request to have their changes merged back, i.e., submit a pull-request. On the one hand, this model lowers the barrier to entry for potential contributors, since anyone can submit pull-requests to any repository; on the other hand, it increases the burden on integrators, who are responsible for assessing the proposed patches and integrating the suitable changes into the central repository. The role of integrators in pull-based development is crucial. They must not only ensure that pull-requests meet the project’s quality standards before being accepted, but also finish the evaluations in a timely manner. To keep up with the volume of incoming pull-requests, continuous integration (CI) is widely adopted to automatically build and test every pull-request at the time of submission. CI provides extra evidence about the quality of pull-requests, which helps integrators make the final decision (i.e., accept or reject). In this paper, we present a quantitative study that tries to discover which factors affect the pull-based development process, including acceptance and latency, in the context of CI. Using regression modeling on data extracted from a sample of GitHub projects deploying the Travis-CI service, we find that the evaluation process is a complex issue that requires many independent variables to explain adequately. In particular, CI is a dominant factor in the process: it not only has a great influence on the evaluation process per se, but also changes the effects of some traditional predictors.
It is widely believed that Shor's factoring algorithm provides a driving force to boost the quantum computing ***, a serious obstacle to its binary implementation is the large number of quantum gates. Non-binary quantum computing is an efficient way to reduce the required number of elemental gates. Here, we propose optimization schemes for implementing Shor's algorithm and take a ternary version for factorizing 21 as an example. The optimized factorization is achieved by a two-qutrit quantum circuit, which consists of only two single-qutrit gates and one ternary controlled-NOT gate. This two-qutrit quantum circuit is then encoded into the nine lowest vibrational states of an ion trapped in a weakly anharmonic potential. Optimal control theory (OCT) is employed to derive the manipulation electric field for transferring the encoded states. The ternary Shor's algorithm can be implemented in one single step. Numerical simulation results show that the accuracy of the state transformations is about 0.9919.
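For readers unfamiliar with how Shor's algorithm yields factors, the classical part of the procedure for 21 can be sketched as follows. This is a minimal illustration, not the paper's ternary circuit: the order finding that a quantum computer would perform is replaced by brute force, and the base a = 2 is an assumption chosen for the example.

```python
from math import gcd


def multiplicative_order(a: int, n: int) -> int:
    """Smallest r > 0 with a**r % n == 1; a and n must be coprime, n >= 2."""
    if n < 2 or gcd(a, n) != 1:
        raise ValueError("need n >= 2 and gcd(a, n) == 1")
    r, value = 1, a % n
    while value != 1:
        value = (value * a) % n
        r += 1
    return r


def factors_from_order(a: int, n: int):
    """Classical post-processing step of Shor's algorithm.

    Returns a sorted pair of divisors of n, or None when the chosen base
    is unsuitable (odd order, or a**(r/2) congruent to -1 mod n).
    """
    r = multiplicative_order(a, n)  # brute force stands in for the quantum step
    if r % 2 == 1:
        return None
    half = pow(a, r // 2, n)
    if half == n - 1:
        return None
    return tuple(sorted((gcd(half - 1, n), gcd(half + 1, n))))


# Assumed base a = 2: its order modulo 21 is 6, and 2**3 = 8 gives
# gcd(7, 21) = 7 and gcd(9, 21) = 3.
print(factors_from_order(2, 21))  # prints (3, 7)
```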
As a solution to growing global wire delay, non-uniform cache architecture (NUCA) has become a trend in large cache designs. The access time of NUCA is determined by the distance between the cache bank containin...
Distributed software systems are becoming more and more complex *** is easy to find a huge number of computing nodes in a nationwide or global information *** example, WeChat (Weixin), a well-known mobile application in China, reached a record of 650 million monthly active users in the third quarter of *** the same time, researchers are starting to talk about software systems that have billions of lines of code [1] or can last one hundred years.
Internet-scale open source software (OSS) production in various communities generates abundant reusable resources for software developers. However, finding the desired and mature software with keyword queries among a considerable number of candidates, especially for newcomers, is a significant challenge because current search services often fail to understand the semantics of user queries. In this paper, we construct a software term database (STDB) by analyzing tagging data in Stack Overflow and propose a correlation-based software search (CBSS) approach that performs correlation retrieval based on the term relevance obtained from the STDB. In addition, we design a novel ranking method to optimize the initial retrieval result. We explore four research questions in four experiments to evaluate the effectiveness of the STDB and investigate the performance of CBSS. The experimental results show that the proposed CBSS can effectively respond to keyword-based software searches and significantly outperforms existing search services at finding mature software.
Based on 3D-TCAD simulations, single-event transient (SET) effects and charge collection mechanisms in fully depleted silicon-on-insulator (FDSOI) transistors are investigated. This work presents a comparison between 28-nm technology and 0.2-μm technology to analyze the impact of strike location on SET sensitivity in FDSOI devices. Simulation results show that the most SET-sensitive region in FDSOI transistors is the drain region near the gate. An in-depth analysis shows that the bipolar amplification effect in FDSOI devices depends on the strike location. In addition, when the drain contact is moved toward the drain direction, the most sensitive region drifts toward the drain and collects more charge. This provides theoretical guidance for SET hardening.