DevOps is an emerging approach that aims at the symbiosis of development, quality assurance and operations. Developers need feedback from the test executions that Continuous Integration servers support. On the other h...
ISBN: (print) 9781450326704
Uninitialized variables can cause system crashes when used and security vulnerabilities when exploited. With source rather than binary instrumentation, dynamic analysis tools such as MSan can detect uninitialized memory uses at significantly reduced overhead but are still costly. In this paper, we introduce a static value-flow analysis, called Usher, to guide and accelerate the dynamic analysis performed by such tools. Usher reasons about the definedness of values using a value-flow graph (VFG) that captures def-use chains for both top-level and address-taken variables interprocedurally, and it removes unnecessary instrumentation by solving a graph reachability problem. Usher works with any pointer analysis (done a priori) and facilitates advanced instrumentation-reducing optimizations (two are demonstrated here). Implemented in LLVM and evaluated using all 15 SPEC2000 C programs, Usher reduces the slowdown of MSan from 212%-302% to 123%-140% for a number of configurations tested.
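The reachability idea in the abstract above can be illustrated with a toy sketch. This is not Usher's implementation: the graph, the node names, and the function `possibly_undefined` are hypothetical, and the sketch assumes a simplified model in which an edge `u -> v` means the value of `u` flows into `v`, and only values reachable from an undefined source need a runtime check.

```python
from collections import deque

def possibly_undefined(vfg, undefined_sources):
    """Return every node reachable from an undefined source (breadth-first search)."""
    seen = set(undefined_sources)
    queue = deque(undefined_sources)
    while queue:
        node = queue.popleft()
        for succ in vfg.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen

# Hypothetical value-flow graph; an edge u -> v means "u flows into v".
vfg = {
    "uninit_alloc": ["a"],   # a is loaded from uninitialized memory
    "a": ["b"],
    "b": ["d"],
    "const_0": ["c"],        # c is defined from a constant
    "c": ["d"],
}

needs_check = possibly_undefined(vfg, ["uninit_alloc"])
all_nodes = set(vfg) | {v for succs in vfg.values() for v in succs}
skippable = all_nodes - needs_check  # nodes whose instrumentation could be dropped
```

In this toy model, `const_0` and `c` can never carry an undefined value, so checks on them would be redundant; `d` still needs a check because one of its inputs may be undefined.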
To exploit the full potential of GPGPUs for general purpose computing, DOACR parallelism abundant in scientific and engineering applications must be harnessed. However, the presence of cross-iteration data dependences in DOACR loops poses an obstacle to executing their computations concurrently using a massive number of fine-grained threads. This work focuses on iterative PDE solvers rich in DOACR parallelism to identify optimization principles and strategies that allow their efficient mapping to GPGPUs. Our main finding is that certain DOACR loops can be accelerated further on GPGPUs if they are algorithmically restructured (by a domain expert) to be more amenable to GPGPU parallelization, judiciously optimized (by the compiler), and carefully tuned (by a performance-tuning tool). We substantiate this finding with a case study by presenting a new parallel SSOR method that admits more efficient data-parallel SIMD execution than red-black SOR on GPGPUs. Our solution is obtained unconventionally, by starting from a K-layer SSOR method and then parallelizing it by applying a non-dependence-preserving scheme consisting of a new domain decomposition technique followed by a generalized loop tiling. Despite its relatively slower convergence, our new method outperforms red-black SOR by striking a better balance between data reuse and parallelism and by trading off convergence rate for SIMD parallelism. Our experimental results highlight the importance of synergy between domain experts, compiler optimizations and performance tuning in maximizing the performance of applications, particularly PDE-based DOACR loops, on GPGPUs.
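For readers unfamiliar with the baseline named in the abstract above, the following is a minimal sequential sketch of red-black SOR for the 2D Laplace equation. It is not the paper's new SSOR method; the function name, the relaxation factor `omega=1.5`, the sweep count, and the test grid are all assumptions chosen for illustration.

```python
def red_black_sor(u, omega=1.5, sweeps=200):
    """In-place red-black SOR for the 2D Laplace equation; boundary values stay fixed."""
    n = len(u)
    for _ in range(sweeps):
        for colour in (0, 1):  # update all "red" points, then all "black" points
            for i in range(1, n - 1):
                for j in range(1, n - 1):
                    if (i + j) % 2 != colour:
                        continue
                    # Gauss-Seidel value: average of the four neighbours,
                    # all of which have the opposite colour.
                    gs = 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])
                    u[i][j] += omega * (gs - u[i][j])
    return u

# Hypothetical test problem: boundary values u = j (a harmonic function), interior zero.
n = 6
grid = [[float(j) if i in (0, n - 1) or j in (0, n - 1) else 0.0 for j in range(n)]
        for i in range(n)]
red_black_sor(grid)
```

Because every point of one colour depends only on points of the other colour, all updates within a colour are independent, which is what makes the scheme data-parallel.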
Data tiling is an array layout transformation technique that partitions an array into smaller subarray blocks. It was originally proposed to improve the cache performance of regular loops. Recently, researchers have a...
Evaluating the Torontonian function is a central computational challenge in the simulation of Gaussian Boson Sampling (GBS) with threshold detection. In this work, we propose a recursive algorithm providing a polynomi...
Unitary designs are essential tools in several quantum information protocols. Similarly to other design concepts, unitary designs are mainly used to facilitate averaging over a relevant space, in this case, the unitar...
We investigate the extendibility problem for Brauer states, focusing on the symmetric two-sided extendibility and the de Finetti extendibility. By employing the representation theory of the unitary and orthogonal grou...
We introduce the Piquasso quantum programming framework, a full-stack open-source software platform for the simulation and programming of photonic quantum computers. Piquasso can be programmed via a high-level Python ...