Static program analysis has been widely applied along the whole process of the program development for bug detection, code optimization, testing, etc. Although researchers have made significant work in static program ...
详细信息
Static program analysis has been widely applied along the whole process of the program development for bug detection, code optimization, testing, etc. Although researchers have made significant work in static program analysis, it is still challenging to perform sophisticated interprocedural analysis on large-scale modern software. The underlying reason is that interprocedural analysis for large-scale modern software is highly computation- and memory-intensive, leading to poor efficiency and scalability. In this article, we introduce an efficient distributed and scalable solution for sophisticated static analysis. Specifically, we propose a data-parallel algorithm and a join-process-filter computation model for the CFL-reachability-based interprocedural analysis. Based on that, an efficient distributed static analysis engine called BigSpa is developed, which is composed of an offline batch static program analysis system and an online incremental static program analysis system. The BigSpa system has high generality and can support all kinds of static analysis tasks that can be expressed as CFL reachability problems. The performance of BigSpa is evaluated on real-world large-scale software datasets. Our experiments show that the offline batch system can exceed an order of magnitude compared with the most advanced analysis tools available on performance, and for incremental analysis with small batch updates on the same data sets, the online analysis system can achieve near real-time response, which is very fast and flexible.
The placement of elemental operations (as opposed to data) of a data-driven data-parallel computation in a network of processors is examined. A fast suboptimal algorithm is proposed for such placement which tends to e...
详细信息
The placement of elemental operations (as opposed to data) of a data-driven data-parallel computation in a network of processors is examined. A fast suboptimal algorithm is proposed for such placement which tends to examined. A fast suboptimal algorithm is proposed for such placement which tends to minimise the overall network load when the computation is essentially nonlocal. The cases of grid, torus and hypercube topology are considered. It is shown that the proposed algorithm, while having moderate computational complexity, demonstrates up to a 50% reduction in required network throughput over some straightforward placement schemes in the practical range of network sizes.
computation of the generalised inverse A+ and rank of an arbitrary (including singular and rectangular) matrix A has many applications. This paper derives an iterative scheme to approximate the generalised inverse whi...
详细信息
computation of the generalised inverse A+ and rank of an arbitrary (including singular and rectangular) matrix A has many applications. This paper derives an iterative scheme to approximate the generalised inverse which can be expressed in the form of successive squaring of a composite matrix T. Given an m by n matrix A with m almost-equal-to n, we show that the generalised inverse of A can be computed in parallel time ranging from O(log n) to O(log2 n), similar to previous methods. The rank of matrix A is obtained along with the generalised inverse. The successive matrix squaring algorithm is generalised to higher-order schemes, where the composite matrix is repeatedly raised to an integer power l > 2. This form of expression leads to a simplified notation compared with that of earlier methods, and helps to clarify the relationship between l, the order of the iterative scheme and K, the number of iterations. In particular, the accuracy achieved in approximating A+ is a function only of the magnitude of l(K) and does not depend on the particular values chosen for l and K;we argue that there is no obvious advantage in choosing l other than 2. Our derived error bound for the approximation to A+ is tighter than that previously established. The same bound applies to the rank. Numerical experiments with different test matrices (square, rectangular, complex, singular, etc.) illustrate the method. They further demonstrate that our tighter error bound provides a useful guide to the number of iterations required. In the examples given, the specified accuracy was achieved after the calculated number of iterations, but no earlier. Implementation studies on a general-purpose parallel machine (CM-5) demonstrated a smaller than expected penalty for direct squaring of matrix T in comparison with multiplication of its component block matrices. For special-purpose VLSI architectures, the simple structure of the matrix squaring algorithm leads to a straightforward parallel implementa
Static program analysis is widely used in various application areas to solve many practical problems. Although researchers have made significant achievements in static analysis, it is still too challenging to perform ...
详细信息
ISBN:
(纸本)9781728112466
Static program analysis is widely used in various application areas to solve many practical problems. Although researchers have made significant achievements in static analysis, it is still too challenging to perform sophisticated interprocedural analysis on large-scale modern software. The underlying reason is that interprocedural analysis for large-scale modern software is highly computation- and memory-intensive, leading to poor scalability. We aim to tackle the scalability problem by proposing a novel big data solution for sophisticated static analysis. Specifically, we propose a data-parallel algorithm and a join-process-filter computation model for the CFL-reachability based interprocedural analysis and develop an efficient distributed static analysis engine in the cloud, called BigSpa. Our experiments validated that BigSpa running on a cluster scales greatly to perform precise interprocedural analyses on millions of lines of code, and runs an order of magnitude or more faster than the existing state-of-the-art analysis tools.
The IFDS framework supports interprocedural dataflow analysis with distributive flow functions over finite domains. A large class of interprocedural dataflow analysis problems can be formulated as IFDS problems and th...
详细信息
ISBN:
(纸本)9781665457019
The IFDS framework supports interprocedural dataflow analysis with distributive flow functions over finite domains. A large class of interprocedural dataflow analysis problems can be formulated as IFDS problems and thus can be solved with the IFDS framework precisely. Unfortunately, scaling IFDS analysis to large-scale programs is challenging in terms of both massive memory consumption and low analysis efficiency. This paper presents DStream, a scalable system dedicated to precise and highly parallel IFDS analysis for large-scale programs. DStream leverages a streaming-based out-of-core computation model to reduce memory footprint significantly and adopts fine-grained dataparallelism to achieve efficiency. We implemented a taint analysis as a DStream instance analysis and compared DStream with three state-of-the-art tools. Our experiments validate that DStream outperforms all other tools with average speedups from 4.37x to 14.46x on a commodity PC with limited available memory. Meanwhile, the experiments confirm that DStream successfully scales to large-scale programs which the state-of-the-art tools (e.g., FlowDroid and/or DiskDroid) fail to analyze.
Programming productivity very much depends on the availability of basic building blocks that can be reused for a wide range of application scenarios and the ability to define rich abstraction hierarchies. Driven by th...
详细信息
Programming productivity very much depends on the availability of basic building blocks that can be reused for a wide range of application scenarios and the ability to define rich abstraction hierarchies. Driven by the aim for increased reuse, such basic building blocks tend to become more and more generic in their specification;structural as well as behavioural properties are turned into parameters that are passed on to lower layers of abstraction where eventually a differentiation is being made. In the context of array programming, such properties are typically array ranks (number of axes/dimensions) and array shapes (number of elements along each axis/dimension). This allows for abstract definitions of operations such as element-wise additions, concatenations, rotations, and so on, which jointly enable a very high-level compositional style of programming, similar to, for instance, MATLAB. However, such a generic programming style generally comes at a price in terms of runtime overheads when compared against tailor-made low-level implementations. Additional layers of abstraction as well as the lack of hard-coded structural properties often inhibits optimisations that are obvious otherwise. Although complex static compiler analyses and transformations such as partial evaluations can ameliorate the situation to quite some extent, there are cases, where the required level of information is not available until runtime. In this paper, we propose to shift part of the optimisation process into the runtime of applications. Triggered by some runtime observation, the compiler asynchronously applies partial evaluation techniques to frequently used program parts and dynamically replaces initial program fragments by more specialised ones through dynamic re-linking. In contrast to many existing approaches, we suggest this optimisation to be done in a rather non-intrusive, decoupled way. We use a full-fledged compiler that is run on a separate core. This measure enables us to ru
暂无评论