Hardware support for isolated execution (such as Intel SGX) enables development of applications that keep their code and data confidential even while running on a hostile or compromised host. However, automatically ve...
详细信息
ISBN:
(纸本)9781450342612
Hardware support for isolated execution (such as Intel SGX) enables development of applications that keep their code and data confidential even while running on a hostile or compromised host. However, automatically verifying that such applications satisfy confidentiality remains challenging. We present a methodology for designing such applications in a way that enables certifying their confidentiality. Our methodology consists of forcing the application to communicate with the external world through a narrow interface, compiling it with runtime checks that aid verification, and linking it with a small runtime library that implements the interface. The runtime library includes core services such as secure communication channels and memory management. We formalize this restriction on the application as Information Release Confinement (IRC), and we show that it allows us to decompose the task of proving confidentiality into (a) one-time, human-assisted functional verification of the runtime to ensure that it does not leak secrets, (b) automatic verification of the application's machine code to ensure that it satisfies IRC and does not directly read or corrupt the runtime's internal state. We present/CONFIDENTIAL: a verifier for IRC that is modular, automatic, and keeps our compiler out of the trusted computing base. Our evaluation suggests that the methodology scales to real-world applications.
Dynamic memory allocators (malloc/free) rely on mutual exclusion locks for protecting the consistency of their shared data structures under multithreading. The use of locking has many disadvantages with respect to per...
详细信息
Dynamic memory allocators (malloc/free) rely on mutual exclusion locks for protecting the consistency of their shared data structures under multithreading. The use of locking has many disadvantages with respect to performance, availability, robustness, and programming flexibility. A lock-free memory allocator guarantees progress regardless of whether some threads are delayed or even killed and regardless of scheduling policies. This paper presents a completely lock-free memory allocator. It uses only widely-available operating system support and hardware atomic instructions. It offers guaranteed availability even under arbitrary thread termination and crash-failure, and it is immune to dead-lock regardless of scheduling policies, and hence it can be used even in interrupt handlers and real-time applications without requiring special scheduler support. Also, by leveraging some high-level structures from Hoard, our allocator is highly scalable, limits space blowup to a constant factor, and is capable of avoiding false sharing. In addition, our allocator allows finer concurrency and much lower latency than Hoard. We use PowerPC shared memory multiprocessor systems to compare the performance of our allocator with the default AIX 5.1 libc malloc, and two widely-used multithread allocators, Hoard and Ptmalloc. Our allocator outperforms the other allocators in virtually all cases and often by substantial margins, under various levels of parallelism and allocation patterns. Furthermore, our allocator also offers the lowest contention-free latency among the allocators by significant margins.
Arrays are such a rich and fundamental data type that they tend to be built into a language, either in the compiler or in a large lowlevel library. Defining this functionality at the user level instead provides greate...
详细信息
Static analysis of programs in weakly typed languages such as C and C++ is generally not sound because of possible memory errors due to dangling pointer references, uninitialized pointers, and array bounds overflow. W...
详细信息
Chimera(1) uses a new hybrid program analysis to provide deterministic replay for commodity multiprocessor systems. Chimera leverages the insight that it is easy to provide deterministic multiprocessor replay for data...
详细信息
ISBN:
(纸本)9781450312059
Chimera(1) uses a new hybrid program analysis to provide deterministic replay for commodity multiprocessor systems. Chimera leverages the insight that it is easy to provide deterministic multiprocessor replay for data-race-free programs (one can just record non-deterministic inputs and the order of synchronization operations), so if we can somehow transform an arbitrary program to be data-race-free, then we can provide deterministic replay cheaply for that program. To perform this transformation, Chimera uses a sound static data-race detector to find all potential data-races. It then instruments pairs of potentially racing instructions with a weak-lock, which provides sufficient guarantees to allow deterministic replay but does not guarantee mutual exclusion. Unsurprisingly, a large fraction of data-races found by the static tool are false data-races, and instrumenting them each of them with a weak-lock results in prohibitively high overhead. Chimera drastically reduces this cost from 53x to 1.39x by increasing the granularity of weak-locks without significantly compromising on parallelism. This is achieved by employing a combination of profiling and symbolic analysis techniques that target the sources of imprecision in the static data-race detector. We find that performance overhead for deterministic recording is 2.4% on average for Apache and desktop applications and about 86% for scientific applications.
SIMD (single-instruction multiple-data) instruction set extensions are quite common today in both high performance and embedded microprocessors, and enable the exploitation of a specific type of data parallelism calle...
详细信息
ISBN:
(纸本)9781450312059
SIMD (single-instruction multiple-data) instruction set extensions are quite common today in both high performance and embedded microprocessors, and enable the exploitation of a specific type of data parallelism called SLP (Superword Level Parallelism). While prior research shows that significant performance savings are possible when SLP is exploited, placing SIMD instructions in an application code manually can be very difficult and error prone. In this paper, we propose a novel automated compiler framework for improving superword level parallelism exploitation. The key part of our framework consists of two stages: superword statement generation and data layout optimization. The first stage is our main contribution and has two phases, statement grouping and statement scheduling, of which the primary goals are to increase SIMD parallelism and, more importantly, capture more superword reuses among the superword statements through global data access and reuse pattern analysis. Further, as a complementary optimization, our data layout optimization organizes data in memory space such that the price of memory operations for SLP is minimized. The results from our compiler implementation and tests on two systems indicate performance improvements as high as 15.2% over a state-of-the-art SLP optimization algorithm.
Parallel programs are known to be difficult to analyze. A key reason is that they typically have an enormous number of execution inter-leavings, or schedules. Static analysis over all schedules requires over-approxima...
详细信息
ISBN:
(纸本)9781450312059
Parallel programs are known to be difficult to analyze. A key reason is that they typically have an enormous number of execution inter-leavings, or schedules. Static analysis over all schedules requires over-approximations, resulting in poor precision;dynamic analysis rarely covers more than a tiny fraction of all schedules. We propose an approach called schedule specialization to analyze a parallel program over only a small set of schedules for precision, and then enforce these schedules at runtime for soundness of the static analysis results. We build a schedule specialization framework for C/C++ multithreaded programs that use Pthreads. Our framework avoids the need to modify every analysis to be schedule-aware by specializing a program into a simpler program based on a schedule, so that the resultant program can be analyzed with stock analyses for improved precision. Moreover, our framework provides a precise schedule-aware def-use analysis on memory locations, enabling us to build three highly precise analyses: an alias analyzer, a data-race detector, and a path slicer. Evaluation on 17 programs, including 2 real-world programs and 15 popular benchmarks, shows that analyses using our framework reduced may-aliases by 61.9%, false race reports by 69%, and path slices by 48.7%;and detected 7 unknown bugs in well-checked programs.
A simple and efficient algorithm for generating bottom-up rewrite system (BURS) tables is described. A small prototype implementation produces tables 10 to 30 times more quickly than the best current techniques. The a...
详细信息
ISBN:
(纸本)0897914759
A simple and efficient algorithm for generating bottom-up rewrite system (BURS) tables is described. A small prototype implementation produces tables 10 to 30 times more quickly than the best current techniques. The algorithm does not require novel data structures or complicated algorithmic techniques. Previously published methods for on-the-fly elimination of states are generalized and simplified to create a new method, triangle trimming, that is employed in the algorithm.
We present an integrated proof language for guiding the actions of multiple reasoning systems as they work together to prove complex correctness properties of imperative programs. The language operates in the context ...
详细信息
ISBN:
(纸本)9781605583921
We present an integrated proof language for guiding the actions of multiple reasoning systems as they work together to prove complex correctness properties of imperative programs. The language operates in the context of a program verification system that uses multiple reasoning systems to discharge generated proof obligations. It is designed to 1) enable developers to resolve key choice points in complex program correctness proofs, thereby enabling automated reasoning systems to successfully prove the desired correctness properties;2) allow developers to identify key lemmas for the reasoning systems to prove, thereby guiding the reasoning systems to find an effective proof decomposition;3) enable multiple reasoning systems to work together productively to prove a single correctness property by providing a mechanism that developers can use to divide the property into lemmas, each of which is suitable for a different reasoning system;and 4) enable developers to identify specific lemmas that the reasoning systems should use when attempting to prove other lemmas or correctness properties, thereby appropriately confining the search space so that the reasoning systems can find a proof in an acceptable amount of time. The language includes a rich set of declarative proof constructs that enables developers to direct the reasoning systems as little or as much as they desire. Because the declarative proof statements are embedded into the program as specialized comments, they also serve as verified documentation and are a natural extension of the assertion mechanism found in most program verification systems. We have implemented our integrated proof language in the context of a program verification system for Java and used the resulting system to verify a collection of linked data structure implementations. Our experience indicates that our proof language makes it possible to successfully prove complex program correctness properties that are otherwise beyond the reach of autom
This paper evaluates three alias analyses based on programminglanguage types. The first analysis uses type compatibility to determine aliases. The second extends the first by using additional high-level information s...
详细信息
ISBN:
(纸本)9780897919876
This paper evaluates three alias analyses based on programminglanguage types. The first analysis uses type compatibility to determine aliases. The second extends the first by using additional high-level information such as field names. The third extends the second with a flow-insensitive analysis. Although other researchers suggests using types to disambiguate memory references, none evaluates its effectiveness. We perform both static and dynamic evaluations of type-based alias analyses for Modula-3, a statically-typed type-safe language. The static analysis reveals that type compatibility alone yields a very imprecise alias analysis, but the other two analyses significantly improve alias precision. We use redundant load elimination (RLE) to demonstrate the effectiveness of the three alias algorithms in terms of the opportunities for optimization, the impact on simulated execution times, and to compute an upper bound on what a perfect alias analysis would yield. We show modest dynamic improvements for (RLE), and more surprisingly, that on average our alias analysis is within 2.5% of a perfect alias analysis with respect to RLE on 8 Modula-3 programs. These results illustrate that to explore thoroughly the effectiveness of alias analyses, researchers need static, dynamic, and upper-bound analysis. In addition, we show that for type-safe languages like Modula-3 and Java, a fast and simple alias analysis may be sufficient for many applications.
暂无评论