Higher order functional programs constantly allocate objects dynamically. These objects are typically cons cells, closures, and records and are generally allocated in the heap and reclaimed later by some garbage colle...
详细信息
ISBN:
(纸本)0897914759
Higher order functional programs constantly allocate objects dynamically. These objects are typically cons cells, closures, and records and are generally allocated in the heap and reclaimed later by some garbage collection process. This paper describes a compile time analysis, called escape analysis, for determining the lifetime of dynamically created objects in higher order functional programs, and describes optimizations that can be performed, based on the analysis, to improve storage allocation and reclamation of such objects. In particular, our analysis can be applied to programs manipulating lists, in which case optimizations can be performed to allow cons cells in spines of lists to be either reclaimed immediately or reused without incurring any garbage collection overhead. In a previous paper on escape analysis, we had left open the problem of performing escape analysis on lists. Escape analysis simply determines when the argument (or some part of the argument) to a function call is returned by that call. This simple piece of information turns out to be sufficiently powerful to allow stack allocation of objects, compile-time garbage collection, reduction of run-time storage reclamation overhead, and other optimizations that are possible when the lifetimes of objects can be computed statically. Our approach is to define a high-level non-standard semantics that, in many ways, is similar to the standard semantics and captures the escape behavior caused by the constructs in a functional language. The advantage of our analysis lies in its conceptual simplicity and portability (i.e. no assumption is made about an underlying abstract machine).
A high-performance implementation of a Java Virtual Machine requires a compiler to translate Java bytecodes into native instructions, as well as an advanced garbage collector (e.g., copying or generational). When the ...
详细信息
A high-performance implementation of a Java Virtual Machine requires a compiler to translate Java bytecodes into native instructions, as well as an advanced garbage collector (e.g., copying or generational). When the Java heap is exhausted and the garbage collector executes, the compiler must report to the garbage collector all live object references contained in physical registers and stack locations. Typical compilers only allow certain instructions (e.g., call instructions and backward branches) to be GC-safe;if GC happens at some other instruction, the compiler may need to advance execution to the next GC-safe point. Until now, no one has ever attempted to make every compiler-generated instruction GC-safe, due to the perception that recording this information would require too much space. This kind of support could improve the GC performance in multithreaded applications. We show how to use simple compression techniques to reduce the size of the GC map to about 20% of the generated code size, a result that is competitive with the best previously published results. In addition, we extend the work of Agesen, Detlefs, and Moss, regarding the so-called `JSR Problem' (the single exception to Java's type safety property), in a way that eliminates the need for extra runtime overhead in the generated code.
We demonstrate a tool to incrementally synchronize an acme architectural model described in the acme Architectural Description language (ADL) with an implementation in ArchJava, an extension of the Java programming la...
详细信息
ISBN:
(纸本)9781581139631
We demonstrate a tool to incrementally synchronize an acme architectural model described in the acme Architectural Description language (ADL) with an implementation in ArchJava, an extension of the Java programminglanguage that includes explicit architectural modeling constructs.
A number of effective error detection tools have been built in recent years to check if a program conforms to certain design rules, An important class of design rules deals with sequences of events associated with a s...
详细信息
A number of effective error detection tools have been built in recent years to check if a program conforms to certain design rules, An important class of design rules deals with sequences of events associated with a set of related objects. This paper presents a language called PQL (Program Query language) that allows programmers to express such questions easily in an application-specific context. A query looks like a code excerpt corresponding to the shortest amount of code that would violate a design rule. Details of the target application's precise implementation are abstracted away. The programmer may also specify actions to perform when a match is found, such as recording relevant information or even correcting an erroneous execution on the fly. We have developed both static and dynamic techniques to find solutions to PQL queries, Our static analyzer finds all potential matches conservatively using a context-sensitive, flow-insensitive, inclusion-based pointer alias analysis, Static results are also useful in reducing the number of instrumentation points for dynamic analysis. Our dynamic analyzer instruments the source program to catch all violations precisely as the program runs and to optionally perform user-specified actions. We have implemented the techniques described in this paper and found 206 errors in 6 large real-world open-source Java applications containing a total of nearly 60,000 classes. These errors are important security flaws, resource leaks, and violations of consistency invariants. The combination of static and dynamic analysis proves effective at addressing a wide range of debugging and program comprehension queries. We have found that dynamic analysis is especially suitable for preventing errors such as security vulnerabilities at runtime. Copyright 2005acm.
We describe a highly practical program specializer for Java programs. The specializer is powerful, because it specializes optimistically, using (potentially transient) constants in the heap;it is precise, because it s...
详细信息
ISBN:
(纸本)1595930310
We describe a highly practical program specializer for Java programs. The specializer is powerful, because it specializes optimistically, using (potentially transient) constants in the heap;it is precise, because it specializes using data structures that are only partially invariant;it is deployable, because it is hidden in a JIT compiler and does not require any user annotations or offline preprocessing;it is simple, because it uses existing JIT compiler ingredients;and it is fast, because it specializes programs in under Is. These properties are the result of (1) a new algorithm for selecting specializable code fragments, based on a notion of influence;(2) a precise store profile for identifying constant heap locations;and (3) an efficient invalidation mechanism for monitoring optimistic assumptions about heap constants. Our implementation of the specializer in the Jikes RVM has low overhead, selects specialization points that would be chosen manually, and produces speedups ranging from a factor of 1.2 to 6.4, comparable with annotation-guided specializers. Copyright 2005acm.
Load-reuse analysis finds instructions that repeatedly access the same memory location. This location can be promoted to a register, eliminating redundant loads by reusing the results of prior memory accesses. This pa...
详细信息
Load-reuse analysis finds instructions that repeatedly access the same memory location. This location can be promoted to a register, eliminating redundant loads by reusing the results of prior memory accesses. This paper develops a load-reuse analysis and designs a method for evaluating its precision. In designing the analysis, we aspire for completeness - the goal of exposing all reuse that can be harvested by a subsequent program transformation. For register promotion, a suitable transformation is partial redundancy elimination (PRE). To approach the ideal goal of PRE-completeness, the load-reuse analysis is phrased as a data-flow problem on a program representation that is path-sensitive, as it detects reuse even when it originates in a different instruction along each control flow path. Furthermore, the analysis is comprehensive, as it treats scalar, array and pointer-based loads uniformly. In evaluating the analysis, we compare it with an ideal analysis. By observing the run-time stream of memory references, we collect all PRE-exploitable reuse and treat it as the ideal analysis performance. To compare the (static) load-reuse analysis with the (dynamic) ideal reuse, we use an estimator algorithm that computes, given a data-flow solution and a program profile, the dynamic amount of reuse detected by the analysis. We developed a family of estimators that differ in how well they bound the profiling error inherent in the edge profile. By bounding the error, the estimators offer a precise and practical method for determining the run-time optimization benefit. Our experiments show that about 55% of loads executed in Spec95 exhibit reuse. Of those, our analysis exposes about 80%.
The contribution of this work is the design, implementation, and early evaluation of a programminglanguage that unifies classes and aspects. We call our new module construct the classpect. We make three basic claims....
详细信息
The contribution of this work is the design, implementation, and early evaluation of a programminglanguage that unifies classes and aspects. We call our new module construct the classpect. We make three basic claims. First, we can realize a unified design without significantly compromising the expressiveness of current aspect languages. Second, such a design improves the conceptual integrity of the programming model. Third, it significantly improves the compositionality of aspect modules, expanding the program design space from the two-layered model of AspectJ-like languages to include hierarchical structures. To support these claims, we present the design and implementation of Eos-U, an AspectJ-like language based on C# that supports classpects as the basic unit of modularity. We show that Eos-U supports layered designs in which classpects separate integration concerns flexibly at multiple levels of composition. The underpinnings of our design include support for aspect instantiation under program control, instance-level advising, advising as a general alternative to object-oriented method invocation and overriding, and the provision of a separate join-point-method binding construct. Copyright 2005acm.
This paper develops a methodology for compiling and executing irregular parallel programs. Such programs implement parallel operations whose size and work distribution depend on input data. We show a fundamental relat...
详细信息
ISBN:
(纸本)0897914759
This paper develops a methodology for compiling and executing irregular parallel programs. Such programs implement parallel operations whose size and work distribution depend on input data. We show a fundamental relationship between three quantities that characterize an irregular parallel computation: the total available parallelism, the optimal grain size, and the statistical variance of execution times for individual tasks. This relationship yields a dynamic scheduling algorithm that substantially reduces the overhead of executing irregular parallel operations. We incorporated this algorithm into an extended Fortran compiler. The compiler accepts as input a subset of Fortran D which includes blocked and cyclic decompositions and perfect alignment;it outputs Fortran 77 augmented with calls to library routines written in C. For irregular parallel operations, the compiled code gathers information about available parallelism and task execution time variance and uses this information to schedule the operation. On distributed memory architectures, the compiler encodes information about data access patterns for the runtime scheduling system so that it can preserve communication locality. We evaluated these compilation techniques using a set of application programs including climate modeling, circuit simulation, and x-ray tomography, that contain irregular parallel operations. The results demonstrate that, for these applications, the dynamic techniques described here achieve near-optimal efficiency on large numbers of processors. In addition, they perform significantly better, on these problems, than any previously proposed static or dynamic scheduling algorithm.
暂无评论