Aspect-oriented programming (AOP) is a technique for improving separation of concerns in software design and implementation. AOP works by providing explicit mechanisms for capturing the structure of crosscutting conce...
详细信息
ISBN:
(纸本)9781581133905
Aspect-oriented programming (AOP) is a technique for improving separation of concerns in software design and implementation. AOP works by providing explicit mechanisms for capturing the structure of crosscutting concerns. This tutorial shows how to use AOP to implement crosscutting concerns in a concise modular way. It works with AspectJ, a seamless aspect-oriented extension to the Java(tm) programminglanguage, and with AspectC, an aspect-oriented extension to C in the style of AspectJ. It also includes a description of their underlying model, in terms of which a wide range of AOP languages can be understood.
The high performance implementation of Java Virtual Machines (JVM) and just-in-time (JIT) compilers is directed toward adaptive compilation optimizations on the basis of online runtime profile information. This paper ...
详细信息
ISBN:
(纸本)1581133359
The high performance implementation of Java Virtual Machines (JVM) and just-in-time (JIT) compilers is directed toward adaptive compilation optimizations on the basis of online runtime profile information. This paper describes the design and implementation of a dynamic optimization framework in a production-level Java JIT compiler. Our approach is to employ a mixed mode interpreter and a three level optimizing compiler, supporting quick, full, and special optimization, each of which has a different set of tradeoffs between compilation overhead and execution speed. a lightweight sampling profiler operates continuously during the entire program's exectuion. When necessary, detailed information on runtime behavior is collected by dynmiacally generating instrumentation code which can be installed to and uninstalled from the specified recompilation target code. Value profiling with this instrumentation mechanism allows fully automatic code specialization to be performed on the basis of specific parameter values or global data at the highest optimization level. The experimental results show that our approach offers high performance and a low code expansion ratio in both program startup and steady state measurements in comparison to the compile-only approach, and that the code specialization can also contribute modest performance improvement
We present a new technique for removing unnecessary synchronization operations from statically compiled Java programs. Our approach improves upon current efforts based on escape analysis, as it can eliminate synchroni...
详细信息
ISBN:
(纸本)9781581131994
We present a new technique for removing unnecessary synchronization operations from statically compiled Java programs. Our approach improves upon current efforts based on escape analysis, as it can eliminate synchronization opt rations even on objects that escape their allocating threads. It makes use of a compact, equivalence-class-based representation that eliminates the need for fixed point operations during the analysis. We describe and evaluate the performance of an implementation in the Marmot native Java compiler. For the benchmark programs examined, the optimization removes 100% of the dynamic synchronization operations in single-threaded programs, and 0-99% in multi-threaded programs, at a low cost in additional compilation time and code growth.
Functional Reactive programming, or FRP. is a general framework for programming hybrid systems in a high-level, declarative manner. The key ideas in FRP are its notions of behaviors and events. Behaviors are time-vary...
详细信息
ISBN:
(纸本)9781581131994
Functional Reactive programming, or FRP. is a general framework for programming hybrid systems in a high-level, declarative manner. The key ideas in FRP are its notions of behaviors and events. Behaviors are time-varying, reactive value while events are time-ordered sequences of discrete-time event occurrences. FRP is the essence of Fran, a domain-specific language embedded in Haskell for programming reactive animations, but FRP is now also being used in vision, robotics and other control systems applications. In this paper we explore the formal semantics of FRP and how it relates to an implementation based on streams that represent (and therefore only approximate) continuous behaviors. We show that, in the limit as the sampling interval goes to zero, the implementation is faithful to the formal, continuous semantics, but only when certain constraints on behaviors are observed. We explore the nature of these constraints, which vary amongst the FRP primitives. Our results show both the pou er and limitations of this approach to languagedesign and implementation. As an example of a limitation, we show that streams are incapable of representing instantaneous predicate events over behaviors.
We present a new limited form of interprocedural analysis called field analysis that can be used by a compiler to reduce the costs of modern language features such as object-oriented programming, automatic memory mana...
详细信息
We present a new limited form of interprocedural analysis called field analysis that can be used by a compiler to reduce the costs of modern language features such as object-oriented programming, automatic memory management, and run-time checks required for type safety. Unlike many previous interprocedural analyses, our analysis is cheap, and does not require access to the entire program. Field analysis exploits the declared access restrictions placed on fields in a modular language (e.g. field access modifiers in Java) in order to determine useful properties of fields of an object, We describe our implementation of field analysis in the Swift optimizing compiler for Java, as well a set of optimizations that exploit the results of field analysis. These optimizations include removal of run-time tests, compile-time resolution of method calls, object inlining, removal of unnecessary synchronization, and stack allocation. Our results demonstrate that field analysis is efficient and effective. Speedups average 7% on a wide range of applications, with some times reduced by up to 27%. Compile time overhead of field analysis is about 10%.
We present mechanisms that enable our compiler-target language, C--, to express four of the best known techniques for implementing exceptions, all within a single, uniform framework. We define the mechanisms precisely...
详细信息
We present mechanisms that enable our compiler-target language, C--, to express four of the best known techniques for implementing exceptions, all within a single, uniform framework. We define the mechanisms precisely, using a formal operational semantics. We also show that exceptions need not require special treatment in the optimizer;by introducing extra dataflow edges, we make standard optimization techniques work even on programs that use exceptions. Our approach clarifies the design space of exception-handling techniques, and it allows a single optimizer to handle a variety of implementation techniques. Our ultimate goal is to allow a source-language compiler the freedom to choose its exception-handling policy, while encapsulating the architecture-dependent mechanisms and their optimization in an implementation of C-- that can be used by compilers for many source languages.
This paper presents a new approach to focal instruction scheduling based on integer programming that produces optimal instruction schedules in a reasonable time, even for very large basic blocks. The new approach firs...
详细信息
ISBN:
(纸本)9781581131994
This paper presents a new approach to focal instruction scheduling based on integer programming that produces optimal instruction schedules in a reasonable time, even for very large basic blocks. The new approach first uses a set of graph transformations to simplify the data-dependency graph while preserving the optimality of the final schedule. The simplified graph results in a simplified integer program which can be solved much faster. A new integer-programming formulation is then applied to the simplified graph. Various techniques are used to simplify the formulation, resulting in fewer integer-program variables, fever integer-program constraints and fewer terms in some of the remaining constraints, thus reducing integer-program solution time. The new formulation also uses certain adaptively added constraints (cuts) to reduce solution time. The proposed optimal instruction scheduler is built within the Gnu Compiler Collection (GCC) and is evaluated experimentally using the SPEC95 floating point benchmarks. Although optimal scheduling for the target processor is considered intractable, all of the benchmarks' basic blocks are optimally scheduled, including blocks with up to 1000 instructions, while total compile time increases by only 14%.
We describe the design and implementation of Dynamo, a software dynamic optimization system that is capable of transparently improving the performance of a native instruction stream as it executes on the processor. Th...
详细信息
We describe the design and implementation of Dynamo, a software dynamic optimization system that is capable of transparently improving the performance of a native instruction stream as it executes on the processor. The input native instruction stream to Dynamo can be dynamically generated (by a JIT for example), or it can come from the execution of a statically compiled native binary. This paper evaluates the Dynamo system in the latter, more challenging situation, in order to emphasize the limits, rather than the potential, of the system. Our experiments demonstrate that even statically optimized native binaries can be accelerated Dynamo, and often by a significant degree. For example, the average performance of -O optimized SpecInt95 benchmark binaries created by the HP product C compiler is improved to a level comparable to their -O4 optimized version running without Dynamo. Dynamo achieves this by focusing its efforts on optimization opportunities that tend to manifest only at runtime, and hence opportunities that might be difficult for a static compiler to exploit. Dynamo's operation is transparent in the sense that it does not depend on any user annotations or binary instrumentation, and does not require multiple runs, or any special compiler, operating system or hardware support. The Dynamo prototype presented here is a realistic implementation running on an HP PA-8000 workstation under the HPUX 10.20 operating system.
This paper presents a static race detection analysis for multithreaded Java programs. Our analysis is based on a formal type system that is capable of capturing many common synchronization patterns. These patterns inc...
详细信息
This paper presents a static race detection analysis for multithreaded Java programs. Our analysis is based on a formal type system that is capable of capturing many common synchronization patterns. These patterns include classes with internal synchronization, classes that require client-side synchronization, and thread-local classes. Experience checking over 40,000 lines of Java code with the type system demonstrates that it is an effective approach for eliminating races conditions. On large examples, fewer than 20 additional type annotations per 1000 lines of code were required by the type checker, and pie found a number of races in the standard Java libraries and other test programs.
In this paper we present a modular interprocedural pointer analysis algorithm based on access-paths for C programs. We argue that access paths can reduce the overhead of representing context-sensitive transfer functio...
详细信息
In this paper we present a modular interprocedural pointer analysis algorithm based on access-paths for C programs. We argue that access paths can reduce the overhead of representing context-sensitive transfer functions and effectively distinguish non-recursive heap objects. And when the modular analysis paradigm is used together with other techniques to handle type casts and function pointers, we are able to handle significant programs like those in the SPECcint92 and SPECcint95 suites. We have implemented the algorithm and tested it on a Pentium II 450 PC running Linux. The observed resource consumption and performance improvement are very encouraging.
暂无评论