ISBN (Print): 9781581138313
We identify three design principles for reflection and metaprogramming facilities in object-oriented programming languages. Encapsulation: meta-level facilities must encapsulate their implementation. Stratification: meta-level facilities must be separated from base-level functionality. Ontological correspondence: the ontology of meta-level facilities should correspond to the ontology of the language they manipulate. Traditional/mainstream reflective architectures do not follow these precepts. In contrast, reflective APIs built around the concept of mirrors are characterized by adherence to these three principles. Consequently, mirror-based architectures have significant advantages with respect to distribution, deployment and general-purpose metaprogramming.
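To make the mirror idea concrete, here is a minimal C sketch (the Point and ClassMirror types and the reflect_point() factory are illustrative inventions, not from the paper) in which meta-level access is routed through a separate mirror object rather than through the base-level object itself:

    /* Minimal sketch: reflection through a mirror, not through the object.
     * All names here are hypothetical; the factory hides how the mirror
     * is implemented, keeping the meta level stratified and encapsulated. */
    #include <stdio.h>

    typedef struct { int x, y; } Point;           /* base-level object: no meta API */

    typedef struct {                              /* meta-level view of a Point */
        const char *class_name;
        int field_count;
    } ClassMirror;

    /* This factory inspects an in-process object; another factory could
     * describe a remote object or a serialized snapshot instead. */
    ClassMirror reflect_point(const Point *p) {
        (void)p;                                  /* layout is known statically here */
        ClassMirror m = { "Point", 2 };
        return m;
    }

    int main(void) {
        Point p = { 3, 4 };
        ClassMirror m = reflect_point(&p);        /* meta access only via the mirror */
        printf("%s has %d fields\n", m.class_name, m.field_count);
        return 0;
    }

Because clients only ever see the mirror interface, a different factory could answer the same questions about an object in a remote process or in a saved image without changing client code, which is where the claimed advantages for distribution and deployment come from.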
The proceedings contain 28 papers from the ACM SIGPLAN 2002 Conference on Programming Language Design and Implementation (PLDI'02). Topics discussed include: flow-sensitive type qualifiers; fast copy coalescing and live-range identification; preference-directed graph coloring; a system and language for building system-specific, static analyses; and deriving specialized program analyses for certifying component-client conformance.
This paper presents the design and implementation of a compiler algorithm that effectively optimizes programs for energy usage using dynamic voltage scaling (DVS). The algorithm identifies program regions where the CPU can be slowed down with negligible performance loss. It is implemented as a source-to-source transformation using the SUIF2 compiler infrastructure. Physical measurements on a high-performance laptop show that total system (i.e., laptop) energy savings of up to 28% can be achieved with performance degradation of less than 5% for the SPECfp95 benchmarks. On average, the system energy and energy-delay product are reduced by 11% and 9%, respectively, with a performance slowdown of 2%. It was also discovered that the energy usage of the programs using our DVS algorithm is within 6% of the theoretical lower bound. To the best of our knowledge, this is among the first works to evaluate DVS algorithms by physical measurements.
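As a rough illustration of the transformation, assuming a hypothetical set_cpu_frequency() hook in place of a real platform DVS interface (this is not the paper's SUIF2 pass), a memory-bound region might be bracketed like this:

    /* set_cpu_frequency() is a made-up stand-in for a platform DVS call;
     * the stub only makes the sketch compile. */
    #include <stddef.h>

    static void set_cpu_frequency(double ghz) { (void)ghz; /* platform-specific in practice */ }

    /* The loop is memory-bound, so running the CPU slower barely changes
     * its execution time while reducing energy; the compiler's job is to
     * find such regions and pick frequencies within a performance budget. */
    void scale_vector(double *a, const double *b, size_t n) {
        set_cpu_frequency(0.6);              /* slow down for the memory-bound region */
        for (size_t i = 0; i < n; i++)
            a[i] = 2.0 * b[i];
        set_cpu_frequency(1.0);              /* restore full speed afterwards */
    }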
We compile Nova, a new language designed for writing network processing applications, using a back end based on integer linear programming (ILP) for register allocation, optimal bank assignment, and spills. The compiler's optimizer employs CPS as its intermediate representation; some of the invariants that this IR guarantees are essential for the formulation of a practical ILP model. Appel and George used a similar ILP-based technique for the IA32 to decide which variables reside in registers, but deferred the actual assignment of colors to a later phase. We demonstrate how to carry over their idea to an architecture with many more banks, register aggregates, variables with multiple simultaneous register assignments, and, very importantly, one where bank and register assignment cannot be done in isolation from each other. Our approach performs well in practice, without causing an explosion in the size or solve time of the generated integer linear programs.
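For orientation, a generic 0-1 formulation of the register-assignment core looks as follows; this is a simplified sketch over the usual interference-graph model, not the paper's model, which in addition assigns banks, handles register aggregates, and allows multiple simultaneous assignments:

    \min \sum_{v} c_v\, s_v \quad \text{subject to} \quad
    \sum_{r \in R} x_{v,r} + s_v = 1 \;\; \forall v, \qquad
    x_{u,r} + x_{v,r} \le 1 \;\; \forall r \in R,\ \forall \{u,v\} \in \mathit{Interfere}, \qquad
    x_{v,r},\, s_v \in \{0,1\}

Here x_{v,r} = 1 means variable v is kept in register r, s_v = 1 means v is spilled, and c_v is its estimated spill cost. On a machine with several register banks, constraints of the same shape tie the x_{v,r} to bank capacities and to per-instruction operand-bank requirements, which is why bank and register assignment cannot be solved in isolation.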
ISBN (Print): 9781581136623
We present nesC, a programming language for networked embedded systems that represent a new design space for application developers. An example of a networked embedded system is a sensor network, which consists of (potentially) thousands of tiny, low-power "motes," each of which executes concurrent, reactive programs that must operate with severe memory and power constraints. nesC's contribution is to support the special needs of this domain by exposing a programming model that incorporates event-driven execution, a flexible concurrency model, and component-oriented application design. Restrictions on the programming model allow the nesC compiler to perform whole-program analyses, including data-race detection (which improves reliability) and aggressive function inlining (which reduces resource consumption). nesC has been used to implement TinyOS, a small operating system for sensor networks, as well as several significant sensor applications. nesC and TinyOS have been adopted by a large number of sensor network research groups, and our experience and evaluation of the language show that it is effective at supporting the complex, concurrent programming style demanded by this new class of deeply networked systems.
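The split-phase, event-driven style referred to above can be sketched in plain C as follows (this is not nesC syntax, and it omits the component wiring, tasks, and concurrency model that the language itself provides):

    /* Split-phase sketch in plain C: a command starts a long-running
     * operation and returns immediately; a completion event is signaled
     * later (here, synchronously, to keep the example self-contained). */
    #include <stdio.h>
    #include <stdint.h>

    static void sensor_read_done(uint16_t value);     /* completion event */

    static void sensor_read(void) {                   /* non-blocking command */
        uint16_t simulated_sample = 42;
        sensor_read_done(simulated_sample);           /* in reality: signaled from an interrupt */
    }

    static void sensor_read_done(uint16_t value) {    /* event handler reacts to the data */
        printf("sample = %u\n", (unsigned)value);
    }

    int main(void) {
        sensor_read();                                /* request a sample; no blocking wait */
        return 0;
    }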
ISBN (Print): 9781581136623
An adaptation of the classic register allocation algorithm to the problem of array storage optimization in MATLAB is presented. The method involves the decomposition of an interference graph's color classes using inferred type information. A key trait is the use of symbolic types, along with control flow, in performing the decomposition. On a benchmark suite spanning the published test suites of some recent research MATLAB compilers, our implementation produces savings in average virtual memory size, with respect to code generated by a commercial MATLAB compiler, of between 51% and 139% in 6 out of 11 programs, and savings of between 0.7% and 47% in the remaining programs. In absolute terms, this ranged from 123KB to over 9MB. Substantial improvements in other categories of memory, such as resident sets and dynamic data (stack plus heap), were also observed and are reported. Speedups in execution time of over an order of magnitude in 4 programs, of over 100% in 4 of the remaining, and of 10% or more in the rest are also reported.
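A hand-written C analogue conveys the storage-sharing payoff; the paper derives such sharing automatically for MATLAB arrays from inferred symbolic types and control flow, whereas the buffers and sizes below are invented:

    /* Two temporaries with non-overlapping live ranges and compatible
     * inferred sizes share one backing allocation ("coloring" for arrays).
     * Names and sizes are invented for illustration. */
    #include <stdlib.h>

    void two_phase(size_t n) {
        double *scratch = malloc(n * sizeof *scratch);   /* single peak-sized allocation */
        if (scratch == NULL) return;

        double *tmp1 = scratch;                  /* phase-1 temporary, dead after phase 1 */
        for (size_t i = 0; i < n; i++) tmp1[i] = (double)i;

        double *tmp2 = scratch;                  /* phase-2 temporary reuses the storage */
        for (size_t i = 0; i < n; i++) tmp2[i] = 2.0 * (double)i;

        free(scratch);                           /* instead of two live peak-sized arrays */
    }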
ISBN (Print): 9781581136623
As more complex DSP algorithms are realized in practice, there is an increasing need for high-level stream abstractions that can be compiled without sacrificing efficiency. Toward this end, we present a set of aggressive optimizations that target linear sections of a stream program. Our input language is StreamIt, which represents programs as a hierarchical graph of autonomous filters. A filter is linear if each of its outputs can be represented as an affine combination of its inputs. Linearity is common in DSP components; examples include FIR filters, expanders, compressors, FFTs and DCTs. We demonstrate that several algorithmic transformations, traditionally hand-tuned by DSP experts, can be completely automated by the compiler. First, we present a linear extraction analysis that automatically detects linear filters from the C-like code in their work function. Then, we give a procedure for combining adjacent linear filters into a single filter, as well as for translating a linear filter to operate in the frequency domain. We also present an optimization selection algorithm, which finds the sequence of combination and frequency transformations that will give the maximal benefit. We have completed a fully automatic implementation of the above techniques as part of the StreamIt compiler, and we demonstrate a 450% performance improvement over our benchmark suite.
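One special case of the filter-combination step is easy to state: cascading two FIR filters is itself an FIR filter whose coefficients are the convolution of the two coefficient sequences. The short C program below computes such a combined filter for made-up coefficients (the paper's analysis handles general affine filters and also the translation to the frequency domain):

    /* Cascading two FIR filters: the combined coefficients are the
     * convolution of the two coefficient sequences (made-up numbers). */
    #include <stdio.h>

    #define N1 3
    #define N2 2

    int main(void) {
        double a[N1] = { 0.5, 0.3, 0.2 };            /* first linear filter  */
        double b[N2] = { 1.0, -1.0 };                /* second linear filter */
        double c[N1 + N2 - 1] = { 0 };               /* combined filter      */

        for (int i = 0; i < N1; i++)                 /* convolve coefficient arrays */
            for (int j = 0; j < N2; j++)
                c[i + j] += a[i] * b[j];

        for (int k = 0; k < N1 + N2 - 1; k++)
            printf("c[%d] = %g\n", k, c[k]);
        return 0;
    }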
ISBN (Print): 9781581136623
Typed assembly languages provide a way to generate machine-checkable safety proofs for machine-language programs. But the soundness proofs of most existing typed assembly languages are hand-written and cannot be machine-checked, which is worrisome for such large calculi. We have designed and implemented a low-level typed assembly language (LTAL) with a semantic model and established its soundness from the model. Compared to existing typed assembly languages, LTAL is more scalable and more secure; it has no macro instructions that hinder low-level optimizations such as instruction scheduling; its type constructors are expressive enough to capture dataflow information, support the compiler's choice of data representations and permit typed position-independent code; and its type-checking algorithm is completely syntax-directed. We have built a prototype system, based on Standard ML of New Jersey, that compiles most of core ML to Sparc code. We explain how we were able to make the untyped back end in SML/NJ preserve types during instruction selection and register allocation, without restricting low-level optimizations and without knowledge of any type system pervading the instruction selector and register allocator.
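To illustrate what a completely syntax-directed type-checking algorithm buys, the toy C checker below types a miniature three-instruction assembly language in a single pass, applying exactly one rule per instruction form; it bears no relation to LTAL's actual instructions, type constructors, or semantic model:

    /* Toy syntax-directed checker for a miniature assembly language.
     * Each instruction form has exactly one typing rule, so checking
     * is a single linear pass with no search or backtracking. */
    #include <stdio.h>

    typedef enum { T_NONE, T_INT, T_PTR } Type;
    typedef enum { MOVI, LOADP, ADD } Op;       /* movi rd,imm; loadp rd,[rs]; add rd,rs,rt */
    typedef struct { Op op; int rd, rs, rt; } Insn;

    static int check(const Insn *code, int n) {
        Type reg[8] = { T_NONE };
        reg[0] = T_PTR;                         /* assume r0 holds a pointer on entry */
        for (int i = 0; i < n; i++) {
            const Insn *c = &code[i];
            switch (c->op) {                    /* the opcode alone selects the rule */
            case MOVI:  reg[c->rd] = T_INT; break;
            case LOADP: if (reg[c->rs] != T_PTR) return 0; reg[c->rd] = T_INT; break;
            case ADD:   if (reg[c->rs] != T_INT || reg[c->rt] != T_INT) return 0;
                        reg[c->rd] = T_INT; break;
            }
        }
        return 1;
    }

    int main(void) {
        Insn ok[]  = { {MOVI, 1, 0, 0}, {LOADP, 2, 0, 0}, {ADD, 3, 1, 2} };
        Insn bad[] = { {LOADP, 2, 1, 0} };      /* r1 is not known to be a pointer */
        printf("ok:  %s\n", check(ok, 3)  ? "well-typed" : "rejected");
        printf("bad: %s\n", check(bad, 1) ? "well-typed" : "rejected");
        return 0;
    }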
Software prefetching is a promising technique to hide cache miss latencies, but it remains challenging to effectively prefetch pointer-based data structures because obtaining the memory address to be prefetched requires pointer dereferences. The recently proposed stride prefetching overcomes this problem, but it only exploits inter-iteration stride patterns and relies on an off-line profiling method. We propose a new algorithm for stride prefetching which is intended for use in a dynamic compiler. We exploit both inter- and intra-iteration stride patterns, which we discover using an ultra-lightweight profiling technique called object inspection. This is a kind of partial interpretation that only a dynamic compiler can perform. During the compilation of a method, the dynamic compiler gathers profile information by partially interpreting the method using the actual values of parameters, causing no side effects. We evaluated an implementation of our prefetching algorithm in a production-level Java just-in-time compiler. The results show that the algorithm achieved speedups of up to 18.9% and 25.1% in industry-standard benchmarks on the Pentium 4 and the Athlon MP, respectively, while increasing compilation time by less than 3.0%.
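A hand-written C analogue shows the effect of the inserted prefetches; the paper's algorithm does this inside a Java just-in-time compiler and discovers the strides by object inspection, whereas the GCC __builtin_prefetch call and the assumed stride below are only for illustration:

    /* The stride and the prefetch distance are assumed; a real system
     * would discover the stride by profiling. __builtin_prefetch is a
     * GCC/Clang builtin and is only a hint, so a wrong guess costs little. */
    #include <stddef.h>

    struct node { double payload[6]; struct node *next; };

    double sum(const struct node *head, ptrdiff_t stride_bytes) {
        double s = 0.0;
        for (const struct node *p = head; p != NULL; p = p->next) {
            /* Likely address of the node a few iterations ahead, computed
             * without dereferencing the intervening next pointers. */
            __builtin_prefetch((const char *)p + 4 * stride_bytes);
            s += p->payload[0];
        }
        return s;
    }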