Existing research understates the benefits that can be obtained from inlining and cloning, especially when guided by profile information. Our implementation of inlining and cloning yields excellent results on average ...
详细信息
Existing research understates the benefits that can be obtained from inlining and cloning, especially when guided by profile information. Our implementation of inlining and cloning yields excellent results on average and very rarely lowers performance. We believe our good results can be explained by a number of factors: inlining at the intermediate-code level removes most technical restrictions on what can be inlined;the ability to inline across files and incorporate profile information enables us to choose better inline candidates;a high-quality back end can exploit the scheduling and register allocation opportunities presented by larger subroutines;an aggressive processor architecture benefits from more predictable branch behavior;and a large instruction cache mitigates the impact of code expansion. We describe the often dramatic impact of our inlining and cloning on performance: for example, the implementations of our inlining and cloning algorithms in the HP-UX 10.20 compilers boost SPECint95 performance on a PA8000-based workstation by a factor of 1.32.
The member lookup problem in C++ is the problem of resolving a specified member name in the context of a specified class. Member lookup in C++ is complicated by the presence of virtual inheritance and multiple inherit...
详细信息
The member lookup problem in C++ is the problem of resolving a specified member name in the context of a specified class. Member lookup in C++ is complicated by the presence of virtual inheritance and multiple inheritance. In this paper, we present an efficient algorithm for member lookup in C++. We also present a formalism for the multiple inheritance mechanism of C++, which we use as the basis for deriving our algorithm. The formalism may also be of use as a formal basis for deriving other C++ compiler algorithms.
It is difficult to map the execution model of multithreading languages (languages which support fine-grain dynamic thread creation) onto the single stack execution model of C. Consequently, previous work on efficient ...
详细信息
It is difficult to map the execution model of multithreading languages (languages which support fine-grain dynamic thread creation) onto the single stack execution model of C. Consequently, previous work on efficient multithreading uses elaborate frame formats and allocation strategy, with compilers customized for them. This paper presents an alternative cost-effective implementation strategy for multithreading languages which can maximally exploit current sequential C compilers. We identify a set of primitives whereby efficient dynamic thread creation and switch can be achieved and clarify implementation issues and solutions which work under the stack frame layout and calling conventions of current C compilers. The primitives are implemented as a C library and named StackThreads. In StackThreads, a thread creation is done just by a C procedure call, maximizing thread creation performance. When a procedure suspends an execution, the context of the procedure, which is roughly a stack frame of the procedure, is saved into heap and resumed later. With StackThreads, the compiler writer can straightforwardly translate sequential constructs of the source language into corresponding C statements or expressions, while using StackThreads primitives as a blackbox mechanism which switches execution between C procedures.
Two important problems of object-oriented reuse are the propagation of design and implementation specifics of the base software to the inheritors, and the protection of the inheritors against changes in the base softw...
详细信息
Two important problems of object-oriented reuse are the propagation of design and implementation specifics of the base software to the inheritors, and the protection of the inheritors against changes in the base software. In this paper, we argue that the simple inheritance rules of existing object-oriented languages are not sufficient for properly dealing with these problems. In the proposal presented in this paper, programmers are enabled to make metalevel declarations of the internal protocols and dependencies of their classes. Additionally, changes of the base module are automatically monitored to filter out information about the alterations that may invalidate already existing inheritors. Based on these informations, the subclassing semantics is adjusted such that the maintenance of the base module properties and the protection of the inheritor is ensured during their integration. In this way, language support is provided for keeping the behavior of reusable software consistent during its evolution.
We present an approach for specializing large programs, such as programs consisting of several modules, or libraries. This approach is based on the idea of using a compiler generator (cogen) for creating generating ex...
详细信息
We present an approach for specializing large programs, such as programs consisting of several modules, or libraries. This approach is based on the idea of using a compiler generator (cogen) for creating generating extensions. Generating extensions are specializers specialized with respect to some input program. When run on some input data the generating extension produces a specialized version of the input program. Here we use the cogen to tailor modules for specialization. This happens once and for all, independently of all other modules. The resulting module can then be used as a building block for generating extensions for complete programs, in much the same way as the original modules can be put together into complete programs. The result of running the final generating extension is a collection of residual modules, with a module structure derived from the original program.
If a fixed exponentially decreasing probability distribution function is used to model every object's lifetime, then the age of an object gives no information about its future life expectancy. This radioactive dec...
详细信息
ISBN:
(纸本)9780897919074
If a fixed exponentially decreasing probability distribution function is used to model every object's lifetime, then the age of an object gives no information about its future life expectancy. This radioactive decay model implies there can be no rational basis for deciding which live objects should be promoted to another generation. Yet there remains a rational basis for deciding how many objects to promote, when to collect garbage, and which generations to collect. Analysis of the model leads to a new kind of generational garbage collector whose effectiveness does not depend upon heuristics that predict which objects will live longer than others. This result provides insight into the computational advantages of generational garbage collection, with implications for the management of objects whose life expectancies are difficult to predict.
Modulo scheduling algorithms based on optimal solvers have been proposed to investigate and tune the performance of modulo scheduling heuristics. While recent advances have broadened the scope for which the optimal ap...
详细信息
Modulo scheduling algorithms based on optimal solvers have been proposed to investigate and tune the performance of modulo scheduling heuristics. While recent advances have broadened the scope for which the optimal approach is applicable, this approach increasingly suffers from large execution times. In this paper, we propose a more efficient formulation of the modulo scheduling space that significantly decreases the execution time of solvers based on integer linear programs. For example, the total execution time is reduced by a factor of 8.6 when 782 loops from the Perfect Club, SPEC, and Livermore Fortran Kernels are scheduled for minimum register requirements using the more efficient formulation instead of the traditional formulation. Experimental evidence further indicates that significantly larger loops can be scheduled under realistic machine constraints.
Branch alignment reorders the basic blocks of a program to minimize pipeline penalties due to control-transfer instructions. Prior work in branch alignment has produced useful heuristic methods. We present a branch al...
详细信息
ISBN:
(纸本)9780897919074
Branch alignment reorders the basic blocks of a program to minimize pipeline penalties due to control-transfer instructions. Prior work in branch alignment has produced useful heuristic methods. We present a branch alignment algorithm that usually achieves the minimum possible pipeline penalty and on our benchmarks averages within 0.3% of a provable optimum. We compare the control penalties and running times of our algorithm to an older, greedy approach and observe that both the greedy method and our method are close to the lower bound on control penalties, suggesting that greedy is good enough. Surprisingly, in actual execution our method produces programs that run noticeably faster than the greedy method. We also report results from training and testing on different data sets, validating that our results can be achieved in real-world usage. Training and testing on different data sets slightly reduced the benefits from both branch alignment algorithms, but the ranking of the algorithms does not change, and the bulk of the benefits remain.
We present a technique for automatic verification of pointer programs based on a decision procedure for the monadic second-order logic on finite strings. We are concerned with a while-fragment of Pascal, which include...
详细信息
We present a technique for automatic verification of pointer programs based on a decision procedure for the monadic second-order logic on finite strings. We are concerned with a while-fragment of Pascal, which includes recursively-defined pointer structures but excludes pointer arithmetic. We define a logic of stores with interesting basic predicates such as pointer equality, tests for nil pointers, and garbage cells, as well as reachability along pointers. We present a complete decision procedure for Hoare triples based on this logic over loop-free code. Combined with explicit loop invariants, the decision procedure allows us to answer surprisingly detailed questions about small but non-trivial programs. If a program fails to satisfy a certain property, then we can automatically supply an initial store that provides a counterexample. Our technique has been fully and efficiently implemented for linear linked lists, and it extends in principle to tree structures. The resulting system can be used to verify extensive properties of smaller pointer programs and could be particularly useful in a teaching environment.
Ada 95 is being used as the implementationlanguage for a senior level compiler design course at the United States Military Academy. This paper describes experiences and lessons learned as well as the scenario based a...
详细信息
暂无评论