It is difficult to map the execution model of multithreading languages (languages which support fine-grain dynamic thread creation) onto the single stack execution model of C. Consequently, previous work on efficient ...
详细信息
It is difficult to map the execution model of multithreading languages (languages which support fine-grain dynamic thread creation) onto the single stack execution model of C. Consequently, previous work on efficient multithreading uses elaborate frame formats and allocation strategy, with compilers customized for them. This paper presents an alternative cost-effective implementation strategy for multithreading languages which can maximally exploit current sequential C compilers. We identify a set of primitives whereby efficient dynamic thread creation and switch can be achieved and clarify implementation issues and solutions which work under the stack frame layout and calling conventions of current C compilers. The primitives are implemented as a C library and named StackThreads. In StackThreads, a thread creation is done just by a C procedure call, maximizing thread creation performance. When a procedure suspends an execution, the context of the procedure, which is roughly a stack frame of the procedure, is saved into heap and resumed later. With StackThreads, the compiler writer can straightforwardly translate sequential constructs of the source language into corresponding C statements or expressions, while using StackThreads primitives as a blackbox mechanism which switches execution between C procedures.
Two important problems of object-oriented reuse are the propagation of design and implementation specifics of the base software to the inheritors, and the protection of the inheritors against changes in the base softw...
详细信息
Two important problems of object-oriented reuse are the propagation of design and implementation specifics of the base software to the inheritors, and the protection of the inheritors against changes in the base software. In this paper, we argue that the simple inheritance rules of existing object-oriented languages are not sufficient for properly dealing with these problems. In the proposal presented in this paper, programmers are enabled to make metalevel declarations of the internal protocols and dependencies of their classes. Additionally, changes of the base module are automatically monitored to filter out information about the alterations that may invalidate already existing inheritors. Based on these informations, the subclassing semantics is adjusted such that the maintenance of the base module properties and the protection of the inheritor is ensured during their integration. In this way, language support is provided for keeping the behavior of reusable software consistent during its evolution.
We present an approach for specializing large programs, such as programs consisting of several modules, or libraries. This approach is based on the idea of using a compiler generator (cogen) for creating generating ex...
详细信息
We present an approach for specializing large programs, such as programs consisting of several modules, or libraries. This approach is based on the idea of using a compiler generator (cogen) for creating generating extensions. Generating extensions are specializers specialized with respect to some input program. When run on some input data the generating extension produces a specialized version of the input program. Here we use the cogen to tailor modules for specialization. This happens once and for all, independently of all other modules. The resulting module can then be used as a building block for generating extensions for complete programs, in much the same way as the original modules can be put together into complete programs. The result of running the final generating extension is a collection of residual modules, with a module structure derived from the original program.
If a fixed exponentially decreasing probability distribution function is used to model every object's lifetime, then the age of an object gives no information about its future life expectancy. This radioactive dec...
详细信息
ISBN:
(纸本)9780897919074
If a fixed exponentially decreasing probability distribution function is used to model every object's lifetime, then the age of an object gives no information about its future life expectancy. This radioactive decay model implies there can be no rational basis for deciding which live objects should be promoted to another generation. Yet there remains a rational basis for deciding how many objects to promote, when to collect garbage, and which generations to collect. Analysis of the model leads to a new kind of generational garbage collector whose effectiveness does not depend upon heuristics that predict which objects will live longer than others. This result provides insight into the computational advantages of generational garbage collection, with implications for the management of objects whose life expectancies are difficult to predict.
Modulo scheduling algorithms based on optimal solvers have been proposed to investigate and tune the performance of modulo scheduling heuristics. While recent advances have broadened the scope for which the optimal ap...
详细信息
Modulo scheduling algorithms based on optimal solvers have been proposed to investigate and tune the performance of modulo scheduling heuristics. While recent advances have broadened the scope for which the optimal approach is applicable, this approach increasingly suffers from large execution times. In this paper, we propose a more efficient formulation of the modulo scheduling space that significantly decreases the execution time of solvers based on integer linear programs. For example, the total execution time is reduced by a factor of 8.6 when 782 loops from the Perfect Club, SPEC, and Livermore Fortran Kernels are scheduled for minimum register requirements using the more efficient formulation instead of the traditional formulation. Experimental evidence further indicates that significantly larger loops can be scheduled under realistic machine constraints.
Branch alignment reorders the basic blocks of a program to minimize pipeline penalties due to control-transfer instructions. Prior work in branch alignment has produced useful heuristic methods. We present a branch al...
详细信息
ISBN:
(纸本)9780897919074
Branch alignment reorders the basic blocks of a program to minimize pipeline penalties due to control-transfer instructions. Prior work in branch alignment has produced useful heuristic methods. We present a branch alignment algorithm that usually achieves the minimum possible pipeline penalty and on our benchmarks averages within 0.3% of a provable optimum. We compare the control penalties and running times of our algorithm to an older, greedy approach and observe that both the greedy method and our method are close to the lower bound on control penalties, suggesting that greedy is good enough. Surprisingly, in actual execution our method produces programs that run noticeably faster than the greedy method. We also report results from training and testing on different data sets, validating that our results can be achieved in real-world usage. Training and testing on different data sets slightly reduced the benefits from both branch alignment algorithms, but the ranking of the algorithms does not change, and the bulk of the benefits remain.
We present a technique for automatic verification of pointer programs based on a decision procedure for the monadic second-order logic on finite strings. We are concerned with a while-fragment of Pascal, which include...
详细信息
We present a technique for automatic verification of pointer programs based on a decision procedure for the monadic second-order logic on finite strings. We are concerned with a while-fragment of Pascal, which includes recursively-defined pointer structures but excludes pointer arithmetic. We define a logic of stores with interesting basic predicates such as pointer equality, tests for nil pointers, and garbage cells, as well as reachability along pointers. We present a complete decision procedure for Hoare triples based on this logic over loop-free code. Combined with explicit loop invariants, the decision procedure allows us to answer surprisingly detailed questions about small but non-trivial programs. If a program fails to satisfy a certain property, then we can automatically supply an initial store that provides a counterexample. Our technique has been fully and efficiently implemented for linear linked lists, and it extends in principle to tree structures. The resulting system can be used to verify extensive properties of smaller pointer programs and could be particularly useful in a teaching environment.
Ada 95 is being used as the implementationlanguage for a senior level compiler design course at the United States Military Academy. This paper describes experiences and lessons learned as well as the scenario based a...
详细信息
This paper presents dynamic feedback, a technique that enables computations to adapt dynamically to different execution environments. A compiler that uses dynamic feedback produces several different versions of the sa...
详细信息
This paper presents dynamic feedback, a technique that enables computations to adapt dynamically to different execution environments. A compiler that uses dynamic feedback produces several different versions of the same source code;each version uses a different optimization policy. The generated code alternately performs sampling phases and production phases. Each sampling phase measures the overhead of each version in the current environment. Each production phase uses the version with the least overhead in the previous sampling phase. The computation periodically resamples to adjust dynamically to changes in the environment. We have implemented dynamic feedback in the context of a parallelizing compiler for object-based programs. The generated code uses dynamic feedback to automatically choose the best synchronization optimization policy. Our experimental results show that the synchronization optimization policy has a significant impact on the overall performance of the computation, that the best policy varies from program to program, that the compiler is unable to statically choose the best policy, and that dynamic feedback enables the generated code to exhibit performance that is comparable to that of code that has been manually tuned to use the best policy. We have also performed a theoretical analysis which provides, under certain assumptions, a guaranteed optimality bound for dynamic feedback relative to a hypothetical (and unrealizable) optimal algorithm that uses the best policy at every point during the execution.
A program profile attributes run-time costs to portions of a program's execution. Most profiling systems suffer from two major deficiencies: first, they only apportion simple metrics, such as execution frequency o...
详细信息
ISBN:
(纸本)9780897919074
A program profile attributes run-time costs to portions of a program's execution. Most profiling systems suffer from two major deficiencies: first, they only apportion simple metrics, such as execution frequency or elapsed time to static, syntactic units, such as procedures or statements;second, they aggressively reduce the volume of information collected and reported, although aggregation can hide striking differences in program behavior. This paper addresses both concerns by exploiting the hardware counters available in most modern processors and by incorporating two concepts from data flow analysis-flow and context sensitivity-to report more context for measurements. This paper extends our previous work on efficient path profiling to flow sensitive profiling, which associates hardware performance metrics with a path through a procedure. In addition, it describes a data structure, the calling context tree, that efficiently captures calling contexts for procedure-level measurements. Our measurements show that the SPEC95 benchmarks execute a small number (3-28) of hot paths that account for 9-98% of their L1 data cache misses. Moreover, these hot paths are concentrated in a few routines, which have complex dynamic behavior.
暂无评论