Data dependencies have become one of the main bottlenecks of current superscalar processors. Data speculation is gaining popularity as a mechanism to avoid the ordering imposed by data dependencies. Loads and stores a...
详细信息
ISBN:
(纸本)9780897919029
Data dependencies have become one of the main bottlenecks of current superscalar processors. Data speculation is gaining popularity as a mechanism to avoid the ordering imposed by data dependencies. Loads and stores are very good candidates for data speculation since their effective address has a regular behavior and then, they are highly predictable. In this paper we propose a mechanism called Address Prediction and Data Prefetching that allows load instructions to obtain their data at the decode stage. Besides, the effective address of load and store instructions is also predicted. These instructions and those dependent on them are speculatively executed. The technique has been evaluated for an out-of-order processor with a realistic configuration. The performance gain is about 19% in average and it is much higher for some benchmarks (up to 35%).
Module mechanisms have received considerable theoretical attention, but the associated concepts of separate compilation and linking have not been emphasized. Anomalous module systems have emerged in functional and obj...
详细信息
ISBN:
(纸本)9780897918534
Module mechanisms have received considerable theoretical attention, but the associated concepts of separate compilation and linking have not been emphasized. Anomalous module systems have emerged in functional and object-oriented programming where software components are not separately typecheckable and compilable. In this paper we provide a context where linking can be studied, and separate compilability can be formally stated and checked. We propose a framework where each module is separately compiled to a self-contained entity called a linkset;we show that separately compiled, compatible modules can be safely linked together.
This paper provides a link between the formulation of static program analyses using the framework of abstract interpretation (popular for functional languages) and using the more classical framework of data flow analy...
详细信息
ISBN:
(纸本)0897914813
This paper provides a link between the formulation of static program analyses using the framework of abstract interpretation (popular for functional languages) and using the more classical framework of data flow analysis (popular for imperative languages). In particular we show how the classical notions of fastness, rapidity and k-boundedness carry over to the abstract interpretation framework and how this may be used to bound the number of times a functional should be unfolded in order to yield the fixed point. This is supplemented with a number of results on how to calculate the bounds for iterative forms (as for tail recursion), for linear forms (as for one nested recursive call), and for primitive recursive forms. In some cases this improves the `worst case' results of [9], but more importantly it gives much better `average case' results.
We present a new predicative and decidable type system, called ML≤, suitable for languages that integrate functional programming and parametric polymorphism in the tradition of ML [21, 28], and class-based object-ori...
详细信息
ISBN:
(纸本)9780897918534
We present a new predicative and decidable type system, called ML≤, suitable for languages that integrate functional programming and parametric polymorphism in the tradition of ML [21, 28], and class-based object-oriented programming and higher-order multi-methods in the tradition of CLOS [12]. Instead of using extensible records as a foundation for object-oriented extensions of functional languages, we propose to reinterpret ML datatype declarations as abstract and concrete class declarations, and to replace pattern matching on run-time values by dynamic dispatch on run-time types. ML≤ is based on universally quantified polymorphic constrained types. Constraints are conjunctions of inequalities between monotypes built from type constructors organized into extensible and partially ordered classes. We give type checking rules for a small, explicitly typed functional language a la XML [20] with multi-methods, show that the resulting system has decidable minimal types, and discuss subject reduction. Finally, we propose a new object-oriented programming language based on the ML≤ type system.
Conservative garbage collectors can automatically reclaim unused memory in the absence of precise pointer location information. If a location can possibly contain a pointer, it is treated by the collector as though it...
详细信息
ISBN:
(纸本)9781581134506
Conservative garbage collectors can automatically reclaim unused memory in the absence of precise pointer location information. If a location can possibly contain a pointer, it is treated by the collector as though it contained a pointer. Although it is commonly assumed that this can lead to unbounded space use due to misidentified pointers, such extreme space use is rarely observed in practice, and then generally only if the number of misidentified pointers is itself unbounded. We show that if the program manipulates only data structures satisfying a simple GC-robustness criterion, then a bounded number of misidentified pointers can at most result in increasing space usage by a constant factor. We argue that nearly all common data structures are already GC-robust, and it is typically easy to identify and replace those that are not. Thus it becomes feasible to prove space bounds on programs collected by mildly conservative garbage collectors, such as the one in [2]. The worst-case space overhead introduced by such mild conservatism is comparable to the worst-case fragmentation overhead for inherent in any non-moving storage allocator. The same GC-robustness criterion also ensures the absence of temporary space leaks of the kind discussed in [13] for generational garbage collectors.
With the advent of compositional programming models in computer science, applying planning technologies to automatically build workflows for solving large and complex problems in such a paradigm becomes not only techn...
详细信息
Inlining trials are a general mechanism for making better automatic decisions about whether a routine is profitable to inline. Unlike standard source-level inlining heuristics, an inlining trial captures the effects o...
详细信息
ISBN:
(纸本)0897916433
Inlining trials are a general mechanism for making better automatic decisions about whether a routine is profitable to inline. Unlike standard source-level inlining heuristics, an inlining trial captures the effects of optimizations applied to the body of the inlined routine when calculating the costs and benefits of inlining. The results of inlining trials are stored in a persistent database to be reused when making future inlining decisions at similar call sites. Type group analysis can determine the amount of available static information exploited during compilation, and the results of analyzing the compilation of an inlined routine help decide when a future call site would lead to substantially the same generated code as a given inlining trial. We have implemented inlining trials and type group analysis in an optimizing compiler for SELF, and by making wiser inlining decisions we were able to cut compilation time and compiled code space with virtually no loss of execution speed. We believe that inlining trials and type group analysis could be applied effectively to many high-level languages where procedural or functional abstraction is used heavily.
Floyd's method based on well-orderings is the standard approach to proving termination of programs. Much attention has been devoted to generalizing this method to termination of programs that are subjected to fair...
详细信息
ISBN:
(纸本)0897914953
Floyd's method based on well-orderings is the standard approach to proving termination of programs. Much attention has been devoted to generalizing this method to termination of programs that are subjected to fairness constraints. Earlier methods for fair termination tend to be somewhat indirect, relying on program transformations, which reduce the original problem to several termination problems. In this paper we introduce the new concept of stack assertions, which directly - without transformations - quantify progress towards fair termination. Moreover, we show that by one simple program transformation of adding a history variable, usual assertional logic, without fixed-point operators, is sufficiently expressive to form a sound and relatively complete method when used with stack assertions. This result is obtained as part of a substantial simplification of earlier completeness proofs.
In automatic, retargetable compilation low-cost, analytic cost estimation techniques are crucial in order to efficiently steer the optimization process. programming models aimed at optimum expressiveness of parallelis...
详细信息
In automatic, retargetable compilation low-cost, analytic cost estimation techniques are crucial in order to efficiently steer the optimization process. programming models aimed at optimum expressiveness of parallelism, however, are not amenable to static cost estimation. We present a new coordination model, called SPC, that imposes specific restrictions in the synchronization structures that can be programmed. Imposing these restrictions enables the efficient computation of reliable cost estimations paving the way for automatic optimization. Regarding SPC's limited expressiveness we present a conjecture stating that the loss of parallelism when programming in SPC is typically limited to a constant factor of 2 compared to the unrestricted case. This limited loss is outweighed by the unlocked potential of automatic performance optimization as well as the portability that is achieved. We demonstrate how SPC enables automatic program optimizations through a compilation case study involving a line relaxation kernel on a distributed-memory machine.
This paper presents techniques to model circular lazy programs in a strict, purely functional setting. Circular lazy programs model any algorithm based on multiple traversals over a recursive data structure as a singl...
详细信息
暂无评论