Over the past few years, my colleagues and I have written and distributed a number of general purpose libraries covering a wide range of computing areas such as I/O, memory allocation, container data types, and sorting. Published studies showed that these libraries are more general, flexible and efficient than comparable packages as application construction tools. Our libraries are based on an architecture in which two main interfaces are made explicit: disciplines to define resource requirements, and methods to define resource management. This paper discusses the discipline and method library architecture and a resource-oriented analysis approach for analyzing and designing libraries based on this architecture.
ISBN: 0818684798 (print)
Multi-FPGA systems are mainly composed of programmable logic devices and external memory. Up-to-date FPGAs also contain embedded static RAMs, which have shorter access times than the external SRAMs. This paper presents a dataflow-oriented algorithm that uses the small embedded memories as local caches for data processing. The algorithm offers a high level of parallelism and efficient use of processing resources. This is done in the context of hardware-software co-design. The objective is to automatically implement parts of C code requiring a high processing rate on a reconfigurable system. An example implementation on a 400-Kgate, 8-Mbyte multi-FPGA system will be described.
ISBN: 9780897919791 (print)
Previously, we developed a type system to ensure secure information flow in a sequential, imperative programming language [VSI96]. Program variables are classified as either high or low security; intuitively, we wish to prevent information from flowing from high variables to low variables. Here, we extend the analysis to deal with a multi-threaded language. We show that the previous type system is insufficient to ensure a desirable security property called noninterference. Noninterference basically means that the final values of low variables are independent of the initial values of high variables. By modifying the sequential type system, we are able to guarantee noninterference for concurrent programs. Crucial to this result, however, is the use of purely nondeterministic thread scheduling. Since implementing such scheduling is problematic, we also show how a more restrictive type system can guarantee noninterference, given a more deterministic (and easily implementable) scheduling policy, such as round-robin time slicing. Finally, we consider the consequences of adding a clock to the language.
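The high/low classification can be illustrated with a minimal sketch of a sequential information-flow check. This is not the type system from the paper: the tuple-based command encoding and the names `check`, `expr_level`, `LOW`, and `HIGH` are hypothetical, and the sketch models neither threads nor scheduling. It only shows the two kinds of flow such a system has to reject, namely a direct assignment from a high variable to a low one and an assignment to a low variable under a high guard.

```python
LOW, HIGH = 0, 1  # hypothetical two-point security lattice, LOW <= HIGH


def expr_level(reads, env):
    """Level of an expression: the highest level among the variables it reads."""
    return max((env[v] for v in reads), default=LOW)


def check(cmd, env, pc=LOW):
    """Return True if cmd causes no flow from HIGH to LOW variables.

    Commands use a hypothetical encoding:
      ("assign", target, [vars_read])
      ("if", [guard_vars], [then_cmds], [else_cmds])
    pc is the level of the guards enclosing the command.
    """
    kind = cmd[0]
    if kind == "assign":
        _, target, reads = cmd
        # Both the assigned value and the enclosing guards must be
        # no more secret than the target variable.
        return max(pc, expr_level(reads, env)) <= env[target]
    if kind == "if":
        _, guard, then_cmds, else_cmds = cmd
        inner = max(pc, expr_level(guard, env))
        return all(check(c, env, inner) for c in then_cmds + else_cmds)
    raise ValueError(f"unknown command kind: {kind!r}")


env = {"h": HIGH, "l": LOW}
explicit_leak = ("assign", "l", ["h"])                    # l := h
implicit_leak = ("if", ["h"], [("assign", "l", [])],      # if h then l := 1
                 [("assign", "l", [])])                   # else l := 0
safe = ("assign", "h", ["l"])                             # h := l
```

Under these assumptions, `check` rejects both `explicit_leak` and `implicit_leak` and accepts `safe`. The abstract's point is that a check of this sequential kind is not enough once threads are added.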
In standard control-flow analyses for higher-order languages, a single abstract binding for a variable represents a set of exact bindings, and a single abstract reference cell represents a set of exact reference cells. While such analyses provide useful may-alias information, they are unable to answer must-alias questions about variables and cells, as these questions ask about equality of specific bindings and references. In this paper, we present a novel program analysis for higher-order languages that answers must-alias questions. At every program point, the analysis associates with each variable and abstract cell a cardinality, which is either single or multiple. If variable x is single at program point p, then all bindings for x in the heap reachable from the environment at p hold the same value. If abstract cell r is single at p, then at most one exact cell corresponding to r is reachable from the environment at p. Must-alias information facilitates various program optimizations such as lightweight closure conversion. In addition, must-alias information permits analyses to perform strong updates on abstract reference cells known to be single. Strong updates improve analysis precision for programs that make significant use of state. A prototype implementation of our analysis yields encouraging results. Over a range of benchmarks, our analysis classifies a large majority of the variables as single.
ISBN: 9780897919791 (print)
Many parallel programs are written in SPMD style, i.e. by running the same sequential program on all processes. SPMD programs include synchronization, but it is easy to write incorrect synchronization patterns. We propose a system that verifies a program's synchronization pattern. We also propose language features to make the synchronization pattern more explicit and easily checked. We have implemented a prototype of our system for Split-C and successfully verified the synchronization structure of realistic programs.
Much research has been devoted to studies of and algorithms for memory management based on garbage collection or explicit allocation and deallocation. An alternative approach, region-based memory management, has been known for decades, but has not been well-studied. In a region-based system each allocation specifies a region, and memory is reclaimed by destroying a region, freeing all the storage allocated therein. We show that on a suite of allocation-intensive C programs, regions are competitive with malloc/free and sometimes substantially faster. We also show that regions support safe memory management with low overhead. Experience with our benchmarks suggests that modifying many existing programs to use regions is not difficult.
ISBN: 9780897919791 (print)
This paper formalizes the folklore result that strongly-typed applets are more secure than untyped ones. We formulate and prove several security properties that all well-typed applets possess, and identify sufficient conditions for the applet execution environment to be safe, such as procedural encapsulation, type abstraction, and systematic type-based placement of run-time checks. These results are a first step towards formal techniques for developing and validating safe execution environments for applets.
ISBN: 9780897919876 (print)
The conditional branch has long been considered an expensive operation. The relative cost of conditional branches has increased as recently designed machines are now relying on deeper pipelines and higher multiple issue. Reducing the number of conditional branches executed can often result in a substantial performance benefit. This paper describes a code-improving transformation to reorder sequences of conditional branches. First, sequences of branches that can be reordered are detected in the control flow. Second, profiling information is collected to predict the probability that each branch will transfer control out of the sequence. Third, the cost of performing each conditional branch is estimated. Fourth, the most beneficial ordering of the branches based on the estimated probability and cost is selected. The most beneficial ordering often included the insertion of additional conditional branches that did not previously exist in the sequence. Finally, the control flow is restructured to reflect the new ordering. The results of applying the transformation were significant reductions in the dynamic number of instructions and branches, as well as decreases in execution time.
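The ordering step can be sketched under simplifying assumptions that are not taken from the paper: the branches in a sequence test mutually exclusive conditions, each has a profiled exit probability and a positive estimated cost, and no extra branches are inserted. With those assumptions, sorting by exit probability per unit cost minimizes the expected cost of the sequence. The names `Branch`, `expected_cost`, and `reorder` are hypothetical.

```python
from typing import List, NamedTuple


class Branch(NamedTuple):
    name: str
    exit_prob: float  # profiled probability that this branch leaves the sequence
    cost: float       # estimated cost of executing the branch (assumed > 0)


def expected_cost(order: List[Branch]) -> float:
    """Expected cost of executing the sequence in the given order.

    A branch is executed only if no earlier branch has already exited,
    so its cost is weighted by the probability of reaching it.
    """
    reach_prob = 1.0
    total = 0.0
    for branch in order:
        total += reach_prob * branch.cost
        reach_prob -= branch.exit_prob
    return total


def reorder(branches: List[Branch]) -> List[Branch]:
    """Order branches by decreasing exit probability per unit cost."""
    return sorted(branches, key=lambda b: b.exit_prob / b.cost, reverse=True)


original = [Branch("a", 0.1, 1.0), Branch("b", 0.7, 1.0), Branch("c", 0.2, 1.0)]
improved = reorder(original)
```

Here `improved` places `b` first, then `c`, then `a`, and `expected_cost(improved)` is lower than `expected_cost(original)`. The transformation in the abstract goes further than this sketch, since it may also insert branches that were not in the original sequence.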
ISBN: 9780897919791 (print)
The problems involved in developing efficient parallel programs have proved harder than those in developing efficient sequential ones, both for programmers and for compilers. Although program calculation has been found to be a promising way to solve these problems in the sequential world, we believe that much more effort is needed to study its effective use in the parallel world. In this paper, we propose a calculational framework for the derivation of efficient parallel programs with two main innovations. First, we propose a novel inductive synthesis lemma, based on which an elementary but powerful parallelization theorem is developed. Second, we make the first attempt to construct a calculational algorithm for parallelization, deriving associative operators from data type definitions and making full use of existing fusion and tupling calculations. Being more constructive, our method is not only helpful in the design of efficient parallel programs in general but also promising in the construction of parallelizing compilers. Several interesting examples are used for illustration.
ISBN: 9780897919876 (print)
Partial redundancy elimination (PRE), the most important component of global optimizers, generalizes the removal of common subexpressions and loop-invariant computations. Achieving complete PRE while incurring acceptable code growth is the main focus of this study. An algorithm for complete removal of partial redundancies, based on the integration of code motion and control flow restructuring, is presented. In contrast to existing complete techniques, we resort to restructuring merely to remove obstacles to code motion, rather than to carry out the actual optimization.
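As a small illustration of what a partial redundancy is, the sketch below shows an expression that is recomputed on one path but not the other, and the same code after classic code motion. It does not reproduce the algorithm from the paper and involves no control flow restructuring; `CountingAdd`, `before`, and `after` are hypothetical names used only to count evaluations.

```python
class CountingAdd:
    """Callable that computes a + b and counts how often it is evaluated."""

    def __init__(self):
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        return a + b


def before(cond, a, b, add):
    # a + b is evaluated twice when cond is true and once otherwise:
    # the second evaluation is redundant on only one of the two paths.
    x = add(a, b) if cond else 0
    y = add(a, b)
    return x + y


def after(cond, a, b, add):
    # Code motion: evaluate a + b once on every path and reuse the result.
    if cond:
        t = add(a, b)
        x = t
    else:
        x = 0
        t = add(a, b)  # inserted on the path where the value was not yet available
    y = t
    return x + y
```

Both versions return the same value, but `after` evaluates the expression exactly once on either path, whereas `before` evaluates it twice when `cond` is true.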