This paper develops a methodology for compiling and executing irregular parallel programs. Such programs implement parallel operations whose size and work distribution depend on input data. We show a fundamental relat...
详细信息
ISBN:
(纸本)0897914759
This paper develops a methodology for compiling and executing irregular parallel programs. Such programs implement parallel operations whose size and work distribution depend on input data. We show a fundamental relationship between three quantities that characterize an irregular parallel computation: the total available parallelism, the optimal grain size, and the statistical variance of execution times for individual tasks. This relationship yields a dynamic scheduling algorithm that substantially reduces the overhead of executing irregular parallel operations. We incorporated this algorithm into an extended Fortran compiler. The compiler accepts as input a subset of Fortran D which includes blocked and cyclic decompositions and perfect alignment;it outputs Fortran 77 augmented with calls to library routines written in C. For irregular parallel operations, the compiled code gathers information about available parallelism and task execution time variance and uses this information to schedule the operation. On distributed memory architectures, the compiler encodes information about data access patterns for the runtime scheduling system so that it can preserve communication locality. We evaluated these compilation techniques using a set of application programs including climate modeling, circuit simulation, and x-ray tomography, that contain irregular parallel operations. The results demonstrate that, for these applications, the dynamic techniques described here achieve near-optimal efficiency on large numbers of processors. In addition, they perform significantly better, on these problems, than any previously proposed static or dynamic scheduling algorithm.
Organic programming (OP) is our proposed and already emerging programming model which overcomes some of the limitations of current practice in software development in general and of object-oriented programming (OOP) i...
详细信息
We propose an aspect-oriented programming (AOP) language called Aspectual Caml based on a strongly-typed functional language Objective Caml with two AOP mechanisms similar to those in AspectJ language. This paper desc...
详细信息
We propose an aspect-oriented programming (AOP) language called Aspectual Caml based on a strongly-typed functional language Objective Caml with two AOP mechanisms similar to those in AspectJ language. This paper describes the design and implementation issues of those AOP mechanisms that give us insights into the interaction between AOP features and common features in strongly-typed functional languages such as type inference, polymorphic types and curried functions. We implemented a prototype compiler of the language and used the language for separating crosscutting concerns in application programs, including for separating descriptions of a type system from compiler descriptions.
Concepts are an essential language feature for generic programming in the large. Concepts allow for succinct expression of constraints on type parameters of generic algorithms, enable systematic organization of proble...
详细信息
Concepts are an essential language feature for generic programming in the large. Concepts allow for succinct expression of constraints on type parameters of generic algorithms, enable systematic organization of problem domain abstractions, and make generic algorithms easier to use. In this paper we present the design of a type system and semantics for concepts that is suitable for non-type-inferencing languages. Our design shares much in common with the type classes of Haskell, though our primary influence is from best practices in the C++ community, where concepts are used to document type requirements for templates in generic libraries. Concepts include a novel combination of associated types and same-type constraints that do not appear in type classes, but that are similar to nested types and type sharing in ML.
It is now well established that the device scaling predicted by Moore's Law is no longer a viable option for increasing the clock frequency of future uniprocessor systems at the rate that had been sustained during...
详细信息
It is now well established that the device scaling predicted by Moore's Law is no longer a viable option for increasing the clock frequency of future uniprocessor systems at the rate that had been sustained during the last two decades. As a result, future systems are rapidly moving from uniprocessor to multiprocessor configurations, so as to use parallelism instead of frequency scaling as the foundation for increased compute capacity. The dominant emerging multiprocessor structure for the future is a Non- Uniform Cluster Computing (NUCC) system with nodes that are built out of multi-core SMP chips with non-uniform memory hierarchies, and interconnected in horizontally scalable cluster configurations such as blade servers. Unlike previous generations of hardware evolution, this shift will have a major impact on existing software. Current OO language facilities for concurrent and distributed programming are inadequate for addressing the needs of NUCC systems because they do not support the notions of non-uniform data access within a node, or of tight coupling of distributed nodes. We have designed a modern object-oriented programminglanguage, X10, for high performance, high productivity programming of NUCC systems. A member of the partitioned global address space family of languages, X10 highlights the explicit reification of locality in the form of places;lightweight activities embodied in async, future, foreach, and ateach constructs;a construct for termination detection (finish);the use of lock-free synchronization (atomic blocks);and the manipulation of cluster-wide global data structures. We present an overview of the X10 programming model and language, experience with our reference implementation, and results from some initial productivity comparisons between the X10 and JAVA™ languages. Copyright 2005acm.
It is currently difficult to build practical and reliable programming systems out of distributed and resource-constrained sensor devices. The state of the art in today's sensornet programming is centered around a ...
详细信息
It is currently difficult to build practical and reliable programming systems out of distributed and resource-constrained sensor devices. The state of the art in today's sensornet programming is centered around a component-based language called nesC. nesC is a node-level language-a program is written for an individual node in the network-and nesC programs use the services of an operating system called TinyOS. We are pursuing an approach to programming sensor networks that significantly raises the level of abstraction over this practice. The critical change is one of perspective: rather than writing programs from the point of view of an individual node, programmers implement a central program that conceptually has access to the entire network. This approach pushes to the compiler the task of producing node-level programs that implement the desired behavior. We present the Pleiades programminglanguage, its compiler, and its runtime. The Pleiades language extends the C language with constructs that allow programmers to name and access node-local state within the network and to specify simple forms of concurrent execution. The compiler and runtime system cooperate to implement Pleiades programs efficiently and reliably. First, the compiler employs a novel program analysis to translate Pleiades programs into message-efficient units of work implemented in nesC. The Pleiades runtime system orchestrates execution of these units, using TinyOS services, across a network of sensor nodes. Second, the compiler and runtime system employ novel locking, deadlock detection, and deadlock recovery algorithms that guarantee serializability in the face of concurrent execution. We illustrate the readability, reliability and efficiency benefits of the Pleiades language through detailed experiments, and demonstrate that the Pleiades implementation of a realistic application performs similar to a hand-coded nesC version that contains more than ten times as much code.
The field of quantum algorithms is vibrant. Still, there is currently a lack of programminglanguages for describing quantum computation on a practical scale, i.e., not just at the level of toy problems. We address th...
详细信息
ISBN:
(纸本)9781450320146
The field of quantum algorithms is vibrant. Still, there is currently a lack of programminglanguages for describing quantum computation on a practical scale, i.e., not just at the level of toy problems. We address this issue by introducing Quipper, a scalable, expressive, functional, higher-order quantum programminglanguage. Quipper has been used to program a diverse set of non-trivial quantum algorithms, and can generate quantum gate representations using trillions of gates. It is geared towards a model of computation that uses a classical computer to control a quantum device, but is not dependent on any particular model of quantum hardware. Quipper has proven effective and easy to use, and opens the door towards using formal methods to analyze quantum algorithms.
We describe race-free properties of a hardware description language called GEZEL. The language describes networks of cycle-true finite-state-machines with datapaths (FSMDs). We derive a set of four rules under which a...
详细信息
We propose a technique to efficiently search a large family of abstractions in order to prove a query using a parametric dataflow analysis. Our technique either finds the cheapest such abstraction or shows that none e...
详细信息
ISBN:
(纸本)9781450320146
We propose a technique to efficiently search a large family of abstractions in order to prove a query using a parametric dataflow analysis. Our technique either finds the cheapest such abstraction or shows that none exists. It is based on counterexample-guided abstraction refinement but applies a novel meta-analysis on abstract counterexample traces to efficiently find abstractions that are incapable of proving the query. We formalize the technique in a generic framework and apply it to two analyses: a type-state analysis and a thread-escape analysis. We demonstrate the effectiveness of the technique on a suite of Java benchmark programs.
暂无评论