Many-core processors target improved computational performance by making available various forms of architectural parallelism, including but not limited to multiple cores and vector instructions. However, approaches t...
详细信息
This paper describes methods to adapt existing optimizing compilers for sequential languages to produce code for parallel processors. In particular it looks at targeting data-parallel processors using SIMD (single ins...
详细信息
This paper describes methods to adapt existing optimizing compilers for sequential languages to produce code for parallel processors. In particular it looks at targeting data-parallel processors using SIMD (single instruction multiple data) or vector processors where users need features similar to high-level control flow across the data-parallelism. The premise of the paper is that we do not want to write an optimizing compiler from scratch. Rather, a method is described that allows a developer to take an existing compiler for a sequential language and modify it to handle SIMD extensions. As well as modifying the front-end, the intermediate representation and the code generation to handle the parallelism, specific optimizations are described to target the architecture efficiently.
The proceedings contain 16 papers. The topics discussed include: towards parallelizing the layout engine of Firefox;synchronization via scheduling: managing shared state in video games;separating functional and parall...
The proceedings contain 16 papers. The topics discussed include: towards parallelizing the layout engine of Firefox;synchronization via scheduling: managing shared state in video games;separating functional andparallel correctness using nondeterministic sequential specifications;user-defined distributions and layouts in chapel: philosophy and framework;opportunities and challenges of parallelizing speech recognition;task superscalar: using processors as functional units;OoOJava: an out-of-order approach to parallelprogramming;a balanced programming model for emerging heterogeneous multicore systems;and reflective parallelprogramming extensible and high-level control of runtime, compiler, and application interaction.
Developing parallel software using current tools can be challenging. Developers must reason carefully about the use of locks to avoid both race conditions and deadlocks. We present a compiler-assisted approach to para...
详细信息
Approximate probabilistic model checking, and more generally sampling based model checking methods, proceed by drawing independent executions of a given model and by checking a temporal formula on these executions. In...
详细信息
The complexity of parallelprogramming greatly limits the effectiveness of chip-multiprocessors (CMPs). This paper presents the case for task superscalar pipelines, an abstraction of traditional out-of-order superscal...
详细信息
Computer systems are moving towards a heterogeneous architecture with a combination of one or more CPUs and one or more accelerator processors. Such heterogeneous systems pose a new challenge to the parallel programmi...
详细信息
Thread support in most languages is opaque and low-level. Primitives like wait and signal do not allow users to determine the relative ordering of statements in different threads in advance. In this paper, we extend t...
详细信息
We present the rationale and design of S-Net, a coordination language for asynchronous stream processing. The language achieves a near-complete separation between the application code, written in any conventional prog...
详细信息
We present the rationale and design of S-Net, a coordination language for asynchronous stream processing. The language achieves a near-complete separation between the application code, written in any conventional programming language, and the coordination/communication code written in S-Net. Our approach supports a component technology with flexible software reuse. No extension of the conventional language is required. The interface between S-Net and the application code is in terms of one additional library function. The application code is componentised and presented to S-Net as a set of components, called boxes, each encapsulating a single tuple-to-tuple function. Apart from the boxes defined using an external compute language, S-Net features two built-in boxes: one for network housekeeping and one for data-flow style synchronisation. Streaming network composition under S-Net is based on four network combinators, which have both deterministic and nondeterministic versions. Flexible software reuse is comprehensive, with the box interfaces and even the network structure being subject to subtyping. We propose an inheritance mechanism, named flow inheritance, that is specifically geared towards stream processing. The paper summarises the essential language constructs and type concepts and gives a short application example.
暂无评论