Search results - Inner Mongolia University Library

2nd USENIX Workshop on Hot Topics in Parallelism, HotPar 2010

Author: McCool, Michael D. (Intel)

Many-core processors target improved computational performance by making available various forms of architectural parallelism, including but not limited to multiple cores and vector instructions. However, approaches to parallel programming based on targeting these low-level parallel mechanisms directly lead to overly complex, non-portable, and often unscalable and unreliable code. A more structured approach to designing and implementing parallel algorithms is useful to reduce the complexity of developing software for such processors, and is particularly relevant for many-core processors with a large amount of parallelism and multiple parallelism mechanisms. In particular, efficient and reliable parallel programs can be designed around the composition of deterministic algorithmic skeletons, or patterns. While improving the productivity of experts, specific patterns and fused combinations of patterns can also guide relatively inexperienced users to developing efficient algorithm implementations that have good scalability. The approach to parallelism described in this document includes both collective "data-parallel" patterns such as map and reduce as well as structured "task-parallel" patterns such as pipelining and superscalar task graphs. The structured pattern-based approach, like data-parallel models, addresses issues of both data access and parallel task distribution in a common framework. Optimization of data access is important for both many-core processors with shared memory systems and accelerators with their own memories not directly attached to the host processor. A catalog of useful structured serial and parallel patterns will be presented. Serial patterns are presented because structured parallel programming can be considered an extension of structured control flow in serial programming. We will emphasize deterministic patterns in order to support the development of systems that automatically avoid unsafe race conditions and deadlock. © HotPar 2010.

Keywords: parallel processing systems
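
The abstract above names map and reduce as collective data-parallel patterns. As a rough illustration only, and not the paper's own framework, the following Python sketch composes the two. The helper names `square` and `map_reduce` are hypothetical, and the thread pool stands in for whatever parallel mechanism a real system would use.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def square(x):
    """Elemental function applied independently to every item (the map step)."""
    return x * x


def map_reduce(func, combine, items, initial, workers=4):
    """Apply func to each item in a worker pool, then fold the results in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, whatever order tasks finish in.
        mapped = list(pool.map(func, items))
    # Sequential left-to-right fold, so the result does not depend on scheduling.
    return reduce(combine, mapped, initial)


total = map_reduce(square, operator.add, range(10), 0)
print(total)
```

Because the map results are collected in input order and the fold runs left to right, the output is the same on every run, which is the kind of determinism the abstract emphasizes.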

Compilation Techniques for High Level Parallel Code

International Journal of Parallel Programming, 2010, Vol. 38, No. 1, pp. 4-18

Authors: Gaster, Benedict R.; Bainbridge, Tim; Lacey, David; Gardner, David. Affiliations: AMD, Sunnyvale, CA, USA; ClearSpeed Technol Plc, Bristol, Avon, England; XMOS Semicond, Bristol, Avon, England

This paper describes methods to adapt existing optimizing compilers for sequential languages to produce code for parallel processors. In particular it looks at targeting data-parallel processors using SIMD (single instruction multiple data) or vector processors where users need features similar to high-level control flow across the data-parallelism. The premise of the paper is that we do not want to write an optimizing compiler from scratch. Rather, a method is described that allows a developer to take an existing compiler for a sequential language and modify it to handle SIMD extensions. As well as modifying the front-end, the intermediate representation and the code generation to handle the parallelism, specific optimizations are described to target the architecture efficiently.

Keywords: parallel programming; compilers; optimization

HotPar 2010 - 2nd USENIX Workshop on Hot Topics in Parallelism

2nd USENIX Workshop on Hot Topics in Parallelism, HotPar 2010

The proceedings contain 16 papers. The topics discussed include: towards parallelizing the layout engine of Firefox; synchronization via scheduling: managing shared state in video games; separating functional and parallel correctness using nondeterministic sequential specifications; user-defined distributions and layouts in Chapel: philosophy and framework; opportunities and challenges of parallelizing speech recognition; task superscalar: using processors as functional units; OoOJava: an out-of-order approach to parallel programming; a balanced programming model for emerging heterogeneous multicore systems; and reflective parallel programming: extensible and high-level control of runtime, compiler, and application interaction.

Gossamer: A lightweight programming framework for multicore machines

2nd USENIX Workshop on Hot Topics in Parallelism, HotPar 2010

Authors: Roback, Joseph A.; Andrews, Gregory R. Affiliation: Department of Computer Science, University of Arizona, Tucson, AZ, United States

OoOJava: An out-of-order approach to parallel programming

2nd USENIX Workshop on Hot Topics in Parallelism, HotPar 2010

Authors: Jenista, James C.; Eom, Yong Hun; Demsky, Brian

Developing parallel software using current tools can be challenging. Developers must reason carefully about the use of locks to avoid both race conditions and deadlocks. We present a compiler-assisted approach to parallel programming inspired by out-of-order hardware. In our approach, the developer annotates code blocks as reorderable to decouple these blocks from the parent thread of execution. OoOJava uses static analysis to extract all data dependences from both variables and data structures to generate an executable that is guaranteed to preserve the behavior of the original sequential code. We have implemented OoOJava and achieved significant speedups for a ray tracer and a K-Means cluster benchmark. The straightforward development model, compiler feedback, and speedups are promising indicators that a simple deterministic parallel programming model with strong guarantees can become mainstream. © HotPar 2010.

Keywords: parallel programming

Three high performance architectures in the parallel APMC boat

9th International Workshop on Parallel and Distributed Methods in Verification, PDMC 2010 - Joint with the 2nd International Workshop on High-Performance Computational Systems Biology, HiBi 2010

Authors: Hamidouche, Khaled; Borghi, Alexandre; Esterie, Pierre; Falcou, Joel; Peyronnet, Sylvain. Affiliation: LRI, CNRS, Univ. Paris-Sud, 91405 Orsay, France

ISBN: (print) 9780769542652

Approximate probabilistic model checking, and more generally sampling-based model checking methods, proceed by drawing independent executions of a given model and by checking a temporal formula on these executions. In theory, these methods can be easily massively parallelized, but in practice one has to consider, for this purpose, important aspects such as the communication paradigm, the physical architecture of the machine, etc. Moreover, being able to develop multiple implementations of this algorithm on architectures as different as a cluster or many-cores requires various levels of expertise that may be problematic to gather. In this paper we propose to investigate the runtime behavior of approximate probabilistic model checking on various state-of-the-art parallel machines - clusters, SMP, hybrid SMP clusters and the Cell processor - using a high-level parallel programming tool based on the Bulk Synchronous Parallelism paradigm to quickly instantiate model checking problems over a large variety of parallel architectures. Our conclusion assesses the relative efficiency of these architectures with respect to the algorithm classes and promotes guidelines for further work on parallel APMC implementation. © 2010 IEEE.

Keywords: parallel programming
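
The abstract above describes sampling-based model checking as drawing independent executions of a model and checking a formula on each one. The Python sketch below shows that general idea on a toy model. It is not the authors' APMC implementation or their BSP-based tool: the random-walk model, the property, and the names `sample_path`, `eventually_reaches`, and `estimate_probability` are all assumptions made for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def sample_path(seed, steps=20):
    """Draw one independent execution of a toy model: a +/-1 random walk from 0."""
    rng = random.Random(seed)  # one generator per sample keeps runs independent
    state, path = 0, [0]
    for _ in range(steps):
        state += rng.choice((-1, 1))
        path.append(state)
    return path


def eventually_reaches(path, target=3):
    """Check the bounded property 'some state on the path is >= target'."""
    return any(s >= target for s in path)


def check_sample(seed):
    """Generate one execution and return whether it satisfies the property."""
    return eventually_reaches(sample_path(seed))


def estimate_probability(num_samples=2000, workers=4):
    """Estimate the property's probability as the fraction of satisfying samples."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        verdicts = list(pool.map(check_sample, range(num_samples)))
    return sum(verdicts) / num_samples


estimate = estimate_probability()
print(f"estimated probability: {estimate:.3f}")
```

Since each sample depends only on its own seed, the samples can be split across workers without communication until the final count. The thread pool here only illustrates that structure. It is not meant to reproduce the cluster, SMP, or Cell setups the paper evaluates.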

Task superscalar: Using processors as functional units

2nd USENIX Workshop on Hot Topics in Parallelism, HotPar 2010

Authors: Etsion, Yoav; Ramirez, Alex; Badia, Rosa M.; Ayguade, Eduard; Labarta, Jesus; Valero, Mateo

The complexity of parallel programming greatly limits the effectiveness of chip-multiprocessors (CMPs). This paper presents the case for task superscalar pipelines, an abstraction of traditional out-of-order superscalar pipelines, that orchestrates an entire chip-multiprocessor to the same degree that out-of-order pipelines manage functional units. Task superscalar leverages an emerging class of task-based dataflow programming models to relieve programmers from explicitly managing parallel resources. We posit that task superscalar overcomes many of the limitations of instruction-level out-of-order pipelines, and provides a scalable interface for CMPs. © HotPar 2010.

Keywords: parallel programming

A balanced programming model for emerging heterogeneous multicore systems

2nd USENIX Workshop on Hot Topics in Parallelism, HotPar 2010

Authors: Liu, Wei; Lewis, Brian; Zhou, Xiaocheng; Chen, Hu; Gao, Ying; Yan, Shoumeng; Luo, Sai; Saha, Bratin. Affiliation: Intel Corporation

Computer systems are moving towards a heterogeneous architecture with a combination of one or more CPUs and one or more accelerator processors. Such heterogeneous systems pose a new challenge to the parallel programming community. Languages such as OpenCL and CUDA provide a program environment for such systems. However, they focus on data parallel programming where the majority of computation is carried out by the accelerators. Our view is that, in the future, accelerator processors will be tightly coupled with the CPUs, be available in different system architectures (e.g., integrated and discrete), and systems will be dynamically reconfigurable. In this paper we advocate a balanced programming model where computation is balanced between the CPU and its accelerators. This model supports sharing virtual memory between the CPU and the accelerator processors so the same data structures can be manipulated by both sides. It also supports task-parallel as well as data-parallel programming, fine-grained synchronization, thread scheduling, and load balancing. This model not only leverages the computational capability of CPUs, but also allows dynamic system reconfiguration, and supports different platform configurations. To help demonstrate the practicality of our programming model, we present performance results for a preliminary implementation on a computer system with an Intel Core i7 processor and a discrete Larrabee processor. These results show that the model's most performance-critical part, its shared virtual memory implementation, simplifies programming without hurting performance. © HotPar 2010.

Keywords: Program processors

Reflective parallel programming: extensible and high-level control of runtime, compiler, and application interaction

2nd USENIX Workshop on Hot Topics in Parallelism, HotPar 2010

Authors: Matsakis, Nicholas D.; Gross, Thomas R. Affiliation: ETH Zurich, Switzerland

Thread support in most languages is opaque and low-level. Primitives like wait and signal do not allow users to determine the relative ordering of statements in different threads in advance. In this paper, we extend the reflection and metaprogramming facilities of object-oriented languages to cover parallel program schedules. The user can then access objects representing the extant threads or other parallel tasks. These objects can be used to modify or query happens-before relations, locks, and other high-level scheduling information. These high-level models enable users to design their own parallel abstractions, visualizers, safety checks, and other tools in ways that are not possible today. We discuss one implementation of this technique, the intervals library, and show how the presence of a first-class, queryable program schedule allows us to support a flexible data race protection scheme. The scheme supports both static and dynamic checks and also permits users to define their own "pluggable" safety checks based on the reflective model of the program schedule. © HotPar 2010.

Keywords: Object oriented programming

Asynchronous Stream Processing with S-Net

International Journal of Parallel Programming, 2010, Vol. 38, No. 1, pp. 38-67

Authors: Grelck, Clemens; Scholz, Sven-Bodo; Shafarenko, Alex. Affiliations: Univ Amsterdam, Inst Informat, NL-1098 XG Amsterdam, Netherlands; Univ Hertfordshire, Dept Comp Sci, Hatfield AL10 9AB, Herts, England

We present the rationale and design of S-Net, a coordination language for asynchronous stream processing. The language achieves a near-complete separation between the application code, written in any conventional programming language, and the coordination/communication code written in S-Net. Our approach supports a component technology with flexible software reuse. No extension of the conventional language is required. The interface between S-Net and the application code is in terms of one additional library function. The application code is componentised and presented to S-Net as a set of components, called boxes, each encapsulating a single tuple-to-tuple function. Apart from the boxes defined using an external compute language, S-Net features two built-in boxes: one for network housekeeping and one for data-flow style synchronisation. Streaming network composition under S-Net is based on four network combinators, which have both deterministic and nondeterministic versions. Flexible software reuse is comprehensive, with the box interfaces and even the network structure being subject to subtyping. We propose an inheritance mechanism, named flow inheritance, that is specifically geared towards stream processing. The paper summarises the essential language constructs and type concepts and gives a short application example.

Keywords: Component system; Coordination language; Stream processing; Record subtyping; Declarative multicore programming
