ISBN (Print): 9783540852605
A well-known problem in designing high-level parallel programming models and languages is the "granularity problem": the execution of parallel task instances that are too fine-grain incurs large overheads in the parallel run-time and decreases the speed-up achieved by parallel execution. On the other hand, tasks that are too coarse-grain create load-imbalance and do not adequately utilize the parallel machine. In this work we attempt to address this issue with a concept of expressing "composable computations" in a parallel programming model called "Capsules". Such composability allows adjustment of execution granularity at run-time. In Capsules, we provide a unifying framework that allows composition and adjustment of granularity for both data and computation over iteration space and computation space. We show that this concept not only allows the user to express the decision on granularity of execution, but also the decision on the granularity of garbage collection, and other features that may be supported by the programming model. We argue that this adaptability of execution granularity leads to efficient parallel execution by matching the available application concurrency to the available hardware concurrency, thereby reducing parallelization overhead. By matching, we refer to creating coarse-grain Computation Capsules that encompass multiple fine-grain computation instances. In effect, creating coarse-grain computations reduces overhead by simply reducing the number of parallel computations. This leads to: (1) reduced synchronization cost, such as for blocked searches in shared data-structures; (2) reduced distribution and scheduling cost for parallel computation instances; and (3) reduced book-keeping cost to maintain data-structures such as those for unfulfilled data requests. Capsules builds on our prior work, TStreams, a data-flow oriented parallel programming framework. Our results on an SMP machine using the Cascade Face Detector, and the Stereo Visi
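The coarse-graining the abstract describes (many fine-grain instances folded into one schedulable unit) can be sketched in plain Python; `make_capsules`, `run_capsule`, and the `grain` parameter are illustrative names, not the Capsules API:

```python
from concurrent.futures import ThreadPoolExecutor

def make_capsules(tasks, grain):
    # Fold `grain` fine-grain task instances into one coarse-grain capsule.
    return [tasks[i:i + grain] for i in range(0, len(tasks), grain)]

def run_capsule(capsule):
    # One scheduling event now covers `grain` instances, executed serially.
    return [task() for task in capsule]

def run(tasks, grain, workers=4):
    # Schedule capsules, not instances: 1000 tasks at grain=100 means only
    # 10 units need to be distributed, synchronized, and book-kept.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunked = pool.map(run_capsule, make_capsules(tasks, grain))
    return [r for capsule in chunked for r in capsule]

tasks = [(lambda i=i: i * i) for i in range(1000)]
out = run(tasks, grain=100)
```

Raising `grain` trades concurrency for lower overhead; the abstract's point is that this knob belongs in the programming model rather than in application code.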
The most visible facet of the Computationally-Oriented Display Environment (CODE) is its graphical interface. However, the most important fact about CODE is that it is a programming system based on a formal unified co...
ISBN (Print): 9789537138158
The advancement of computer technology and the increasing complexity of research problems are creating the need to teach parallel programming in higher education more effectively. In this paper we present StarHPC, a system solution that supports teaching parallel programming in courses at the Massachusetts Institute of Technology. StarHPC prepackages a virtual machine image used by students, the scripts used by an administrator, and a virtual image of the Amazon Elastic Compute Cloud (EC2) machine used to build the cluster shared by the class. This architecture, coupled with the no-cost availability of StarHPC, allows it to be deployed at other institutions interested in teaching parallel programming with a dedicated compute cluster without incurring large upfront or ongoing costs.
ISBN (Print): 9780769534329
Parallel programming represents the next turning point in how software engineers write software. Multicore processors can be found today at the heart of supercomputers, desktop computers, and laptops. Consequently, applications will increasingly need to be parallelized to fully exploit the throughput gains multicore processors now make available. Unfortunately, writing parallel code is more complex than writing serial code. This is where the Threading Building Blocks (TBB) approach enters the parallel computing picture. TBB helps developers create multithreaded applications more easily by using high-level abstractions to hide much of the complexity of parallel programming. We study the programmability and performance of TBB by evaluating several practical applications. The results show very promising performance, but parallel programming with TBB is still tedious and error-prone.
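To illustrate the kind of abstraction the abstract credits to TBB, here is a minimal `parallel_for`-style loop sketched with Python's standard library rather than TBB itself; the `grain` parameter plays the role of the grain size in TBB's `blocked_range`, and all names here are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_for(start, stop, body, grain=1000, workers=4):
    # The caller supplies only a loop body over a half-open range;
    # chunking and thread management stay hidden behind this call.
    ranges = [(lo, min(lo + grain, stop)) for lo in range(start, stop, grain)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda r: body(*r), ranges))

data = list(range(10_000))
out = [0] * len(data)

def body(lo, hi):
    for i in range(lo, hi):          # each chunk writes disjoint indices
        out[i] = data[i] * data[i]

parallel_for(0, len(data), body)
```

The application code never touches threads directly, which is the programmability gain the paper evaluates.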
ISBN (Print): 9781467386210
Concurrent programming tools strive to exploit hardware resources as much as possible. Nonetheless, the lack of high-level abstraction in such tools often requires considerable knowledge from the user to achieve satisfactory performance, and such tools do not prevent error-prone situations. In this paper we present Kanga, a framework based on the abstraction of skeletons that provides a generic tool encapsulating many common parallel patterns. Through two case studies we validate the framework implementation.
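A minimal sketch of the skeleton idea behind a framework like Kanga (hypothetical names, not Kanga's API): the user supplies only sequential pieces, while the skeleton fixes the parallel coordination pattern once and for all:

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def map_skeleton(func, items, workers=4):
    # "Farm" skeleton: apply a sequential function to items in parallel.
    # The coordination pattern is fixed; only `func` varies per use.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))

def map_reduce_skeleton(func, combine, items, workers=4):
    # Skeletons compose: a parallel map feeding a sequential reduction.
    return reduce(combine, map_skeleton(func, items, workers))

# Sum of squares 0..99, written without any explicit threading.
total = map_reduce_skeleton(lambda x: x * x, lambda a, b: a + b, range(100))
```

Because the coordination lives in the skeleton, the error-prone parts (thread creation, result collection) are written and debugged once.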
ISBN (Print): 0769517455
This paper proposes a data parallel programming model suitable for loosely synchronous, irregular applications. At the core of the model are distributed objects that express non-trivial data parallelism. Sequential objects express independent computations. The goal is to use objects to fold synchronization into data accesses and thus, free the user from concurrency aspects. Distributed objects encapsulate large data partitioned across multiple address spaces. The system classifies accesses to distributed objects as read and write. Furthermore, it uses the access patterns to maintain information about dependences across partitions. The system guarantees inter-object consistency using a relaxed update scheme. Typical access patterns uncover dependences for data on the border between partitions. Experimental results show that this approach is highly usable and efficient.
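One way to picture the model, as a rough sketch rather than the paper's implementation: a partitioned array that classifies accesses as reads or writes and refreshes a neighbour's border (ghost) copy only when a read needs it, in the spirit of the relaxed update scheme:

```python
class DistributedArray:
    # A 1-D array split into partitions; each partition may read a ghost
    # copy of its left neighbour's border cell. Writes are classified and
    # recorded, and ghosts are refreshed only when the next read needs
    # them (a lazy, access-driven update).
    def __init__(self, data, parts):
        size = len(data) // parts
        self.parts = [data[i * size:(i + 1) * size] for i in range(parts)]
        self.dirty = set()      # partitions holding unpropagated writes
        self.ghosts = {}        # part -> cached copy of left border cell

    def write(self, part, idx, value):
        self.parts[part][idx] = value
        self.dirty.add(part)    # access classified as a write

    def read_left_ghost(self, part):
        left = part - 1
        if left in self.dirty or part not in self.ghosts:
            self.ghosts[part] = self.parts[left][-1]   # border exchange
            self.dirty.discard(left)
        return self.ghosts[part]

d = DistributedArray(list(range(8)), parts=2)   # [0..3] and [4..7]
d.write(0, 3, 99)                               # touch partition 0's border
```

A subsequent `d.read_left_ghost(1)` observes the new border value; reads that never cross a partition boundary pay no synchronization cost at all, which is how the model folds synchronization into data accesses.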
ISBN (Print): 078037505X
This paper presents general concepts of Object-Oriented Parallel Processing (OOPP), comparing two of the most widely used OOPP techniques, PVM (Parallel Virtual Machine) and MPI (Message Passing Interface), and introduces the SCOOP (SCalable Object Oriented Programming) approach to support the design and execution of parallel applications. As parallel programming tools are progressively being adopted, parallel applications are becoming less platform dependent. PVM and MPI are tools that have enabled portable parallel programming. Portability and platform independence are of prime importance in parallel programming, as parallel processes simultaneously execute on potentially different platforms. Key factors affecting the performance of parallel applications on a target platform are parallelism, granularity, load balancing, and scalability. The SCOOP system is a step forward in the development of techniques for dynamic granularity control applied to parallel OO languages.
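Dynamic granularity control of the kind SCOOP targets can be sketched with guided scheduling (in the style of OpenMP's `schedule(guided)`, not SCOOP's actual mechanism): early chunks are coarse to keep overhead low, late chunks are fine so no worker is left holding a large remainder:

```python
import threading

def guided_chunks(total, workers, min_grain=1):
    # Chunk size shrinks as the iteration space drains: coarse at first
    # (low scheduling overhead), fine at the end (good load balance).
    chunks, start = [], 0
    while start < total:
        size = max((total - start) // (2 * workers), min_grain)
        chunks.append((start, min(start + size, total)))
        start += size
    return chunks

def run_guided(total, body, workers=4):
    chunks = guided_chunks(total, workers)
    lock = threading.Lock()

    def worker():
        while True:
            with lock:              # grab the next chunk, if any remain
                if not chunks:
                    return
                lo, hi = chunks.pop(0)
            for i in range(lo, hi):
                body(i)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

done = []
run_guided(100, done.append)
```

The shrinking chunk size is one concrete way granularity can be adjusted at run-time rather than fixed at compile time.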
ISBN (Print): 0769519199
We have designed and implemented a Grid RPC system called OmniRPC for parallel programming in cluster and grid environments. While OmniRPC inherits its API from Ninf, the programmer can use OpenMP for easy-to-use parallel programming because the API is designed to be thread-safe. To support typical master-worker grid applications such as parametric execution, OmniRPC provides an automatic-initializable remote module to send and store data to a remote executable invoked on the remote host. Since the module may accept several requests for subsequent calls by keeping the connection alive, the data set by the initialization is re-used, resulting in efficient execution with a reduced amount of communication. The OmniRPC system also supports a local environment with "rsh", a grid environment with Globus, and remote hosts with "ssh". Furthermore, the user can use the same program over OmniRPC for both clusters and grids, because a typical grid resource is regarded simply as a cluster of clusters distributed geographically. For a cluster on a private network, an agent process running on the server host functions as a proxy, relaying communications between the client and the remote executables by multiplexing them into one connection to the client. This feature allows a single client to use a thousand remote computing hosts.
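The master-worker pattern with an automatic-initializable module can be sketched as follows; `Worker` and `master` are illustrative stand-ins (not OmniRPC calls), with in-process objects playing the role of remote hosts that keep their connection, and hence their initialization data, alive across calls:

```python
from concurrent.futures import ThreadPoolExecutor

class Worker:
    # Stand-in for a remote executable: the shared table is sent once,
    # at initialization, and reused for every later call.
    def __init__(self, shared):
        self.shared = shared

    def call(self, key):
        return self.shared[key]     # no re-transmission per request

def master(table, requests, workers=3):
    # Initialize each worker once, then fan requests out over the pool,
    # keeping the "connection" (here, the object) alive between calls.
    remote = [Worker(table) for _ in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(remote[i % workers].call, r)
                   for i, r in enumerate(requests)]
        return [f.result() for f in futures]

results = master({i: i * 10 for i in range(5)}, [0, 1, 2, 3, 4])
```

The saving the abstract describes comes from amortizing the one-time initialization over many subsequent calls instead of resending the data each time.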
ISBN (Print): 9781450362252
In this tutorial participants learn how to build their own parallel programming language features by developing them as language extensions in the ableC [4] extensible C compiler framework. By implementing new parallel programming abstractions as language extensions one can build on an existing host language and thus avoid re-implementing common language features such as the type checking and code generation of arithmetic expressions and control flow statements. Using ableC, one can build expressive language features that fit seamlessly into the C11 host language.
ISBN (Print): 9781538655559
Parallel programming has rapidly moved from a special-purpose technique to standard practice. This newfound ubiquity needs to be matched by improved parallel programming education. As parallel programming involves higher-level concepts, students tend to struggle to turn the abstract information into concrete mental models. Analogies are known to aid this knowledge transfer by providing an existing schema as the basis for the formation of a new schema. Additionally, technology has been proven to increase motivation and engagement in students, which ultimately improves learning. Combining these ideas, this paper presents several contributions that enhance aspects of parallel programming education. These contributions include a set of collaborative learning activities targeting fundamental scheduling concepts, a detailed analogy to assist in the understanding of those scheduling concepts, and an augmented reality application that facilitates the collaborative learning activity by bringing the analogy to life.