ISBN: 9783540897392 (Print)
Programming paradigms are designed to express algorithms elegantly and efficiently. There are many parallel programming paradigms, each suited to a certain class of problems. Selecting the best parallel programming paradigm for a problem minimizes programming effort and maximizes performance. Given the increasing complexity of parallel applications, no one paradigm may be suitable for all components of an application. Today, most parallel scientific applications are programmed with a single paradigm, and the challenge of multi-paradigm parallel programming remains unmet in the broader community. We believe that each component of a parallel program should be programmed using the most suitable paradigm. Furthermore, it is not sufficient to simply bolt modules together: programmers should be able to switch between paradigms easily, and resource management across paradigms should be automatic. We present a pre-existing adaptive run-time system (ARTS) and show how it can be used to meet these challenges by allowing the simultaneous use of multiple parallel programming paradigms and supporting resource management across all of them. We discuss the implementation of some common paradigms within the ARTS and demonstrate the use of multiple paradigms within our feature-rich unstructured mesh framework. We show how this approach boosts performance and productivity for an application developed using this framework.
ISBN: 9783642244513 (Print)
We present a domain specific language, embedded in Haskell, for general purpose parallel programming on GPUs. Our intention is to explore the use of connection patterns in parallel programming. We briefly present our earlier work on hardware generation, and outline the current state of GPU architectures and programming models. Finally, we present the current status of the Obsidian project, which aims to make GPU programming easier, without relinquishing detailed control of GPU resources. Both a programming example and some details of the implementation are presented. This is a report on work in progress.
The most visible facet of the Computationally-Oriented Display Environment (CODE) is its graphical interface. However, the most important fact about CODE is that it is a programming system based on a formal unified co...
ISBN: 9783540852605 (Print)
A well-known problem in designing high-level parallel programming models and languages is the "granularity problem", where the execution of parallel task instances that are too fine-grain incurs large overheads in the parallel run-time and decreases the speed-up achieved by parallel execution. On the other hand, tasks that are too coarse-grain create load-imbalance and do not adequately utilize the parallel machine. In this work we attempt to address this issue with a concept of expressing "composable computations" in a parallel programming model called "Capsules". Such composability allows adjustment of execution granularity at run-time. In Capsules, we provide a unifying framework that allows composition and adjustment of granularity for both data and computation over iteration space and computation space. We show that this concept not only allows the user to express the decision on granularity of execution, but also the decision on the granularity of garbage collection, and other features that may be supported by the programming model. We argue that this adaptability of execution granularity leads to efficient parallel execution by matching the available application concurrency to the available hardware concurrency, thereby reducing parallelization overhead. By matching, we refer to creating coarse-grain Computation Capsules that encompass multiple instances of fine-grain computation instances. In effect, creating coarse-grain computations reduces overhead by simply reducing the number of parallel computations. This leads to: (1) reduced synchronization cost, such as for blocked searches in shared data-structures; (2) reduced distribution and scheduling cost for parallel computation instances; and (3) reduced book-keeping cost to maintain data-structures such as for unfulfilled data requests. Capsules builds on our prior work, TStreams, a data-flow oriented parallel programming framework. Our results on an SMP machine using the Cascade Face Detector, and the Stereo Visi...
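The coarsening idea this abstract describes can be sketched without the Capsules runtime itself: batch fine-grain iterations into chunk-sized tasks so that fewer parallel computations carry the scheduling and synchronization overhead. A minimal illustration (the class and method names here are hypothetical, and Java threads stand in for the paper's run-time):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ChunkedTasks {
    // Sum the squares of 0..n-1 using tasks of adjustable granularity:
    // each task processes a chunk of the iteration space rather than a
    // single element, so a larger chunkSize means fewer scheduled tasks
    // and less per-task overhead. The result is the same either way.
    static long sumSquares(int n, int chunkSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        List<Future<Long>> parts = new ArrayList<>();
        for (int lo = 0; lo < n; lo += chunkSize) {
            final int from = lo, to = Math.min(lo + chunkSize, n);
            parts.add(pool.submit(() -> {       // one task per chunk
                long s = 0;
                for (int i = from; i < to; i++) s += (long) i * i;
                return s;
            }));
        }
        long total = 0;
        for (Future<Long> f : parts) total += f.get();
        pool.shutdown();
        return total;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumSquares(1000, 1));    // fine-grain: 1000 tasks
        System.out.println(sumSquares(1000, 100));  // coarse-grain: 10 tasks
    }
}
```

Varying `chunkSize` at run-time is the (much simplified) analogue of adjusting execution granularity to match hardware concurrency.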
ISBN: 9789537138158 (Print)
The advancement of computer technology and the increasing complexity of research problems are creating the need to teach parallel programming in higher education more effectively. In this paper we present StarHPC, a system solution that supports teaching parallel programming in courses at the Massachusetts Institute of Technology. StarHPC prepackages a virtual machine image used by students, the scripts used by an administrator, and a virtual image of the Amazon Elastic Compute Cloud (EC2) machine used to build the cluster shared by the class. This architecture, coupled with the no-cost availability of StarHPC, allows it to be deployed at other institutions interested in teaching parallel programming with a dedicated compute cluster without incurring large upfront or ongoing costs.
ISBN: 0769517315 (Print)
This paper discusses the use of semaphores to solve parallel programming problems. A new semaphore mechanism with extended features is proposed. An example of applying such a semaphore to control a technological process is introduced and discussed. Simulation results and the advantages of applying the new mechanism are presented.
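The abstract does not detail the paper's extended semaphore mechanism, so the following is only a baseline sketch of the kind of control involved: a plain counting semaphore (here `java.util.concurrent.Semaphore`) limiting how many workers may operate a simulated process stage at once.

```java
import java.util.concurrent.Semaphore;

public class ProcessGate {
    // At most CAPACITY workers may act on the shared process stage at a
    // time; the semaphore blocks the rest until a slot frees up.
    static final int CAPACITY = 2;
    static final Semaphore gate = new Semaphore(CAPACITY, true); // fair mode
    static int active = 0, peak = 0;

    static synchronized void enter() { active++; peak = Math.max(peak, active); }
    static synchronized void leave() { active--; }

    public static void main(String[] args) throws InterruptedException {
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                try {
                    gate.acquire();          // block until a slot is free
                    try {
                        enter();
                        Thread.sleep(20);    // simulate work on the process
                        leave();
                    } finally {
                        gate.release();      // always return the slot
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join();
        // peak never exceeds CAPACITY, regardless of thread count
        System.out.println("peak concurrency = " + peak);
    }
}
```

An "extended" semaphore in the paper's sense would presumably add features beyond this acquire/release protocol; those are not reconstructed here.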
ISBN: 9781595937711 (Print)
This paper argues for an implicitly parallel programming model for many-core microprocessors, and provides initial technical approaches towards this goal. In an implicitly parallel programming model, programmers maximize algorithm-level parallelism, express their parallel algorithms by asserting high-level properties on top of a traditional sequential programming language, and rely on parallelizing compilers and hardware support to perform parallel execution under the hood. In such a model, compilers and related tools require much more advanced program analysis capabilities and programmer assertions than are currently available, so that a comprehensive understanding of the input program's concurrency can be derived. Such an understanding is then used to drive automatic or interactive parallel code generation tools for a diverse set of parallel hardware organizations. The chip-level architecture and hardware should maintain parallel execution state in such a way that a strictly sequential execution state can always be derived for the purpose of verifying and debugging the program. We argue that implicitly parallel programming models are critical for addressing the software development crisis and software scalability challenges for many-core microprocessors.
ISBN: 9780769535449 (Print)
Today most systems in high-performance computing (HPC) feature a hierarchical hardware design: shared memory nodes with several multi-core CPUs are connected via a network infrastructure. Parallel programming must combine distributed memory parallelization on the node interconnect with shared memory parallelization inside each node. We describe potentials and challenges of the dominant programming models on hierarchically structured hardware: pure MPI (Message Passing Interface), pure OpenMP (with distributed shared memory extensions), and hybrid MPI+OpenMP in several flavors. We pinpoint cases where a hybrid programming model can indeed be the superior solution because of reduced communication needs and memory consumption, or improved load balance. Furthermore, we show that machine topology has a significant impact on performance for all parallelization strategies and that topology awareness should be built into all applications in the future. Finally, we give an outlook on possible standardization goals and extensions that could make hybrid programming easier to do with performance in mind.
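The two-level structure described here (distributed memory across nodes, shared memory within a node) can be illustrated conceptually without real MPI or OpenMP. In this sketch, coarse outer tasks stand in for MPI ranks, and a parallel stream inside each task stands in for an OpenMP parallel region; all names are hypothetical and everything runs in one process.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.LongStream;

public class HybridSketch {
    // Hierarchical sum of 0..n-1: the outer level partitions the range
    // across "ranks" (one coarse task each), the inner level reduces a
    // rank's slice with shared-memory thread parallelism.
    static long hierarchicalSum(long n, int ranks) throws Exception {
        ExecutorService outer = Executors.newFixedThreadPool(ranks);
        List<Future<Long>> parts = new ArrayList<>();
        long slice = n / ranks;
        for (int r = 0; r < ranks; r++) {
            final long lo = r * slice;
            final long hi = (r == ranks - 1) ? n : lo + slice;
            parts.add(outer.submit(() ->                 // "MPI rank"
                LongStream.range(lo, hi).parallel().sum())); // "OpenMP region"
        }
        long total = 0;
        for (Future<Long> p : parts) total += p.get();   // "reduction"
        outer.shutdown();
        return total;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(hierarchicalSum(1_000_000, 4));
    }
}
```

In a real hybrid code the outer level would be MPI processes communicating over the interconnect, which is where the communication-volume and topology effects the paper analyzes come into play.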
ISBN: 9781905088423 (Print)
This contribution presents an MPI-based parallel computational framework for simulation and gradient-based structural optimization of geometrically nonlinear and large-scale structural finite element models. The field of computational structural analysis is gaining more and more importance in product design. In order to obtain an impression of possible design improvements already in an early planning phase, an efficient optimization tool is desired that requires only little modelling effort. This demand can be satisfied by a combined analysis and optimization tool working on the same model. For this purpose the finite element based optimization method is an excellent approach, leading to the highest possible diversity within the optimization process.
ISBN: 9781450390705 (Print)
Teaching parallel programming to undergraduate CS students is a challenging task, as many of the concepts are highly abstract and difficult to grasp. OpenMP is often used to simplify parallelization of programs by allowing one to incrementally parallelize using concise and expressive directives. Unfortunately, OpenMP is not available in Java natively. Basic support for OpenMP-like directives can, however, be obtained in Java using the Pyjama compiler and runtime. I report on my experience introducing parallel programming in Java with Pyjama in a small Data Structures class. The material is presented to students in the form of parallel programming patternlets embedded in an interactive notebook with which students can experiment. Formative and summative assessments of the module's effectiveness are performed. This pilot run of the module yielded mixed results, yet valuable insight was gained regarding possible future approaches.
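A "parallel for" patternlet of the kind described can be shown in plain Java without reproducing Pyjama's directive syntax: the same loop appears sequentially and as a parallel stream. A directive-based system would annotate the sequential form rather than require rewriting it; this sketch shows only the plain-Java effect.

```java
import java.util.stream.IntStream;

public class ParallelForPatternlet {
    // Sequential version of the loop: fill out[i] with i*i.
    static int[] squaresSequential(int n) {
        int[] out = new int[n];
        for (int i = 0; i < n; i++) out[i] = i * i;
        return out;
    }

    // Parallel version: the iterations are independent, so they can be
    // distributed across cores without changing the result.
    static int[] squaresParallel(int n) {
        int[] out = new int[n];
        IntStream.range(0, n).parallel()
                 .forEach(i -> out[i] = i * i);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(squaresParallel(6)[5]);  // 25
    }
}
```

The pedagogical point of a patternlet is exactly this side-by-side form: students see that correctness is preserved because each iteration touches only its own array slot.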