Parallel and distributed programming involve substantial amounts of boilerplate code for process management and data synchronisation. This increases the potential for bugs and often results in unintended non-deterministic program behaviour. Moreover, algorithmic details are mixed with technical details concerning parallelisation and distribution. Process calculi are formal models for parallel and distributed programming but often leave details open, causing a gap between the formal model and the implementation. To address these problems, we propose a fully deterministic process calculus for parallel and distributed programming and implement it as a domain-specific language in Haskell. We eliminate boilerplate code by abstracting from the exact notion of parallelisation and encapsulating it in the implementation of our process combinators. Furthermore, we achieve correctness guarantees regarding process composition at compile time through Haskell's type system. Our result can be used as a high-level tool to implement parallel and distributed programs.
Parallel programming can be extremely challenging. Programming models have been proposed to simplify this task, but wide acceptance of these remains elusive for many reasons, including the demand for greater accessibility and productivity. In this paper, we introduce a parallel programming model and framework called CharmPy, based on the Python language. CharmPy builds on Charm++ and runs on top of its C++ runtime. It presents several unique features in the form of a simplified model and API, increased flexibility, and the ability to write everything in Python. CharmPy is a high-level model based on the paradigm of distributed migratable objects. It retains the benefits of the Charm++ runtime, including dynamic load balancing, an asynchronous execution model with automatic overlap of communication and computation, high performance, and scalability from laptops to supercomputers. By being Python-based, CharmPy also benefits from modern language features, access to popular scientific computing and data science software, and interoperability with existing technologies like C, Fortran and OpenMP. To illustrate the simplicity of the model, we will show how to implement a distributed parallel map function based on the Master-Worker pattern using CharmPy, with support for asynchronous concurrent jobs. We also present performance results from running stencil code and molecular dynamics mini-apps written entirely in Python on the Blue Waters and Cori supercomputers. For stencil3d, we show performance similar to an equivalent MPI-based program, and significantly improved performance for imbalanced computations. Using Numba to JIT-compile the critical parts of the code, we show performance for both mini-apps similar to the equivalent C++ code.
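The Master-Worker parallel map mentioned in this abstract can be illustrated with a minimal sketch. It does not use CharmPy or its API: as a stand-in, it assumes Python's standard `concurrent.futures` thread pool on a single machine, so it shows only the pattern (a master splitting work into chunks, workers processing them, and several jobs in flight at once), not distribution across nodes. The names `apply_chunk`, `parallel_map`, `gather` and `run_demo` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def apply_chunk(func, chunk):
    """Worker side: apply func to every item of one chunk."""
    return [func(x) for x in chunk]


def parallel_map(pool, func, items, chunk_size=4):
    """Master side: split items into chunks and submit one task per chunk.

    Returns the list of futures immediately, so the caller can start
    further jobs before this one has finished.
    """
    chunks = [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
    return [pool.submit(apply_chunk, func, chunk) for chunk in chunks]


def gather(futures):
    """Wait for all chunks of one job and flatten results in input order."""
    return [y for f in futures for y in f.result()]


def run_demo():
    # Stand-in for a pool of distributed workers (assumption: local threads).
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Two independent jobs are submitted before either is collected.
        squares_job = parallel_map(pool, lambda x: x * x, list(range(10)))
        negate_job = parallel_map(pool, lambda x: -x, list(range(5)))
        return gather(squares_job), gather(negate_job)


if __name__ == "__main__":
    print(run_demo())
```

Because futures are collected in submission order, results keep the order of the input even when chunks finish out of order.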
C++ was originally designed as a sequential programming language. For the development of multithreaded applications, libraries such as Pthreads, Windows threads, and Boost are traditionally used. The C++11 standard introduced some basic concepts and means for developing parallel and concurrent programs, but using these low-level means directly requires advanced programming skill and significant effort. The absence of high-level models of parallelism in C++ is somewhat compensated for by various parallel libraries and directive-based parallelization tools (such as OpenMP), as well as by language extensions supported by some compilers (Intel Cilk Plus). Nevertheless, more advanced means are still required to express parallelism in programs at the level of the language standard and its standard library. In this survey, we consider the means for parallel and concurrent programming that are included in the C++17 standard, as well as some capabilities that are expected in future standards.
The ability to teach parallel programming principles and techniques is becoming fundamental to preparing a new generation of programmers able to master the pervasive parallelism made available by hardware vendors. Classical parallel programming courses leverage either low-level programming frameworks (e.g. those based on Pthreads) or higher-level frameworks such as OpenMP or MPI. We discuss our teaching experience within the Master in "Computer Science and networking", where parallel programming is taught using structured parallel programming principles and frameworks. The paper summarizes the results achieved in eight years of experience and shows how the adoption of a structured parallel programming approach improves the efficiency of the teaching process.
ISBN: 9781538683330 (digital)
ISBN: 9781538683347 (print)
Parallel programming techniques have been prominently explored in various engineering applications because they provide time-efficient solutions to complex problems without affecting accuracy. Parallel programming approaches for solving complex electromagnetic problems that require huge computational power have been a thrust area of research. This paper focuses on a shared-memory parallel programming model for RCS estimation of electrically large structures such as a cylindrical duct and a solid cylinder, with surfaces that are concave and convex in nature. In this paper, the OpenMP API and the Parallel Computing Toolbox in MATLAB are used to simulate a parallel program on a shared-memory architecture. An exponential decrease in computation time has been demonstrated through a classical RCS estimation problem.
ISBN: 9781538649282 (print)
EFL is an embedded language that allows parallel programs to be written using the new Flexible Algorithms approach to parallel programming, which has an imperative programming style and ensures lock-free determinism. Hammock cost techniques have proven useful in project management, substantially reducing project costs and resource use: close to one billion euros were saved in the Westerscheldetunnel project. Moreover, Hammock activities are useful in many other areas of project management, such as sub-projects and outsourcing. In this research, we propose to implement the intrinsically parallel Hammock cost techniques using EFL's parallel programming paradigm. We believe that this will allow better overall project performance.
Benchmarking is a way to study the performance of new architectures and parallel programming frameworks. Well-established benchmark suites such as the NAS Parallel Benchmarks (NPB) comprise legacy codes that still lack portability to the C++ language. As a consequence, a set of high-level and easy-to-use C++ parallel programming frameworks cannot be tested with NPB. Our goal is to describe a C++ port of the NPB kernels and to analyze the performance achieved by different parallel implementations written using the Intel TBB, OpenMP and FastFlow frameworks for multi-core systems. The experiments show that the code was ported efficiently from Fortran to C++ and that the parallelization is efficient on average.
ISBN: 9781728137735 (print)
This paper evaluates the power consumption of different parallel programming interfaces (PPIs) on a multicore architecture. These PPIs are PThreads, OpenMP, MPI-1 and MPI-2 (spawn). We measure the total energy and execution time of 11 applications on a single architecture, varying the number of threads/processes. The goal is to show that these applications can be used as a parallel benchmark to evaluate the power consumption of different PPIs. The results show that PThreads has the lowest power consumption among the interfaces, consuming less than the sequential version for memory-bound applications.
ISBN: 9781450353199 (print)
In the context of a learning game to teach parallel programming, we describe a procedural content generation (PCG) approach that can be controlled to generate programming puzzles involving a desired set of concepts, and of desired size and "difficulty". Our approach is based on grammars to control the generation of the puzzle structure, and on orthographic graph embedding techniques to render it into a two-dimensional grid for our game. The proposed PCG system is designed to work with a player model in order to provide personalized learning experiences. We present an evaluation of the variability of the generated puzzles using several metrics, including challenge and solvability as evaluated by a custom-built model checker. Our evaluation shows that this PCG system can generate a large number of varied puzzles, but it is not yet able to generate puzzles with certain aesthetic and functional qualities found in puzzles created by human authors.