Prevalent hardware trends towards parallel architectures and algorithms create a growing demand for graduate students familiar with the programming of concurrent software. However, learning parallel programming is challenging because of complex communication and memory access patterns and the need to avoid common pitfalls such as deadlocks and race conditions. Hence, the learning process has to be supported by adequate software solutions in order to enable future computer scientists and engineers to write robust and efficient code. This paper discusses a selection of well-known parallel algorithms based on C++11 threads, OpenMP, MPI, and CUDA that can be interactively embedded in an HPC or parallel computing lecture using a unified framework for the automated evaluation of source code, namely the "System for AUtomated Code Evaluation" (SAUCE). SAUCE is free software licensed under AGPL-3.0 and can be downloaded at https://***/moschlar/SAUCE free of charge. (C) 2017 Elsevier Inc. All rights reserved.
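To make the race-condition pitfall named in this abstract concrete, here is a minimal sketch in Python (not taken from the paper, which uses C++11 threads, OpenMP, MPI, and CUDA). The function name `run_counter` and its parameters are hypothetical and chosen only for illustration. Several threads increment a shared counter, and a lock makes each read-modify-write step a critical section.

```python
import threading


def run_counter(num_threads: int, increments: int, use_lock: bool) -> int:
    """Increment a shared counter from several threads and return its final value."""
    counter = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal counter
        for _ in range(increments):
            if use_lock:
                # Critical section: only one thread at a time performs the update.
                with lock:
                    counter += 1
            else:
                # Unsynchronized read-modify-write: updates may be lost,
                # depending on the interpreter and thread scheduling.
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


# With the lock, the final value is always num_threads * increments.
safe_total = run_counter(num_threads=4, increments=10_000, use_lock=True)
```

The locked variant gives a deterministic result. The unlocked variant is left in only to show where the hazard sits, and no particular outcome should be expected from it.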
The ParaScope project is developing an integrated collection of tools to help scientific programmers implement correct and efficient parallel programs. The centerpiece of this collection is the ParaScope Editor, an intelligent interactive editor for parallel Fortran programs. The ParaScope Editor reveals to users the potential hazards of a proposed parallelization in a program. It also provides a variety of powerful interactive program transformations that have been shown to be useful in converting programs to parallel form. In addition, the ParaScope Editor supports general user editing through a hybrid text and structure editing facility that incrementally analyzes the modified program for potential hazards. The ParaScope Editor is a new kind of program construction tool, one that not only manages text but also presents the user with information about the correctness of the parallel program under development. As such, it can support an exploratory programming style in which users get immediate feedback on their various strategies for parallelization.
In this work we present Lithium, a pure Java structured parallel programming environment based on skeletons (common, reusable and efficient parallelism exploitation patterns). Lithium is implemented as a Java package and represents both the first skeleton-based programming environment in Java and the first complete skeleton-based Java environment exploiting macro-data flow implementation techniques. Lithium supports a set of user code optimizations which are based on skeleton rewriting techniques. These optimizations improve both absolute performance and resource usage with respect to the original user code. Parallel programs developed using the library run on any network of workstations, provided the workstations support plain JRE. The paper describes the library implementation, outlines the optimization techniques used and finally presents the performance results obtained on both synthetic and real applications. (C) 2002 Elsevier Science B.V. All rights reserved.
There are many paradigms available to address the unique and complex problems introduced with parallel programming. These complexities have implications for computer science education as ubiquitous multi-core computers drive the need for programmers to understand parallelism. One major obstacle to student learning of parallel programming is that there is very little human factors evidence comparing the different techniques to one another, so there is no clear direction on which techniques should be taught and how. We performed a randomized controlled trial using 88 university-level computer science student participants performing three identical tasks to examine the question of whether or not there are measurable differences in programming performance between two paradigms for concurrent programming: threads compared to process-oriented programming based on Communicating Sequential Processes. We measured both time on task and programming accuracy using an automated token accuracy map (TAM) technique. Our results showed trade-offs between the paradigms using both metrics and the TAMs provided further insight about specific areas of difficulty in comprehension.
Data-flow is a natural approach to parallelism. However, describing dependencies and control between fine-grained data-flow tasks can be complex and present unwanted overheads. TALM (TALM is an Architecture and Language for Multi-threading) introduces a user-defined coarse-grained parallel data-flow model, where programmers identify code blocks, called super-instructions, to be run in parallel and connect them in a data-flow graph. TALM has been implemented as a hybrid Von Neumann/data-flow execution system: the Trebuchet. We have observed that TALM's usefulness largely depends on how programmers specify and connect super-instructions. Thus, we present Couillard, a full compiler that creates, based on an annotated C program, a data-flow graph and C code corresponding to each super-instruction. We show that our toolchain allows one to benefit from data-flow execution and explore sophisticated parallel programming techniques with little effort. To evaluate our system we have executed a set of real applications on a large multi-core machine. Comparison with popular parallel programming methods shows competitive speedups, while providing an easier parallel programming approach. More specifically, for an application that follows the wavefront method, running with big inputs, Trebuchet achieved up to 4.7% speedup over the novel flow-graph approach of Intel(R) TBB and up to 44% over OpenMP. (C) 2014 Elsevier B.V. All rights reserved.
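The idea of coarse-grained blocks connected in a data-flow graph can be illustrated with a small Python sketch. This is not the TALM, Trebuchet, or Couillard interface. The function `run_dataflow`, the `graph` and `functions` dictionaries, and the node names are all hypothetical. As a simplification, the sketch executes the graph in waves: every block whose inputs are available runs in parallel, then the next wave is computed.

```python
from concurrent.futures import ThreadPoolExecutor


def run_dataflow(graph, functions):
    """Run coarse-grained tasks in dependency order.

    graph:     maps each task name to the list of task names it consumes.
    functions: maps each task name to a callable taking the inputs' results.
    Returns a dict mapping every task name to its result.
    """
    results = {}
    pending = set(graph)
    with ThreadPoolExecutor() as pool:
        while pending:
            # A task is ready once all of its inputs have been produced.
            ready = [n for n in pending if all(d in results for d in graph[n])]
            if not ready:
                raise ValueError("cycle or missing dependency in data-flow graph")
            # Ready tasks are independent of each other, so run them in parallel.
            futures = {
                n: pool.submit(functions[n], *(results[d] for d in graph[n]))
                for n in ready
            }
            for name, future in futures.items():
                results[name] = future.result()
            pending.difference_update(ready)
    return results


# Hypothetical four-block graph: "square" and "double" both depend on "load".
graph = {"load": [], "square": ["load"], "double": ["load"], "combine": ["square", "double"]}
functions = {
    "load": lambda: [1, 2, 3],
    "square": lambda xs: [x * x for x in xs],
    "double": lambda xs: [2 * x for x in xs],
    "combine": lambda a, b: sum(a) + sum(b),
}
results = run_dataflow(graph, functions)
```

Here `square` and `double` form one wave and may run concurrently, while `combine` waits for both. A true data-flow runtime would fire each block as soon as its own inputs arrive rather than waiting for a whole wave.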
Sisal 3.2 is a new input language of the system of functional programming (SFP), which is under development at the Institute of Informatics Systems in Novosibirsk as an interactive visual environment for supporting scientific parallel programming. This paper contains an overview of Sisal 3.2 and a description of its new features compared with previous versions of the SFP input language, such as multidimensional array support, new abstractions like parametric types and generalised procedures, more flexible user-defined reductions, improved interoperability with other programming languages, and the specification of several optimising source text annotations.
The NAS Parallel Benchmarks (NPB), originally implemented mostly in Fortran, is a consolidated suite containing several benchmarks extracted from Computational Fluid Dynamics (CFD) models. The benchmark suite has important characteristics such as intensive memory communications, complex data dependencies, different memory access patterns, and hardware components/sub-systems overload. Parallel programming APIs, libraries, and frameworks that are written in C++, as well as new optimizations and parallel processing techniques, can benefit if NPB is made fully available in this programming language. In this paper we present NPB-CPP, a fully C++ translated version of NPB consisting of all the NPB kernels and pseudo-applications developed using the OpenMP, Intel TBB, and FastFlow parallel frameworks for multicores. The design of NPB-CPP leverages the Structured Parallel Programming methodology (essentially based on parallel design patterns). We show the structure of each benchmark application in terms of a composition of a few patterns (notably Map and MapReduce constructs) provided by the selected C++ frameworks. The experimental evaluation shows the accuracy of NPB-CPP with respect to the original NPB source code. Furthermore, we carefully evaluate the parallel performance on three multi-core systems (Intel, IBM Power, and AMD) with different C++ compilers (gcc, icc, and clang), discussing the performance differences in order to give researchers useful insights for choosing the best parallel programming framework for a given type of problem. (C) 2021 Elsevier B.V. All rights reserved.
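The Map and MapReduce patterns mentioned in this abstract can be sketched generically in Python. This is only an illustration of the pattern structure, not code from NPB-CPP, which is written in C++ with OpenMP, Intel TBB, and FastFlow. The helper names `parallel_map` and `map_reduce` are hypothetical.

```python
import operator
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def parallel_map(func, items, workers=4):
    """Map pattern: apply func to every item independently, preserving order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))


def map_reduce(func, combine, items, initial, workers=4):
    """MapReduce pattern: a parallel map followed by a sequential fold."""
    return reduce(combine, parallel_map(func, items, workers), initial)


squares = parallel_map(lambda x: x * x, range(6))
sum_of_squares = map_reduce(lambda x: x * x, operator.add, range(6), 0)
```

The sketch shows how the two patterns compose, not how they perform. In standard CPython, threads do not speed up CPU-bound work of this kind, so a process pool or a compiled framework would be needed for real gains.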
An effective data-parallel programming environment will use a variety of tools that support the development of efficient data-parallel programs while insulating the programmer from the intricacies of the explicitly parallel code.
This paper presents EasyPAP, an easy-to-use programming environment designed to help students to learn parallel programming. EasyPAP features a wide range of 2D computation kernels that the students are invited to parallelize using Pthreads, OpenMP, OpenCL or MPI. Execution of kernels can be interactively visualized, and powerful monitoring tools allow students to observe both the scheduling of computations and the assignment of 2D tiles to threads/processes. By focusing on algorithms and data distribution, students can experiment with diverse code variants and tune multiple parameters, resulting in richer problem exploration and faster progress towards efficient solutions. We present selected lab assignments which illustrate how EasyPAP improves the way students explore parallel programming. (C) 2021 Elsevier Inc. All rights reserved.