ISBN: 9781728150208 (print)
Composability is key to improving programmers' productivity when writing applications in rapidly growing areas such as parallel machine learning algorithms and big data analytics. These applications exhibit both regular and irregular compute patterns, and are often combined with other functions or libraries to compose a larger program. However, composable parallel processing has taken a back seat in many existing parallel programming libraries, making it difficult to achieve modularity in large-scale parallel programs. In this paper, we introduce a new parallel task programming library using composable tasking graphs. Our library efficiently supports task parallelism, together with an intuitive task graph construction API and a flexible execution API, to enable reusable and composable task dependency graphs. Developers can quickly compose a large parallel program from small, modular parallel building blocks and easily deploy the program on a multicore machine. We have evaluated our library on real-world applications. Experimental results show that our library achieves performance comparable to Intel Threading Building Blocks with less coding effort.
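The abstract does not show the library's API, so the following is only a minimal Python sketch of the general idea of a composable task dependency graph. The class `TaskGraph` and its methods `task`, `precede`, and `compose` are hypothetical names chosen for illustration, not the library's interface. The sketch uses the standard-library `graphlib.TopologicalSorter` to find tasks whose predecessors have finished and a thread pool to run them; composition is modeled by wrapping a whole graph as a single task of a parent graph.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class TaskGraph:
    """Hypothetical illustration only; not the API of the library in the abstract."""

    def __init__(self):
        self._tasks = {}  # task name -> zero-argument callable
        self._preds = {}  # task name -> set of predecessor task names

    def task(self, name, fn):
        self._tasks[name] = fn
        self._preds.setdefault(name, set())
        return name

    def precede(self, before, after):
        # 'before' must finish before 'after' starts.
        self._preds[after].add(before)

    def compose(self, name, subgraph):
        # Composition: an entire graph becomes one task of this graph.
        return self.task(name, subgraph.run)

    def run(self, max_workers=4):
        sorter = TopologicalSorter(self._preds)
        sorter.prepare()  # raises graphlib.CycleError on cyclic dependencies
        running = {}  # future -> task name
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            while sorter.is_active():
                # Submit every task whose predecessors are all done.
                for name in sorter.get_ready():
                    running[pool.submit(self._tasks[name])] = name
                finished, _ = wait(list(running), return_when=FIRST_COMPLETED)
                for future in finished:
                    future.result()  # re-raise any exception from the task
                    sorter.done(running.pop(future))


def demo():
    log = []  # list.append is atomic in CPython, so no lock is used here

    def step(name):
        return lambda: log.append(name)

    # A small reusable building block: load -> clean.
    pre = TaskGraph()
    pre.task("load", step("load"))
    pre.task("clean", step("clean"))
    pre.precede("load", "clean")

    # A larger program that reuses the block as a single module task.
    app = TaskGraph()
    app.task("init", step("init"))
    app.compose("preprocess", pre)
    app.task("stats", step("stats"))
    app.task("report", step("report"))
    app.precede("init", "preprocess")
    app.precede("init", "stats")
    app.precede("preprocess", "report")
    app.precede("stats", "report")
    app.run()
    return log
```

Calling `demo()` returns the order in which the tasks ran. The relative order of `stats` and the `preprocess` block may vary between runs, while `init` always runs first and `report` last. Each composed graph starts its own thread pool in this sketch, which keeps the code short; a production library would more plausibly share one scheduler across nested graphs.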
ISBN: 9783319665627; 9783319665610 (print)
New sequencing technologies have been rapidly increasing the size of available genomes while reducing their cost. These data need to be processed with efficient and innovative tools using high performance computing (HPC), but taking advantage of today's supercomputers requires parallel programming techniques and strategies. Plant genomes are full of Long Terminal Repeat Retrotransposons (LTR-RT), which are the most frequent repeated sequences; very important agronomic commodities such as Robusta coffee and maize have genomes composed of ~50% and ~85%, respectively, of this class of mobile elements. New parallel bioinformatics pipelines are making it possible to use whole genomes like these in research projects, generating a lot of new information and expanding in many ways what researchers know about them. Here we present the utility of multi-core architectures and parallel programming for analyzing and classifying massive quantities of genomic information up to 16 times faster.
This paper presents a research-project-based methodology for teaching parallel programming to master's students in a High Performance Computing program. The requirements for completing a master's degree state that all students should be able to develop computer simulation programs using parallel and distributed computing technologies, regardless of their background and their preferences for in-depth study of high- or low-level programming, administration, or information security. Creating computer simulations based on high-performance computing is impossible without experience in solving key issues of low-level parallel programming such as data flow management, synchronization, load balancing, and fault tolerance. We believe that the best way to explore these issues is the phased implementation of appropriate algorithms in an application, followed by computational experiments. Therefore, as the main tool for practical study, we offer the implementation of special project tasks. While developing the course tasks, we drew not only on our experience of teaching parallel programming to undergraduate and graduate students, but also on existing practice in the development of distributed computing systems. In addition to the classic tasks, students explored pairing algorithms, load balancing, and fault tolerance through implementation in distributed applications and testing in computational experiments. Our experience has shown that this approach to teaching parallel programming, which includes modeling and simulations, enabled students to proceed gradually from classic tasks to the implementation of full-scale research projects.
ISBN: 9789532330991 (digital)
ISBN: 9781728153391 (print)
In addition to large-scale computers, multicore processors have taken a significant part in all kinds of devices, from personal computers to cell phones. Although programming techniques for parallel systems have existed for a while, developing applications that appropriately utilize multicores is still challenging in many respects, especially when it comes to fully exploiting the computational resources. Another challenge is the efficient and easy programming of heterogeneous systems for the complete exploitation of silicon resources. Various programming models that abstract parallelism and concurrency aim to make parallel programming more developer-friendly. Implementations of those models need to extend even to the lower layers of software parallelism, and to hardware parallelism as well. This paper gives an overview of parallel architectures and trending programming models for such processing units and systems. It also presents challenges to scalability and portability in parallel systems, along with up-to-date trends in heterogeneous systems that heavily exploit parallelism.
In this paper we present Cpp-Taskflow, a C++ parallel programming library that enables users to quickly develop parallel applications using the task dependency graph model. Developers formulate their application as a ...
Nowadays, many fields of science and engineering are evolving through the joint contribution of complementary fields. Computer science, and especially High Performance Computing, has become a key factor in the development of many research fields, establishing a new paradigm called computational science. Researchers and professionals from many different fields require knowledge of High Performance Computing, including parallel programming, to work fruitfully and efficiently in their particular field. Therefore, at Universitat Autonoma of Barcelona (Spain), an interdisciplinary Master on "Modeling for Science and Engineering" was started 5 years ago to provide thorough knowledge of the application of modeling and simulation to graduate students in different fields (Mathematics, Physics, Chemistry, Engineering, Geology, etc.). In this Master's degree, "parallel programming" is a compulsory subject because it is a key topic for these students. The concepts learned in this subject must be applied to real applications. Therefore, a complementary subject on "Applied Modeling and Simulation" has also been included. It is very important to show the students how to analyze their particular problems, think about them from a computational perspective, and consider the related performance issues. In this paper, we describe the methodology and the experience of introducing computational thinking, parallel programming, and performance engineering in this interdisciplinary Master's degree. This overall approach has been refined over the lifetime of the Master, leading to excellent academic results and improving the appraisal of the programme by industry and students. (C) 2017 Elsevier Inc. All rights reserved.
Prevalent hardware trends towards parallel architectures and algorithms create a growing demand for graduate students familiar with the programming of concurrent software. However, learning parallel programming is challenging because of complex communication and memory access patterns and the need to avoid common pitfalls such as deadlocks and race conditions. Hence, the learning process has to be supported by adequate software solutions in order to enable future computer scientists and engineers to write robust and efficient code. This paper discusses a selection of well-known parallel algorithms based on C++11 threads, OpenMP, MPI, and CUDA that can be interactively embedded in an HPC or parallel computing lecture using a unified framework for the automated evaluation of source code, namely the "System for AUtomated Code Evaluation" (SAUCE). SAUCE is free software licensed under AGPL-3.0 and can be downloaded at https://***/moschlar/SAUCE free of charge. (C) 2017 Elsevier Inc. All rights reserved.
Programming patterns are considered a conceptual aid that helps programmers develop understandable and testable concurrent and parallel code that is not only well built but also safe. By using programming patterns and their implementations as computer programs, difficult new concepts can be taught smoothly in lectures to students who, before trying this teaching approach, would have been reluctant to enroll in parallel and concurrent programming courses. The approach presented in this paper consists in changing the traditional model of teaching and learning programming to one in which students are first introduced to syntactical constructs through selected introductory program code patterns. In the theory lessons that follow, using laptops with multi-core processors and access to the Virtual Campus services of our university, the students can easily implement and master the new concepts as they are taught. This teaching experiment was carried out in a concurrent and real-time programming course that is part of the computer engineering (CE) degree and is taught during the third semester of the CE curriculum. Evaluation of the academic performance of students taught with this approach revealed a 20.6% improvement in their end-of-course grades. (C) 2017 Elsevier Inc. All rights reserved.
Whilst there have been great advances in HPC hardware and software in recent years, the languages and models that we use to program these machines have remained much more static. This is not for lack of effort, but because the foundation that many programming languages are built on is not sufficient for the level of expressivity required for parallel work. The result is an implicit trade-off between programmability and performance, which is made worse because, whilst many scientific users are experts within their own fields, they are not HPC experts. Type-oriented programming looks to address this by encoding the complexity of a language via the type system. Most of the language functionality is contained within a loosely coupled type library that can be flexibly used to control many aspects such as parallelism. Because of the high-level nature of this approach, much information is available during compilation that can be used for optimisation and, in the absence of type information, the compiler can apply sensible default options, thus supporting both the expert programmer and the novice. We demonstrate that, at no performance or scalability penalty when running on up to 8196 cores of a Cray XE6 system, codes written in this type-oriented manner provide improved programmability. The programmer is able to write simple, implicitly parallel HPC code at a high level and then explicitly tune it by adding type information if required. (C) 2017 Elsevier Ltd. All rights reserved.
Parallel programming has become increasingly popular in computer education over the past few years. Although parallel programs achieve short execution times and high throughput, learning how to write a well-structured, high-performance parallel program is still a challenge for most students. Helping students learn parallel programming well is one of the important tasks that educators need to address. This paper presents an approach to learning parallel programming using software refactoring methodologies and tools. Manual and automated refactoring are introduced to show how each improves learning. With manual refactoring, students learn how to perform data or task decomposition and how to write well-structured parallel software using customized programs and some benchmarks from the JGF benchmark suite; with automated refactoring, students can transform the parallel parts quickly and then evaluate the performance of the parallel software. Two automated refactoring tools were developed for educational purposes. Experiences gathered while conducting the course are also shared. (C) 2017 Wiley Periodicals, Inc.
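As a rough illustration of what a manual data-decomposition refactoring can look like, the sketch below shows a sequential loop and a refactored version that splits the input into contiguous chunks and combines per-chunk partial results. It is written in Python for brevity and is not taken from the course material or the JGF benchmarks; the function names are hypothetical. Note that threads in CPython do not speed up CPU-bound loops because of the global interpreter lock, so the example shows the structure of the refactoring only, not a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_squares_sequential(values):
    # Original sequential version: one loop over all of the data.
    total = 0
    for v in values:
        total += v * v
    return total


def sum_squares_parallel(values, workers=4):
    # Refactored version using data decomposition:
    # 1. split the input into contiguous chunks (at most `workers` of them),
    # 2. apply the unchanged sequential kernel to each chunk concurrently,
    # 3. combine the partial results with a final reduction.
    chunk = max(1, (len(values) + workers - 1) // workers)
    chunks = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(sum_squares_sequential, chunks)
        return sum(partials)
```

Keeping the sequential kernel unchanged and reusing it per chunk makes it straightforward to check that the refactored program still computes the same result as the original.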