ISBN (Print): 9781424452910
This paper proposes a data race prevention scheme for the View-Oriented Parallel Programming (VOPP) model. VOPP is a novel shared-memory, data-centric parallel programming model, which uses views to bundle mutual exclusion with data access. We have implemented the data race prevention scheme with a memory protection mechanism. Experimental results show that the extra overhead of memory protection is trivial in our applications. We also present a new VOPP implementation, Maotai 2.0, which offers advanced features such as deadlock avoidance, producer/consumer views, and system queues, in addition to the data race prevention scheme. The performance of Maotai 2.0 is evaluated and compared with modern programming models such as OpenMP and Cilk.
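The core idea of bundling mutual exclusion with data access can be loosely illustrated in Python. The sketch below is a toy model only: the class and method names are hypothetical and do not reflect the actual VOPP/Maotai API, and Maotai's real scheme relies on memory protection rather than an explicit check.

```python
import threading

class View:
    """Toy sketch of a VOPP-style view (hypothetical API, not Maotai's):
    the shared data may only be touched between acquire_view() and
    release_view(), so the lock and the data access are bundled."""

    def __init__(self, data):
        self._data = data
        self._lock = threading.Lock()
        self._held = False

    def acquire_view(self):
        # Entering the view takes the lock and exposes the data.
        self._lock.acquire()
        self._held = True
        return self._data

    def release_view(self):
        self._held = False
        self._lock.release()

    def read(self):
        # A data race is prevented by refusing access outside the view;
        # the real scheme enforces this with page-level memory protection.
        if not self._held:
            raise RuntimeError("view not acquired: access denied")
        return self._data

# Usage: every access to the shared counter goes through the view.
counter = View([0])
data = counter.acquire_view()
data[0] += 1
counter.release_view()
```

Because the data is reachable only through the view, forgetting to acquire the lock turns a silent data race into an immediate error.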
ISBN (Print): 9781538643426
Moore's law has reached its limits due to integration and economic issues in CPU design. The trend in chip design has therefore shifted toward increasing the number of cores rather than the density of circuits, and parallel programming is consequently attracting interest. In this context, functional languages are gaining popularity for parallel programming because of their inherent parallelism. This paper compares and analyzes two Haskell programming models for many-core environments. We developed applications based on the Eval monad, a Haskell parallel programming model, and on Cloud Haskell, respectively, to compare their performance. We tested the applications on both 32-core and 120-core CPUs. The experimental results show that on 32 cores the performance is similar, but on 120 cores Cloud Haskell runs 32% faster and scales 123% better. This result implies that Cloud Haskell is more appropriate for a large number of cores, while the Eval monad is more suitable for simple parallelism involving just tens of cores.
ISBN (Print): 9783319273082; 9783319273075
Nowadays many fields of science and engineering are evolving through the joint contribution of complementary fields. Computer science, and especially high performance computing, has become a key factor in the development of many research fields, establishing a new paradigm called computational science. Researchers and professionals from many different fields require knowledge of high performance computing, including parallel programming, to carry out fruitful work in their particular field. So, at Universitat Autonoma of Barcelona, an interdisciplinary master on Modeling for Science and Engineering was started 5 years ago to provide graduate students from different fields (Mathematics, Physics, Chemistry, Engineering, Geology, etc.) with a deep knowledge of the application of modeling and simulation. In this master, parallel programming is a compulsory subject because it is a key topic for these students. The concepts learnt in parallel programming must be applied to real applications, so a subject on Applied Modelling and Simulation has also been included. In this paper, the experience of teaching parallel programming in such an interdisciplinary master is presented.
ISBN (Print): 9798350364613; 9798350364606
Understanding the performance behavior of parallel applications is important in many ways, but doing so is not easy. Most open source analysis tools are written for the command line. We are building on these proven tools to provide an interactive performance analysis experience within Jupyter Notebooks when developing parallel code with MPI, OpenMP, or both. Our solution makes it possible to measure the execution time, perform profiling and tracing, and visualize the results within the notebooks. For ease of use, it provides both a graphical JupyterLab extension and a C++ API. The JupyterLab extension shows a dialog where the user can select the type of analysis and its parameters. Internally, this tool uses Score-P, Scalasca, and Cube to generate profiling and tracing data. This tight integration gives students easy access to profiling tools and helps them better understand concepts such as benchmarking, scalability, and performance bottlenecks. In addition to the technical development, the article presents hands-on exercises from our well-established parallel programming course. We conclude with a qualitative and quantitative evaluation with 19 students, which shows a positive effect of the tools on the students' perceived competence.
ISBN (Print): 9781538655559
Today parallel computing is essential for the success of many real-world applications and software systems. Nonetheless, most computer science undergraduate courses teach students how to think and program sequentially. Further, software professionals have complained that the computer science curriculum lags behind industry, failing to cover modern programming technologies such as parallel programming. The emphasis on parallel programming has become even more important due to the increasing adoption of horizontal scaling approaches to cope with massive datasets. In order to help students coming from a serial curriculum comprehend parallel concepts, we used an innovative approach that combined active learning, visualizations, examples, discussions, and practical exercises. Further, we conducted an experiment to examine the effect of active learning on students' understanding of parallel programming. Results indicate that the students who were actively engaged with the material performed better in terms of understanding parallel programming concepts than other students.
ISBN (Digital): 9783319751788
ISBN (Print): 9783319751788; 9783319751771
The use of key parallel-programming patterns has proved to be extremely helpful for mastering difficult concurrent and parallel programming concepts and the associated syntactic constructs. The method suggested here substantially changes the more traditional approaches to teaching and learning programming. In our approach, students are first introduced to concurrency problems through a selected set of preliminary program code patterns. Each pattern also has a series of tests with selected samples that enable students to discover the most common problem cases and then the solutions to be applied. In addition, this paper presents the results of an informal assessment carried out by the students of a course on concurrent and real-time programming that belongs to the computer engineering (CE) degree. The results show that students now feel more actively involved in lectures and practical lessons, and thus make better use of their time and gain a better understanding of concurrency topics than was considered possible before the proposed method was implemented at our university.
ISBN (Print): 142440049X
This paper deals with the use of Beowulf clusters in higher education. The study and evaluation of a Beowulf cluster at the Department of Electronics Engineering of the Technological Educational Institute of Athens is described. The design methodologies, the performance measurements, and the experiments are discussed. This work enabled undergraduate and postgraduate students to study parallel systems and laid the groundwork for introducing a parallel computing module into the undergraduate curriculum.
ISBN (Print): 0769517315
In this paper, we present GASPARD (Graphical Array Specification for Parallel and Distributed computing), our visual programming environment devoted to the development of parallel applications. GASPARD mixes the task- and data-parallelism paradigms of parallel computing to achieve a simple programming interface. We use the printed-circuit metaphor: the programmer specifies tasks and instantiates them by plugging them into slots (task parallelism). Data parallelism is achieved by specifying the data each task uses. By mixing textual and visual programming, we achieve a convenient interface useful for scientific programming. The interface is also well suited for meta-computing deployment. This kind of programming is very useful for numerical simulation.
ISBN (Print): 9781479986705
Since the appearance of parallel processors and their rapid diversification across a broad spectrum, developers must phrase algorithms in a parallel manner using originally imperative, and thus inappropriate, high-level languages. Language extensions as well as highly complex debugging methods (e.g., profilers) for handling concurrent and non-deterministic execution are therefore continuously being developed. Most tools, however, suffer from inflexibility and platform dependencies. Moreover, binary-instrumenting profilers incur high overhead, influencing and thus distorting the runtime behavior. This may even hide critical behavior, so developers still rely on their experience and often manually insert measurements in their software code (in-line profiling). In this work, we propose a platform-independent abstraction layer enabling unified parallelization and a runtime-flexible choice of the actual parallelization framework (e.g., OpenMP, TBB). Based on a source-code-aware point of view, we further introduce an automated in-line profiling methodology to allow an objective rating of the parallelization success. Moreover, we automatically extract runtime-influencing aspects and, as an example, apply these methodologies to implementations of two different video-based driver-assistance algorithms on two different processor types.
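The two ideas in this abstract — a runtime-switchable parallelization backend and in-line profiling — can be sketched in a few lines of Python. This is a minimal illustration of the concept only: the backend names and the timing hook are stand-ins for the paper's abstraction layer over OpenMP/TBB, not its actual interface.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def parallel_map(func, items, backend="serial", workers=4):
    """Run the same loop body under a backend chosen at run time,
    timing the region in-line instead of using an external,
    binary-instrumenting profiler (illustrative names only)."""
    start = time.perf_counter()
    if backend == "serial":
        results = [func(x) for x in items]
    elif backend == "threads":
        # Stand-in for a real framework such as OpenMP or TBB.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(func, items))
    else:
        raise ValueError(f"unknown backend: {backend}")
    elapsed = time.perf_counter() - start
    # In-line profiling: the measurement lives next to the code it rates.
    print(f"[profile] backend={backend} time={elapsed:.6f}s")
    return results

# The same call site can be re-rated under either backend.
squares = parallel_map(lambda x: x * x, range(8), backend="threads")
```

Because the backend is a plain parameter, comparing frameworks on the same workload reduces to changing one argument rather than rewriting the parallel region.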
ISBN (Digital): 9783031238215
ISBN (Print): 9783031238208; 9783031238215
The Python programming language has established itself as a popular alternative for implementing scientific computing workflows. Its massive adoption across a wide spectrum of disciplines has created a strong community that develops tools for solving complex problems in science and engineering. In particular, there are several parallel programming libraries for Python codes that target multicore processors. We compare the performance and scalability of three popular libraries: Multiprocessing, PyMP, and Torcpy. We use the Particle-in-Cell (PIC) method as a benchmark. This method is an attractive option for understanding physical phenomena, especially in plasma physics. A pre-existing PIC code implementation was modified to integrate Multiprocessing, PyMP, and Torcpy. The three tools were tested on a many-core and on a multicore processor by running different problem sizes. The results consistently indicate that PyMP has the best performance; Multiprocessing showed similar behavior but with longer execution times, and Torcpy did not scale properly when the number of workers was increased. Finally, a just-in-time (JIT) compilation alternative was studied using Numba, showing execution time reductions of up to 43%.
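The kind of data-parallel particle update that such a benchmark parallelizes can be sketched with the standard multiprocessing library alone (the PyMP and Torcpy variants are not shown). This is a simplified stand-in: the constant field, the time step, and the function names are hypothetical, whereas a real PIC code interpolates the field from a grid and deposits charge back.

```python
import multiprocessing as mp

def push(chunk, dt=0.1, e_field=1.0):
    """Particle push on one chunk: each particle is (position, velocity).
    The constant field is a stand-in for the grid-interpolated field a
    real PIC code would use."""
    out = []
    for x, v in chunk:
        v_new = v + e_field * dt   # accelerate in the field
        x_new = x + v_new * dt     # then advance the position
        out.append((x_new, v_new))
    return out

def parallel_push(particles, nproc=2):
    # Split the particle list into one chunk per worker process,
    # push the chunks in parallel, then flatten the results.
    chunks = [particles[i::nproc] for i in range(nproc)]
    with mp.Pool(nproc) as pool:
        results = pool.map(push, chunks)
    return [p for chunk in results for p in chunk]

if __name__ == "__main__":
    parts = [(float(i), 0.0) for i in range(8)]
    moved = parallel_push(parts)
```

Since the particles in one time step are independent, the update is embarrassingly parallel; the interesting scaling questions, as the paper's results suggest, come from the overhead each library adds to distributing and collecting the chunks.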