In this study, we aim to optimize Hadoop parameters to improve the performance of BioPig on Amazon Web Services (AWS). BioPig is a toolkit for large-scale sequencing data analysis built on Hadoop and Pig that en...
The technical inflection points on the route to Exascale and the existing talent gap in Computational Science and HPC are well publicized. This paper describes the development of an xMOOC-format course with the a...
In order to reduce the complexity of traditional multithreaded parallel programming, this paper explores a new task-based parallel programming approach using the Microsoft .NET Task Parallel Library (TPL). Firstly, this paper proposes a custom data partitioning optimization method to achieve efficient data parallelism, and applies it to matrix multiplication. The result of the application supports the custom data partitioning optimization method. Then we develop a task parallel application, Image Blender, which illustrates the efficiency and pitfall aspects associated with task parallelism. Finally, the paper analyzes the performance of our applications. Experimental results show that TPL can dramatically alleviate programmer burden and boost the performance of programs with its task-based parallel programming mechanism.
In this paper, we present a novel approach towards providing compiler generated runtime means for dynamic adaptation. The key novelty of the proposed solution is a complete separation between the runtime adaptation me...
ISBN: 9781479932801 (print)
In order to reduce the complexity of traditional multithreaded parallel programming, this paper explores a new task-based parallel programming approach using the Microsoft .NET Task Parallel Library (TPL). Firstly, this paper proposes a custom data partitioning optimization method to achieve efficient data parallelism, and applies it to matrix multiplication. The result of the application supports the custom data partitioning optimization method. Then we develop a task parallel application, Image Blender, which illustrates the efficiency and pitfall aspects associated with task parallelism. Finally, the paper analyzes the performance of our applications. Experimental results show that TPL can dramatically alleviate programmer burden and boost the performance of programs with its task-based parallel programming mechanism.
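The abstract above does not say how its custom data partitioning works, so the following is only a minimal sketch of the general idea: split the rows of the result matrix into contiguous, near-equal blocks and run one task per block. It is written in Python with `concurrent.futures` rather than the .NET TPL used in the paper, and the names `row_blocks`, `multiply_block`, and `parallel_matmul` are hypothetical. In CPython the global interpreter lock prevents pure-Python threads from speeding up this arithmetic, so the sketch shows the partitioning structure only, not the reported performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def row_blocks(n_rows, n_blocks):
    """Yield at most n_blocks contiguous, near-equal ranges covering range(n_rows)."""
    n_blocks = max(1, min(n_blocks, n_rows))
    base, extra = divmod(n_rows, n_blocks)
    start = 0
    for i in range(n_blocks):
        # The first `extra` blocks take one additional row each.
        size = base + (1 if i < extra else 0)
        yield range(start, start + size)
        start += size


def multiply_block(a, b, rows):
    """Compute the rows of the product a * b that belong to one block."""
    cols = list(zip(*b))  # columns of b
    return [[sum(x * y for x, y in zip(a[i], col)) for col in cols] for i in rows]


def parallel_matmul(a, b, n_blocks=4):
    """Multiply matrices a and b (lists of rows), one task per row block."""
    blocks = list(row_blocks(len(a), n_blocks))
    with ThreadPoolExecutor(max_workers=len(blocks)) as pool:
        # pool.map preserves block order, so rows come back in order.
        parts = pool.map(lambda rows: multiply_block(a, b, rows), blocks)
        return [row for part in parts for row in part]
```

Choosing a few large blocks instead of one task per row keeps scheduling overhead low, which is the usual motivation for a custom partitioner in task libraries.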
Seismic wavefront simulation is a common method to understand the composition of the earth below the surface, especially for hydrocarbon exploration. One of these simulation methods is the wavefront construction algorithm. In this thesis, we reduced the load imbalance in a parallel implementation of the wavefront construction algorithm. We added a generic redistribution framework for data structures in the C++ parallel library STAPL. We present a redistribution algorithm for the parallel wavefront construction application which uses the recursive coordinate bisection method to find a near-optimal data distribution. This algorithm leveraged the added redistribution features in STAPL to improve the running time of our application. We compared the running time of the application with and without redistribution on different geophysics models. We show that the proposed redistribution provides up to 9.45x speedup on a Cray XE6m cluster and 11.85x speedup on an IBM BlueGene/Q cluster.
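Recursive coordinate bisection, named in the abstract above, can be sketched in a few lines. This is a generic, unweighted version in Python and not the STAPL implementation: it assumes every point costs the same amount of work, cuts along the axis with the largest extent, and splits the sorted points in proportion to the number of parts requested on each side. The function name `rcb` is hypothetical.

```python
def rcb(points, n_parts):
    """Split points (tuples of coordinates) into exactly n_parts lists, n_parts >= 1."""
    if n_parts <= 1:
        return [list(points)]
    if len(points) <= 1:
        # Too few points to cut: pad with empty parts.
        return [list(points)] + [[] for _ in range(n_parts - 1)]
    dims = range(len(points[0]))
    # Cut perpendicular to the axis along which the points spread the most.
    axis = max(dims, key=lambda d: max(p[d] for p in points) - min(p[d] for p in points))
    ordered = sorted(points, key=lambda p: p[axis])
    left_parts = n_parts // 2
    right_parts = n_parts - left_parts
    # Place the cut so each side gets a share of points proportional to its parts.
    cut = len(ordered) * left_parts // n_parts
    return rcb(ordered[:cut], left_parts) + rcb(ordered[cut:], right_parts)
```

For example, `rcb([(x, 0.0) for x in range(8)], 4)` returns four groups of two neighbouring points each. A load-aware variant would place the cut at the weighted median of per-point cost instead of the point count.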
The need for students to learn parallel programming has become an important issue due to the rapid growth of the parallel computing field. The topic has been included in Computer Science curricula, but students have difficulty designing efficient MPI parallel applications. We present a novel methodology for teaching parallel programming centered on improving the parallel applications written by students, drawing on the experience they gain during classes. The methodology integrates theoretical and practical sessions focused on teaching two parallel paradigms, Master/Worker and SPMD. These paradigms were selected because of their different communication and computation behaviors, which pose challenges for students when they try to improve application performance metrics. Our methodology allows students to discover their own errors and learn how to correct them. In addition, students analyze the issues and advantages of the applications they design in order to improve the performance metrics. Applying this methodology led to significant progress in the parallel applications designed by students: we observed an improvement of around 47% in the students' parallel programming skill when they design parallel applications.
ISBN: 9781450340342 (print)
In parallel systems, memory consistency models and cache coherence protocols establish the rules specifying which values will be visible to each instruction of parallel programs. Despite their central importance, verifying their correctness has remained a major challenge, due both to informal or incomplete specifications and to difficulties in scaling verification to cover their operations comprehensively. While coherence and consistency are often specified and verified independently at an architectural level, many systems implement performance enhancements that tightly interweave coherence and consistency at a microarchitectural level in ways that make verification of consistency difficult. This paper introduces CCICheck, a tool and technique supporting static verification of the coherence-consistency interface (CCI). CCICheck enumerates and checks families of microarchitectural happens-before (μhb) graphs that describe how a particular coherence protocol combines with a particular processor's pipelines and memory hierarchy to enforce the requirements of a given consistency model. To support tractable CCI verification, CCICheck introduces the ViCL (Value in Cache Lifetime), an abstraction which allows the μhb graphs to cleanly represent CCI events relevant to consistency verification, including demand fetching, cache line invalidation, coherence protocol windows of vulnerability, and partially incoherent cache hierarchies. We implement CCICheck as an automated tool and demonstrate its use on a number of case studies. We also show its tractability across a wide range of litmus tests.
ISBN: 9781450336697 (print)
Higher order functions provide an elegant way to express algorithms designed for implementation in hardware [1, 6-9]. By showing examples of both classic and new algorithms, I will explain why higher order functions deserve to be studied. Next, I will consider the extent to which ideas from functional programming, and associated formal verification methods, have influenced hardware design in practice [3-5, 10]. What can we learn from looking back? You might ask "Why are methods of hardware design still important to our community?". Maybe we should just give up? One reason for not giving up is that hardware design is really a form of parallel programming. And here there is still a lot to do! Inspired by Blelloch's wonderful invited talk at ICFP 2010 [2], I still believe that functional programming has much to offer in the central question of how to program the parallel machines of today, and, more particularly, of the future. I will briefly present some of the areas where I think that we are poised to make great contributions. But maybe we need to work harder on getting our act together?
In this paper, parallel processing techniques are employed to improve the performance of stochastic dynamic programming applied to the long-term operation planning of electrical power systems. The hydroelectric plants are grouped into energy equivalent reservoirs, and the expected cost functions are modeled by a piecewise linear approximation obtained by means of the Convex Hull algorithm. In order to validate the proposed methodology, data from the Brazilian electrical power system is utilized. (C) 2013 Elsevier B.V. All rights reserved.
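The abstract above gives no detail on how the convex hull yields the piecewise linear cost approximation, so the following is a sketch under simplifying assumptions: a single state variable (for instance, the stored energy of one equivalent reservoir) and a set of sampled (state, expected cost) points. The lower convex hull of those samples, computed here with the monotone chain method, is a convex piecewise linear function lying on or below every sample. The names `lower_hull` and `evaluate` are hypothetical, and the paper's actual formulation may be multi-dimensional.

```python
from bisect import bisect_right


def _cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counterclockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_hull(samples):
    """Return the vertices of the lower convex hull of (x, y) samples, sorted by x."""
    lowest = {}
    for x, y in samples:
        # Keep only the cheapest sample for each x so the result is a function of x.
        lowest[x] = min(y, lowest.get(x, y))
    hull = []
    for p in sorted(lowest.items()):
        # Drop the last vertex while it lies on or above the segment hull[-2] -> p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def evaluate(hull, x):
    """Interpolate the piecewise linear function at x, clamped to the sampled range."""
    xs = [p[0] for p in hull]
    if x <= xs[0]:
        return hull[0][1]
    if x >= xs[-1]:
        return hull[-1][1]
    i = bisect_right(xs, x)
    (x0, y0), (x1, y1) = hull[i - 1], hull[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
```

Each hull segment corresponds to one linear piece, so the approximation can be stored as a short list of vertices and evaluated with a binary search.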