ISBN: (print) 9781538655559
We employ probabilistic causality analysis to study the performance data of 301 students from the upper-level undergraduate parallel programming class at the University of Central Florida. To our surprise, we discover that good performance in our lower-level undergraduate programming classes, CS-I and CS-II, is not a significant causal factor contributing to good performance in our parallel programming class. On the other hand, good performance in systems classes such as Operating Systems, Information Security, Computer Architecture, Object Oriented Software, and Systems Software, coupled with good performance in theoretical classes such as Introduction to Discrete Structures, Artificial Intelligence, and Discrete Structures-II, is a strong indicator of good performance in our upper-level undergraduate parallel programming class. We believe that such causal analysis may be useful in identifying whether parallel and distributed computing concepts have effectively penetrated the lower-level computer science classes at an institution.
ISBN: (print) 9781509015405
A hash function maps a message of arbitrary length to a shorter, fixed-length string called the message digest. Inevitably, many different messages hash to the same or a similar digest; we call this a collision or a partial collision, respectively. Using multiple processors at the CUNY High Performance Computing Center, we locate partial collisions for MD5 and SHA-1 by brute-force parallel programming in C with the MPI library. The brute-force method of finding a second-preimage collision entails systematically computing all of the permutations of the target preimage together with their digests and Hamming distances. We vary the size of the target string and the number of allocated processors and examine the effect of these variables on finding partial collisions. The results show that, for the same message space, the search time for partial collisions is roughly halved for each doubling of the number of processors, and that longer messages yield better partial collisions.
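The brute-force search described in this abstract can be sketched as follows. This is a minimal illustration only, not the authors' C/MPI code: it is written in Python, it assumes that the Hamming distance is measured in bits between the candidate's digest and the target's digest, and it replaces MPI ranks with a sequential loop over a hypothetical `rank`/`size` pair. The function names are invented for this sketch.

```python
import hashlib
from itertools import permutations


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def search_slice(target: bytes, rank: int, size: int):
    """Scan the share of permutations assigned to one worker.

    Permutations are enumerated in a fixed order and dealt out round-robin,
    so `size` workers cover the whole space without overlap (an assumed
    partitioning scheme, chosen for simplicity).
    """
    target_digest = hashlib.md5(target).digest()
    best_distance = len(target_digest) * 8 + 1  # worse than any real distance
    best_candidate = None
    for index, chars in enumerate(permutations(target)):
        if index % size != rank:
            continue
        candidate = bytes(chars)
        if candidate == target:
            continue  # a second preimage must differ from the target
        distance = hamming_distance(hashlib.md5(candidate).digest(), target_digest)
        if distance < best_distance:
            best_distance, best_candidate = distance, candidate
    return best_distance, best_candidate


def search(target: bytes, size: int):
    """Sequential stand-in for running `size` ranks and reducing with a minimum."""
    results = [search_slice(target, rank, size) for rank in range(size)]
    return min(results, key=lambda result: result[0])
```

Because each worker scans a disjoint share of the permutations, the work per worker shrinks in proportion to `size`, which is consistent with the roughly halved search time per doubling of processors that the abstract reports.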
ISBN: (print) 3540221190
We study how the concept of generic programming using C++ templates, realized in the Standard Template Library (STL), can be efficiently exploited in the specific domain of parallel programming. We present our approach, implemented in the DatTeL data-parallel library, which allows simple programming for various parallel architectures while staying within the paradigm of classical C++ template programming. The novelty of the DatTeL is the use of higher-order parallel constructs, skeletons, in the STL-context and the easy extensibility of the library with new, domain-specific skeletons. We describe the principles of our approach based on skeletons, and explain our design decisions and their implementation in the library. The presentation is illustrated with a case study - the parallelization of a generic algorithm for carry-lookahead addition. We compare the DatTeL to related work and report both absolute performance and speedups achieved for the case study on parallel machines with shared and distributed memory.
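To illustrate why carry-lookahead addition suits a skeleton-based formulation, the sketch below expresses it as a map followed by a scan with an associative operator. It is written in Python, not C++, and does not use or reproduce the DatTeL API; `carry_op`, `add_bits`, and the helper functions are names invented here, and `itertools.accumulate` is a sequential stand-in for a parallel scan skeleton.

```python
from itertools import accumulate


def carry_op(low: str, high: str) -> str:
    """Combine carry statuses: a propagating position passes on the lower status.

    The operator is associative, which is what lets a scan evaluate it in parallel.
    """
    return low if high == "p" else high


def add_bits(a, b):
    """Add two equal-length, non-empty bit lists (least significant bit first)."""
    # Map step: classify each position as generate, propagate, or kill.
    status = ["g" if x & y else "p" if x | y else "k" for x, y in zip(a, b)]
    # Scan step: resolve the carry leaving every position.
    scanned = list(accumulate(status, carry_op))
    # An unresolved "p" means the chain reaches bit 0, where the carry-in is 0.
    carries_out = [1 if s == "g" else 0 for s in scanned]
    carries_in = [0] + carries_out[:-1]
    # Map step: each sum bit depends only on its own inputs and its carry-in.
    total = [x ^ y ^ c for x, y, c in zip(a, b, carries_in)]
    return total + [carries_out[-1]]


def to_bits(value: int, width: int):
    """Little-endian bit list of `value`, padded to `width` bits."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits) -> int:
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))
```

For example, `from_bits(add_bits(to_bits(11, 4), to_bits(6, 4)))` evaluates to 17.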
ISBN: (print) 9781315684895; 9781138028142
With the current prevalence of multi-core processors in SMP cluster architectures, mixed-mode programming, using both MPI and OpenMP in the same application, is becoming increasingly important. In this paper we discuss three methods for the parallelization of such algorithms, namely pure MPI parallelization, fine-grain hybrid MPI/OpenMP parallelization, and coarse-grain MPI/OpenMP parallelization. We propose a new hybrid parallel programming method based on the architecture hierarchy of SMP clusters. We design a hierarchical parallel algorithm for the N-body problem and compare its performance with that of the traditional hybrid parallel algorithm on the Dawning 5000A cluster. The results indicate that the hierarchical hybrid parallel algorithm has better scalability and speed.
ISBN: (print) 0818681187
We exploited the recent advances in Internet connectivity and Web technologies for building Web-based parallel programming environments (WPPEs) that facilitate the development and execution of parallel programs on remote high-performance computers. A Web browser running on the user's machine provides a user-friendly interface to server-site user accounts and allows the use of parallel computing platforms and software in a convenient manner. The user may create, edit, and execute files through this Web browser interface. This new Web-based client-server architecture has the potential of being used as a future front-end to high-performance computer systems. We discuss the design and implementation of several prototype WPPEs that are currently in use at the Northeast Parallel Architectures Center and the Cornell Theory Center. These initial prototypes support high-level parallel programming with Fortran 90 and High Performance Fortran (HPF), as well as explicit low-level programming with Message Passing Interface (MPI). We detail the lessons learned during the development process and outline the tradeoffs of various design choices in the realization of the design. We especially concentrate on providing server-site user accounts, mechanisms to access those accounts through the Web, and the Web-related system security issues.
ISBN: (print) 9781424452910
Parallel programming is notoriously difficult. This becomes even more critical as multicore processors bring parallel computing into the mainstream. In order to ease the difficulty, tools have been designed that help the programmer with some aspects of parallelisation. Unfortunately, the programmer is mostly left alone when it comes to the difficult task of dependence analysis among the subtasks to be executed concurrently. This paper presents a new visual tool that supports the programmer with dependence analysis in loops. This is very useful in combination with an automatically parallelising compiler or when loops are parallelised with OpenMP. The tool displays on-the-fly the dependences between the statements of the loop nest on which the developer is currently working. To maximise its usefulness, the tool is unobtrusive, customisable, and flexible, and it is based on dependence analysis theory. A prototype was implemented for the Eclipse IDE as a plug-in that seamlessly integrates into the normal development process. The evaluation of the tool, including an evaluation against cognitive dimensions, demonstrates its usability and usefulness.
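As a small illustration of what a loop dependence is, the sketch below finds loop-carried dependences in a single-statement loop of the form `a[i + write_offset] = f(a[i + read_offset])`. It is not the algorithm of the tool described in this abstract: it simply replays the accesses iteration by iteration, and the function name and loop shape are assumptions made for this example.

```python
def loop_carried_dependences(n: int, write_offset: int, read_offset: int):
    """Return the set of (kind, distance) pairs carried across iterations of

        for i in range(n):
            a[i + write_offset] = f(a[i + read_offset])

    "flow" means a later iteration reads what an earlier one wrote;
    "anti" means a later iteration overwrites what an earlier one read.
    """
    last_write = {}  # array index -> iteration that wrote it
    last_read = {}   # array index -> iteration that last read it
    found = set()
    for i in range(n):
        read_index = i + read_offset
        write_index = i + write_offset
        # Within one iteration the read happens before the write.
        if read_index in last_write and last_write[read_index] < i:
            found.add(("flow", i - last_write[read_index]))
        last_read[read_index] = i
        if write_index in last_read and last_read[write_index] < i:
            found.add(("anti", i - last_read[write_index]))
        last_write[write_index] = i
    return found
```

An empty result means that no iteration depends on another, so the iterations can run concurrently; a `"flow"` entry means the loop cannot be parallelised as written.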
ISBN: (print) 9783030700058; 9783030700065
Since parallel programming is much more complex and difficult than sequential programming, it is more challenging to achieve the same software quality in a parallel context. High-level parallel programming models, if implemented as software frameworks, could increase productivity and reliability. Extensibility and adaptability to different platforms are important requirements for such a framework, and this paper reflects on these requirements and their relation to the software engineering methodologies that could put them into practice. All of these are exemplified by a Java framework, JPLF; it is a high-level parallel programming approach based on the model introduced by the theories associated with PowerLists, and it respects the analysed requirements. The design of JPLF is analysed by explaining the design choices and highlighting the design patterns and design principles applied.
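For readers unfamiliar with PowerLists, the sketch below shows the two ways the theory splits a list whose length is a power of two: `tie` (first half and second half) and `zip` (even-indexed and odd-indexed elements). It is written in Python and is not the JPLF API; all function names are invented here. List reversal is used as the example because it can be defined recursively over either split, and the two halves of each split can be processed independently.

```python
def tie_split(xs):
    """Split into first half and second half (inverse of the tie operator)."""
    half = len(xs) // 2
    return xs[:half], xs[half:]


def zip_split(xs):
    """Split into even-indexed and odd-indexed elements (inverse of zip)."""
    return xs[0::2], xs[1::2]


def zip_join(p, q):
    """Interleave two equal-length lists, starting with the first."""
    out = []
    for a, b in zip(p, q):
        out.extend((a, b))
    return out


def check_power_of_two(xs):
    """PowerLists are defined only for lengths 1, 2, 4, 8, ..."""
    n = len(xs)
    if n == 0 or n & (n - 1):
        raise ValueError("PowerList length must be a power of two")


def rev_tie(xs):
    """Reversal by tie: rev(p | q) = rev(q) | rev(p)."""
    check_power_of_two(xs)
    if len(xs) == 1:
        return list(xs)
    p, q = tie_split(xs)
    return rev_tie(q) + rev_tie(p)  # the two recursive calls are independent


def rev_zip(xs):
    """Reversal by zip: rev(p zip q) = rev(q) zip rev(p)."""
    check_power_of_two(xs)
    if len(xs) == 1:
        return list(xs)
    p, q = zip_split(xs)
    return zip_join(rev_zip(q), rev_zip(p))
```

Both definitions give the same result, for example `[4, 3, 2, 1]` for the input `[1, 2, 3, 4]`.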
ISBN: (print) 9783319273082; 9783319273075
Many curricula for undergraduate studies in computer science provide a lecture on the fundamentals of parallel programming, such as multi-threaded computation on shared memory architectures using POSIX threads or OpenMP. The complex structure of parallel programs can be challenging, especially for inexperienced students. Thus, there is a latent need for software supporting the learning process. Subsequent lectures may cover more advanced parallelization techniques such as the Message Passing Interface (MPI) and the Compute Unified Device Architecture (CUDA) languages. Unfortunately, the majority of students cannot easily access MPI clusters or modern hardware accelerators in order to effectively develop parallel programming skills. To overcome this, we present an interactive tool to aid both educators and students in the learning process. This paper describes the "System for AUtomated Code Evaluation" (SAUCE), a web-based open source (available under the AGPL-3.0 license at https://***/moschlar/SAUCE) application for programming assignment evaluation, and elaborates on its features specifically designed for the teaching of parallel programming. This tool enables educators to provide the required programming environments with a low barrier to entry, since it is usable with just a web browser. SAUCE allows for immediate feedback and thus can be used interactively in classroom settings.
In the past, the persistent semiconductor problems of operating temperature and power consumption limited the performance growth of single-core microprocessors. Microprocessor vendors hence adopted multicore chip organizations with parallel processing, because the new technology promises higher speed and lower power consumption. In a short time, this trend spread first through CPU development and then to other components such as GPUs. Modern GPUs are very efficient at manipulating computer graphics, and their highly parallel structure makes them even more effective than general-purpose CPUs for a range of complex graphical algorithms. However, multicore processor technology brought both a revolution and unavoidable disruption to programmers. Multicore processors offer high performance; however, parallel processing brings not only an opportunity but also a challenge. Efficiency, and the way the programmer or compiler explicitly parallelizes the software, are the keys to enhancing performance on multicore chips. In this paper, we propose a parallel programming approach using hybrid CUDA, OpenMP, and MPI programming. Two verification experiments are presented in the paper. In the first, we verify the availability and correctness of auto-parallel tools and discuss the performance issues on CPU, GPU, and embedded systems. In the second, we verify how hybrid programming can improve performance. Copyright (C) 2016 John Wiley & Sons, Ltd.
ISBN: (print) 9783642240997
Deterministic parallelism has become an increasingly attractive concept: a deterministic parallel program may be easier to construct, debug, understand, and maintain. However, there exist many different definitions of "determinism" for parallel programming. Many existing definitions have not yet been fully formalized, and the relationships among these definitions are still unclear. We argue that formalism is needed, and that history-based semantics as used, for example, to define the Java and C++ memory models provides a useful lens through which to view the notion of determinism. As a first step, we suggest several history-based definitions of determinism. We discuss some of their comparative advantages, prove containment relationships among them, and identify programming idioms that ensure them. We also propose directions for future work.