ISBN (print): 9781595937711
This paper argues for an implicitly parallel programming model for many-core microprocessors and provides initial technical approaches towards this goal. In an implicitly parallel programming model, programmers maximize algorithm-level parallelism, express their parallel algorithms by asserting high-level properties on top of a traditional sequential programming language, and rely on parallelizing compilers and hardware support to perform parallel execution under the hood. In such a model, compilers and related tools require much more advanced program analysis capabilities and programmer assertions than are currently available, so that a comprehensive understanding of the input program's concurrency can be derived. Such an understanding is then used to drive automatic or interactive parallel code generation tools for a diverse set of parallel hardware organizations. The chip-level architecture and hardware should maintain parallel execution state in such a way that a strictly sequential execution state can always be derived for the purpose of verifying and debugging the program. We argue that implicitly parallel programming models are critical for addressing the software development crisis and software scalability challenges of many-core microprocessors.
ISBN (print): 9780769535449
Today most systems in high-performance computing (HPC) feature a hierarchical hardware design: shared-memory nodes with several multi-core CPUs are connected via a network infrastructure. Parallel programming must combine distributed-memory parallelization on the node interconnect with shared-memory parallelization inside each node. We describe the potentials and challenges of the dominant programming models on hierarchically structured hardware: pure MPI (Message Passing Interface), pure OpenMP (with distributed shared memory extensions), and hybrid MPI+OpenMP in several flavors. We pinpoint cases where a hybrid programming model can indeed be the superior solution because of reduced communication needs and memory consumption, or improved load balance. Furthermore, we show that machine topology has a significant impact on performance for all parallelization strategies and that topology awareness should be built into all applications in the future. Finally, we give an outlook on possible standardization goals and extensions that could make hybrid programming easier to do with performance in mind.
ISBN (print): 9781905088423
This contribution presents an MPI-based parallel computational framework for simulation and gradient-based structural optimization of geometrically nonlinear, large-scale structural finite element models. The field of computational structural analysis is gaining more and more importance in product design. To get an impression of possible design improvements already in an early planning phase, an efficient optimization tool is desired that requires little modelling effort. This demand can be satisfied by a combined analysis and optimization tool working on the same model. For this purpose, finite-element-based optimization is an excellent approach, leading to the highest possible diversity within the optimization process.
ISBN (print): 9781450390705
Teaching parallel programming to undergraduate CS students is a challenging task, as many of the concepts are highly abstract and difficult to grasp. OpenMP is often used to simplify the parallelization of programs by allowing one to parallelize incrementally using concise and expressive directives. Unfortunately, OpenMP is not natively available in Java. Basic support for OpenMP-like directives can, however, be obtained in Java using the Pyjama compiler and runtime. I report on my experience introducing parallel programming in Java with Pyjama in a small Data Structures class. The material is presented to students in the form of parallel programming patternlets embedded in an interactive notebook with which students can experiment. Formative and summative assessments of the module's effectiveness are performed. This pilot run of the module yielded mixed results, yet valuable insight was gained regarding possible future approaches.
ISBN (print): 9781538665220
The importance of high-performance computing (HPC) motivates an increasing number of students to study parallel programming. However, a major obstacle for students learning parallel programming is the lack of large-scale computing resources and of feedback on their programs. In this paper, we design and implement an online HPC educational programming system that provides free computing resources for students and supports multiple HPC programming languages. Students can easily write HPC programs on our platform and submit them to the Tianhe-2 supercomputer for execution. The execution results are returned and displayed in the user's web browser. In addition, our system supports code evaluation and debugging feedback by integrating mpiP and TAU into the system kernel. The evaluation results and feedback allow students to look into the execution details of their programs and further optimize their submissions.
ISBN (print): 1880446359
The CO2P3S parallel programming system uses design patterns and object-oriented programming to reduce the complexities of parallel programming. The system generates correct frameworks from pattern template specifications and provides a layered programming model to address both correctness and openness. This paper describes the highest level of abstraction in CO2P3S, using two example programs to demonstrate the programming model and the supported patterns. Further, we introduce phased parallel design patterns, a new class of patterns that allows temporal phase relationships in a parallel program to be specified, and provide two patterns in this class. Our results show that the frameworks can be used to implement parallel programs quickly, reusing sequential code where possible. The resulting parallel programs provide substantial performance gains over their sequential counterparts.
ISBN (print): 0769518753
This paper presents AssistConf, a graphical user interface designed to configure an ASSIST program and to run it on a Grid platform. ASSIST (A Software development System based upon Integrated Skeleton Technology) is a new programming environment for the development of parallel and distributed high-performance applications. The main goals of ASSIST are to allow high-level programmability and software productivity for complex multidisciplinary applications, and performance portability across different platforms, including homogeneous parallel machines, cluster/Beowulf systems, heterogeneous clusters, and computational Grids. AssistConf is used to configure the ASSIST program and to establish a mapping between the program modules and the most suitable candidate machines in the Grid to execute them. It simplifies the creation of the ASSIST XML configuration file, giving users a graphical view of the XML file produced by the ASSIST compilation phase and permitting easy identification of the machines to be used for execution. Finally, the configuration file produced by AssistConf is used as input to the assistrun command, which drives the execution of the ASSIST program over the Grid.
This paper advocates a configuration approach to parallel programming for distributed memory multicomputers, in particular, arrays of transputers. The configuration approach prescribes the rigorous separation of the l...
ISBN (print): 9798350387032; 9798350387025
In this work, an efficient multi-threaded algorithm for calculating the power flow in electricity distribution networks is implemented using recursion and parallel programming. With the integration of renewable energy, energy storage systems, and distributed generation, fast power flow simulation becomes a crucial factor in finding the best solution in the shortest possible time. We propose the direct use of graph theory to represent distribution network topologies. In this data structure, the traversal algorithms are inherently recursive, enabling the development of parallel algorithms that compute the power flow faster and more efficiently. Results on an 809-bus test system show that the implementation provides an additional computational efficiency of 32% with recursion techniques and 27% with parallel programming; after accounting for the cost of thread allocation, the combined gain reaches 50%.