High Performance Fortran (HPF) offers an attractive high-level language interface for programming scalable parallel architectures, providing the user with directives for specifying data distribution and delegating to the compiler the task of generating an explicitly parallel program. Available HPF compilers handle regular codes quite efficiently, but dramatic performance losses may occur for applications based on highly irregular, dynamically changing data structures and access patterns. In this paper we introduce the Vienna Fortran Compiler (VFC), a new source-to-source parallelization system for HPF+, an optimized version of HPF that addresses the requirements of irregular applications. In addition to extended data distribution and work distribution mechanisms, HPF+ provides the user with language features for specifying information that decisively influences a program's performance. These features comprise data locality assertions, non-local access specifications, and the possibility of reusing runtime-generated communication schedules of irregular loops. Performance measurements of kernels from advanced applications demonstrate that with a high-level data-parallel language such as HPF+, performance close to that of hand-written message-passing programs can be achieved even for highly irregular codes.
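The reuse of runtime-generated communication schedules mentioned in this abstract can be illustrated with a small inspector/executor sketch. It is not VFC or HPF+ code: the function names, the zero-based BLOCK distribution, and the simulation of communication by direct reads from a global list are all assumptions made for illustration. The inspector scans the indirection array once to find which non-local elements are needed from which owner; the executor can then be run repeatedly with the same schedule as long as the indirection array does not change.

```python
# Hypothetical sketch of the inspector/executor pattern for an irregular
# gather out[k] = x[idx[k]]. Communication is simulated, not performed.

def owner(i, n, p):
    """Owner of global index i when n elements are BLOCK-distributed over p processes."""
    block = -(-n // p)  # ceiling division gives the block size
    return i // block

def build_schedule(idx, n, p, me):
    """Inspector: collect the non-local indices process `me` needs, grouped by owner."""
    schedule = {}
    for g in idx:
        q = owner(g, n, p)
        if q != me:
            schedule.setdefault(q, set()).add(g)
    return {q: sorted(s) for q, s in schedule.items()}

def execute(x, idx, n, p, me, schedule):
    """Executor: fetch the scheduled elements once, then run the loop."""
    block = -(-n // p)
    lo = me * block
    local = x[lo:lo + block]  # elements owned by `me`
    # Simulated receive: one lookup per scheduled element, no duplicates.
    ghost = {g: x[g] for gs in schedule.values() for g in gs}
    out = []
    for g in idx:
        if owner(g, n, p) == me:
            out.append(local[g - lo])
        else:
            out.append(ghost[g])
    return out
```

In this sketch the schedule depends only on `idx`, the distribution, and the process number, so a time-stepping loop that updates `x` but keeps `idx` fixed would call `build_schedule` once and `execute` many times.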
ISBN: 9781538620878 (print)
This paper presents an extension of a library for the Coq interactive theorem prover that enables the development of correct functional parallel programs based on sequential program transformation and automatic parallelization using an algorithmic skeleton named accumulate. Such an algorithmic skeleton is a pattern of a parallel algorithm that is provided as a higher-order function implemented in parallel. The use of this framework is illustrated with the bracket matching problem, including experiments on a parallel machine.
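To give a feel for why bracket matching suits a skeleton-based approach, here is a minimal Python sketch. It is not the paper's Coq development and does not use the accumulate skeleton; it swaps in a plain reduction with an associative operator, handles only a single bracket type, and all names are hypothetical. Each segment of the input is summarized as a pair (unmatched closing brackets, unmatched opening brackets), and because the combining operator is associative, segments can be summarized independently and merged afterwards.

```python
from functools import reduce

def to_pair(ch):
    """Summarize one character as (unmatched closing, unmatched opening)."""
    if ch == "(":
        return (0, 1)
    if ch == ")":
        return (1, 0)
    return (0, 0)

def combine(a, b):
    """Associative merge of the summaries of two adjacent segments."""
    matched = min(a[1], b[0])  # openings of `a` closed by closings of `b`
    return (a[0] + b[0] - matched, a[1] + b[1] - matched)

def summarize(segment):
    """Sequential summary of one segment; (0, 0) is the identity of `combine`."""
    return reduce(combine, map(to_pair, segment), (0, 0))

def balanced(text, chunks=4):
    """Split the text into chunks, summarize each, then merge the partial results."""
    size = max(1, -(-len(text) // chunks))  # ceiling division
    parts = [text[i:i + size] for i in range(0, len(text), size)]
    partials = [summarize(p) for p in parts]  # independent, so parallelizable
    return reduce(combine, partials, (0, 0)) == (0, 0)
```

The list comprehension that builds `partials` runs sequentially here; it marks the step that a parallel implementation would distribute across processors.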
High Performance Fortran is a language designed to support efficient data-parallel programming on a variety of parallel machines. This kind of parallel programming has proven to be very user-friendly, easy to debug, and easy to use. In this programming model, the programmer explicitly specifies the layout of data in a global space, relying on a compiler to generate a parallel program including all the communication. While this frees programmers from the tedium of thinking about local name spaces and message passing, no assistance is provided in determining an efficient data layout scheme on the target *** programming Translator (PPTran) is a compilation system that transforms data-parallel programs written in High Performance Fortran (HPF) with array extensions, parallel loops, and layout directives into parallel programs with explicit message passing using the Parallel Virtual Machine (PVM) library.
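The translation from a global data layout to local name spaces can be made concrete with a short sketch of the two basic HPF distribution formats, BLOCK and CYCLIC. This is not PPTran's implementation: the function names are hypothetical, indices are zero-based for convenience (Fortran arrays are one-based by default), and only one-dimensional arrays are considered. Each function maps a global index to the pair (owning process, local index), which is the bookkeeping a compiler must generate before it can emit explicit messages.

```python
def block_owner(i, n, p):
    """BLOCK: contiguous blocks of size ceil(n / p); returns (process, local index)."""
    block = -(-n // p)  # ceiling division
    return i // block, i % block

def cyclic_owner(i, p):
    """CYCLIC: elements dealt round-robin; returns (process, local index)."""
    return i % p, i // p

def owned(me, n, p, scheme):
    """Global indices owned by process `me` under the given scheme."""
    if scheme == "block":
        return [i for i in range(n) if block_owner(i, n, p)[0] == me]
    return [i for i in range(n) if cyclic_owner(i, p)[0] == me]
```

With 10 elements on 4 processes, BLOCK leaves the last process with a single element while CYCLIC spreads the remainder more evenly, which is one reason the choice of layout affects load balance.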
We propose a novel execution model for the implicitly parallel execution of data-parallel programs in the presence of general I/O operations. This model is called hybrid because it combines the advantages of the two standard execution models, fork/join and SPMD. Based on program analysis, the hybrid model adapts itself to one or the other at the granularity of individual instructions. We outline compilation techniques that systematically derive the organization of parallel code from data flow characteristics, aiming to reduce execution mode switches in general and synchronization and communication requirements in particular. Experiments based on a prototype implementation show the effectiveness of the hybrid execution model for reducing parallel overhead.