Most of the current compiler projects for distributed memory architectures leave the critical and time-consuming problem of finding performance-efficient data distributions and profitable program transformations for a given parallel program almost entirely to the programmer. Performance estimators provide critical performance information to both programmers and parallelizing compilers, the most crucial part of which involves determining the communication overhead induced by a program. In this paper, we present a very practical approach to the problem of compile-time estimation of communication costs for regular codes that includes analytical methods to model the number of messages exchanged, data volume transferred, transfer time, and network contention. In order to achieve high estimation accuracy, our estimator aggressively exploits compiler analysis and optimization information. It is assumed that machine parameters and problem size are known at compile time. We conducted a variety of experiments to validate the estimation accuracy and the ability to support both the programmer and compiler in the effort of performance tuning of parallel programs. We believe that our approach can be automatically applied to a large class of regular codes. (C) 1996 Academic Press, Inc.
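The quantities named in this abstract (number of messages, data volume, transfer time) can be illustrated with a small sketch. It is not the estimator described in the paper: it assumes a simple linear cost model (a fixed startup cost per message plus a per-byte cost), ignores network contention, and uses hypothetical names (`Machine`, `stencil_exchange`, `transfer_time`) and a hypothetical column-block distribution with a nearest-neighbour stencil.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    # Hypothetical machine parameters, assumed known at compile time.
    startup_s: float   # fixed cost per message, in seconds
    per_byte_s: float  # transfer cost per byte, in seconds


def stencil_exchange(n: int, procs: int, elem_bytes: int = 8) -> tuple[int, int]:
    """Messages and bytes sent per sweep by the busiest processor.

    Assumes an n x n array distributed by blocks of columns over `procs`
    processors and a nearest-neighbour stencil, so a processor sends one
    boundary column of n elements to each neighbour it has (at most two).
    """
    neighbours = max(0, min(2, procs - 1))
    messages = neighbours
    volume = messages * n * elem_bytes
    return messages, volume


def transfer_time(messages: int, volume: int, machine: Machine) -> float:
    # Linear cost model: per-message startup plus per-byte transfer time.
    # Network contention is deliberately left out of this sketch.
    return messages * machine.startup_s + volume * machine.per_byte_s
```

For example, `transfer_time(*stencil_exchange(1024, 4), Machine(1e-4, 1e-8))` combines the two steps for a 1024 x 1024 array on four processors; the machine constants here are placeholders, not measured values.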
Highly parallel scalable multiprocessing systems (HMPs) are powerful tools for solving large-scale scientific and engineering problems. However, these machines are difficult to program since algorithms must exploit locality in order to achieve high performance. Vienna Fortran was the first fully specified data-parallel language for HMPs that provided features for the specification of data distribution and alignment at a high level of abstraction. In this paper we outline the major elements of Vienna Fortran and compare it to High Performance Fortran (HPF), a de facto standard in this area. A significant weakness of HPF is its lack of support for many advanced applications that require irregular data distributions and dynamic load balancing. We introduce HPF+, an extension of HPF based on Vienna Fortran that provides the required functionality.
Distributed-memory systems are powerful tools for solving large-scale scientific and engineering problems. However, these machines are difficult to program since the data have to be distributed across the processors and message-passing operations must be inserted for communicating non-local data. In this paper, we discuss SUPERB and Vienna Fortran, two related developments with the objective of providing the user with a higher-level programming paradigm while not sacrificing target code performance. The parallelization system SUPERB was developed in the German supercomputer project SUPRENUM from 1985 to 1989. It is based on the Single-Program-Multiple-Data (SPMD) paradigm, allows the use of global addresses, and automatically inserts the necessary communication statements, given a user-supplied data distribution. SUPERB was the first implemented system that translated sequential Fortran 77 into explicitly parallel message-passing Fortran. As a result of the experiences with SUPERB and related research, the language Vienna Fortran was designed within the ESPRIT project GENESIS, in a joint effort of the University of Vienna and ICASE, NASA Langley Research Center. Vienna Fortran is a machine-independent language extension to Fortran, which includes a broad range of features for the high-level support of advanced application development for distributed-memory multiprocessors. It has significantly influenced the development of High Performance Fortran, a first attempt at language standardization in this area.
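The idea of deriving communication from a user-supplied data distribution while the program keeps using global addresses can be sketched as follows. This is an illustration only, not SUPERB's algorithm: it assumes a one-dimensional block distribution, the owner-computes rule for an assignment of the form A(i) = B(i + offset), and hypothetical helper names (`block_bounds`, `owner`, `nonlocal_reads`).

```python
def block_size(n: int, procs: int) -> int:
    # Ceiling division: every processor gets at most this many elements.
    return -(-n // procs)


def block_bounds(n: int, procs: int, p: int) -> range:
    """Global indices owned by processor p under a block distribution."""
    block = block_size(n, procs)
    lo = min(p * block, n)
    hi = min(lo + block, n)
    return range(lo, hi)


def owner(i: int, n: int, procs: int) -> int:
    """Processor that owns global index i."""
    return i // block_size(n, procs)


def nonlocal_reads(n: int, procs: int, p: int, offset: int) -> list[int]:
    """Global indices of B that processor p must receive from other
    processors to execute A(i) = B(i + offset) for the indices i it owns."""
    return [
        i + offset
        for i in block_bounds(n, procs, p)
        if 0 <= i + offset < n and owner(i + offset, n, procs) != p
    ]
```

With 10 elements on 4 processors, `nonlocal_reads(10, 4, 1, 1)` lists the single element that processor 1 needs from its right neighbour; a parallelizer would turn such a list into a send/receive pair.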
High Performance Fortran (HPF) offers an attractive high-level language interface for programming scalable parallel architectures, providing the user with directives for the specification of data distribution and delegating to the compiler the task of generating an explicitly parallel program. Available HPF compilers can handle regular codes quite efficiently, but dramatic performance losses may be encountered for applications which are based on highly irregular, dynamically changing data structures and access patterns. In this paper we introduce the Vienna Fortran Compiler (VFC), a new source-to-source parallelization system for HPF+, an optimized version of HPF, which addresses the requirements of irregular applications. In addition to extended data distribution and work distribution mechanisms, HPF+ provides the user with language features for specifying certain information that decisively influences a program's performance. This comprises data locality assertions, non-local access specifications, and the possibility of reusing runtime-generated communication schedules of irregular loops. Performance measurements of kernels from advanced applications demonstrate that with a high-level data-parallel language such as HPF+ a performance close to that of hand-written message-passing programs can be achieved even for highly irregular codes.
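Reuse of a runtime-generated communication schedule can be pictured with a short sketch in the style of the inspector/executor pattern. The abstract does not describe how VFC implements this, so everything here is an assumption: the function names `inspector` and `executor`, the `fetch` callback standing in for real message passing, and the representation of a schedule as a sorted list of non-local global indices.

```python
from typing import Callable, Sequence


def inspector(index: Sequence[int], lo: int, hi: int) -> list[int]:
    """Scan the indirection array once and record which referenced global
    indices fall outside the locally owned range [lo, hi)."""
    return sorted({g for g in index if not lo <= g < hi})


def executor(
    schedule: Sequence[int],
    index: Sequence[int],
    local: Sequence[float],
    lo: int,
    fetch: Callable[[int], float],
) -> list[float]:
    """Evaluate y[k] = x[index[k]] using a previously built schedule.

    `fetch` is a stand-in for communication: it returns the current value
    of a non-local element. Each scheduled element is fetched once.
    """
    ghost = {g: fetch(g) for g in schedule}
    return [ghost[g] if g in ghost else local[g - lo] for g in index]
```

As long as the indirection array `index` does not change between iterations, the schedule returned by `inspector` can be passed to `executor` repeatedly, so the scan of the access pattern is paid for only once.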
ISBN: 3540593934 (print)
VLSI parallel algorithms are proposed for the solution of fundamental elliptic problems with Laplace operators (the Dirichlet problem for the Poisson equation and the first boundary value problem for the biharmonic equation) on a rectangular $N \times N$ grid. A standard multigrid algorithm is adopted for the Poisson equation, which allows a parallel solution of this problem in $T = O(\log N)$ parallel steps. A special network consisting of $N \times N$ processor elements and $O(N \log N)$ interconnection lines in each direction results in a design with area $A = O(N^2 \log^2 N)$. The resulting $AT^2$ complexity estimate for this Poisson solver is $O(N^2 \log^4 N)$, which improves the best result known until now by a factor of $O(N/\log N)$. This VLSI multigrid Poisson solver is applied to the semidirect method for solving the biharmonic equation. The parallel time of the algorithm is $T = O(\sqrt{N} \log^2 N)$ and the area needed is $A = O(N^3 \log N)$. The total complexity of this VLSI semidirect solver is $AT^2 = O(N^4 \log^5 N)$.
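The two $AT^2$ figures can be checked directly from the stated area and time bounds. The short derivation below reads the biharmonic time bound as $T = O(\sqrt{N}\log^2 N)$, which is the value consistent with the stated area and total complexity; it is a consistency check, not a result taken from the paper beyond the bounds quoted in the abstract.

```latex
% Poisson solver: A = O(N^2 \log^2 N), T = O(\log N)
A T^2 = O\!\left(N^2 \log^2 N\right) \cdot O\!\left(\log^2 N\right)
      = O\!\left(N^2 \log^4 N\right)

% Semidirect biharmonic solver: A = O(N^3 \log N), T = O(\sqrt{N}\,\log^2 N)
A T^2 = O\!\left(N^3 \log N\right) \cdot O\!\left(N \log^4 N\right)
      = O\!\left(N^4 \log^5 N\right)
```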
Using software for large-scale simulations has become an important research method in many disciplines. With increasingly complex simulations, simulation software becomes a valuable asset. Yet, the quality of many si...
Large-scale irregular applications involve data arrays and other data structures that are too large to fit in main memory and hence reside on disks. This paper presents a method for implementing this kind of applicati...
We describe the design of a compilation system, which translates Fortran programs automatically into explicitly parallel programs for a massively parallel architecture. Such a compiler must automatically generate data...
Recently, a standard set of extensions for Fortran 90, called High Performance Fortran (HPF), has been developed which would provide a portable interface to a wide variety of parallel architectures. HPF focuses mainly...