One of the most intellectually demanding steps in compiling for distributed memory parallel machines is determining a suitable data partitioning scheme for a particular program. Most parallelizing compilers for these machines give the user little or no support in this difficult task. We have developed DPART, an automatic data partitioning system for Fortran 77 procedures. This paper describes the partitioning strategies of alignment, distribution, and processor layout in DPART. Finally, we present experimental results for the TRED2, DGEFA, and JACOBI procedures to demonstrate the effectiveness of this system.
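To make the notion of a distribution concrete, here is a minimal sketch of the two classic ways an array dimension can be mapped onto processors, block and cyclic. It is an illustration only and is not taken from DPART; the function names `block_owner`, `cyclic_owner`, and `local_indices` are hypothetical, and 0-based indexing is assumed.

```python
def block_owner(i, n, p):
    """Processor owning global index i when n elements are split into
    contiguous blocks over p processors (hypothetical helper)."""
    block = -(-n // p)  # ceiling of n / p: elements per block
    return i // block


def cyclic_owner(i, p):
    """Processor owning global index i under a round-robin (cyclic) mapping."""
    return i % p


def local_indices(rank, n, p):
    """Global indices held by processor `rank` under the block mapping."""
    block = -(-n // p)
    lo = rank * block
    hi = min(lo + block, n)
    # An empty range results when this processor receives no elements.
    return range(lo, max(lo, hi))
```

A partitioning system has to choose between mappings like these for each array dimension, because the choice determines which references are local and which require communication.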
In compiling applications for distributed memory machines, runtime analysis is required when the data to be communicated cannot be determined at compile-time. One class of applications requiring runtime analysis is block structured codes. These codes employ multiple structured meshes, which may be nested (for multigrid codes) and/or irregularly coupled (called multiblock or irregularly coupled regular mesh problems). In this paper, we present runtime and compile-time analysis for compiling such applications on distributed memory parallel machines in an efficient and machine-independent fashion. We have designed and implemented a runtime library that supports the required runtime analysis. The library is currently implemented on several different systems. We have also developed compiler analysis for determining data access patterns at compile-time and inserting calls to the appropriate runtime routines. Our methods can be used by compilers for HPF-like parallel programming languages in compiling codes in which data distribution, loop bounds, and/or strides are unknown at compile-time. To demonstrate the efficacy of our approach, we have implemented our compiler analysis in the Fortran 90D/HPF compiler developed at Syracuse University. We have experimented with a multiblock Navier-Stokes solver template and a multigrid code. Our experimental results show that our primitives have low runtime communication overheads and that the compiler-parallelized codes perform within 20% of the codes parallelized by manually inserting calls to the runtime library.
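As a rough illustration of the kind of question that must be deferred to runtime when bounds and strides are unknown at compile-time, the sketch below intersects a strided section with the index range a processor owns. It is not the interface of the runtime library described above; the function name `local_section` and its parameters are assumptions made for this example, with inclusive bounds and a positive stride.

```python
def local_section(lo, hi, stride, own_lo, own_hi):
    """Return (first, last, stride) for the elements of the section
    lo, lo+stride, ..., <= hi that fall inside the owned range
    [own_lo, own_hi], or None if there are none (hypothetical helper)."""
    if stride <= 0:
        raise ValueError("stride must be positive")
    start = max(lo, own_lo)
    # Round start up to the next index that belongs to the section.
    steps = -((-(start - lo)) // stride)  # ceiling of (start - lo) / stride
    first = lo + steps * stride
    end = min(hi, own_hi)
    if first > end:
        return None
    # Round end down to the last section index not exceeding it.
    last = first + ((end - first) // stride) * stride
    return (first, last, stride)
```

Each processor can evaluate such an intersection once the actual bounds are known, which is the sort of work a compiler can delegate to inserted runtime calls.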