Search results - Inner Mongolia University Library

REDUCING THE EFFECT OF GLOBAL COMMUNICATION IN GMRES(M) AND CG ON PARALLEL distributed-memory computers

APPLIED NUMERICAL MATHEMATICS, 1995, vol. 18, no. 4, pp. 441-459

Authors: DESTURLER, E; VANDERVORST, HA. UNIV UTRECHT, INST MATH, 3508 TA UTRECHT, NETHERLANDS; SWISS FED INST TECHNOL, ETHZ, IPS, INTERDISCIPLINARY PROJECT CTR SUPERCOMP, ZURICH, SWITZERLAND

In this paper we study possibilities for the reduction of communication overhead introduced by inner products in the iterative solution methods CG and GMRES(m). The performance of these methods on massively parallel distributed memory machines is often limited because of the global communication required for the inner products. We investigate two ways of improvement. One is to assemble the results of a number of inner products collectively. The other is to create situations where communication can be overlapped with computation. The matrix-vector products may also introduce some communication overhead, but for many relevant problems this involves only communication with a few nearby processors that is easily overlapped as well. So this may, but does not necessarily, further degrade the performance of the algorithm.

Keywords: PARALLEL COMPUTING; distributed memory computers; CONJUGATE GRADIENT METHODS; PERFORMANCE; GMRES; MODIFIED GRAM-SCHMIDT

Parallel algorithms for adaptive mesh refinement

SIAM JOURNAL ON SCIENTIFIC COMPUTING, 1997, vol. 18, no. 3, pp. 686-708

Authors: Jones, MT; Plassmann, PE. ARGONNE NATL LAB, DIV MATH & COMP SCI, ARGONNE, IL 60439

Computational methods based on the use of adaptively constructed nonuniform meshes reduce the amount of computation and storage necessary to perform many scientific calculations. The adaptive construction of such nonuniform meshes is an important part of these methods. In this paper, we present a parallel algorithm for adaptive mesh refinement that is suitable for implementation on distributed-memory parallel computers. Experimental results obtained on the Intel DELTA are presented to demonstrate that for scientific computations involving the finite element method, the algorithm exhibits scalable performance and has a small run time in comparison with other aspects of the scientific computations examined. It is also shown that the algorithm has a fast expected running time under the parallel random access machine (PRAM) computation model.

Keywords: adaptive mesh refinement; distributed memory computers; parallel algorithms; sparse matrices; unstructured mesh computation

On runtime parallel scheduling for processor load balancing

IEEE TRANSACTIONS ON PARALLEL AND distributed SYSTEMS, 1997, vol. 8, no. 2, pp. 173-186

Author: Wu, MY. IEEE

Parallel scheduling is a new approach for load balancing. In parallel scheduling, all processors cooperate to schedule work. Parallel scheduling is able to accurately balance the load by using global load information at compile-time or runtime. It provides high-quality load balancing. This paper presents an overview of the parallel scheduling technique. Scheduling algorithms for tree, hypercube, and mesh networks are presented. These algorithms can fully balance the load and maximize locality at runtime. Communication costs are significantly reduced compared to other existing algorithms.

Keywords: distributed memory computers; load balancing; runtime parallel scheduling; scheduling algorithms; trees; hypercubes; meshes

Running finite-difference schemes for 3D diffusion problems on parallel computers with distributed memory

Informatica, 1996, vol. 7, no. 3, pp. 295-310

Authors: Čiegis, Raimondas; Šimkevičius, Juozas; Waśniewski, Jerzy. Inst. of Mathematics and Informatics, 2600 Vilnius, Akademijos 4, Lithuania; Vytautas Magnus University, 3000 Kaunas, Vileikos 8, Lithuania; Danish Comp. Ctr. for Res. and Educ., UNI-C, DTH, DK-2800 Lyngby, Denmark

In this paper we consider the problem of solving 3D diffusion problems on distributed memory computers. We present a parallel algorithm that is suitable when the number of processors is less than or equal to 8. The pipelining method is used to extend the number of processors to 64. A computational grid decomposition method that preserves load balance across the computers is proposed for heterogeneous clusters of workstations. The numerical results for two clusters of workstations are given.

Keywords: distributed memory computers; Finite difference schemes; LOD methods; Parallel algorithms

RUN-TIME LOAD BALANCING SUPPORT FOR A PARALLEL MULTIBLOCK EULER NAVIER-STOKES CODE WITH ADAPTIVE REFINEMENT ON distributed-memory computers

PARALLEL COMPUTING, 1994, vol. 20, no. 8, pp. 1069-1088

Authors: DEKEYSER, J; LUST, K; ROOSE, D. K.U. Leuven, Department of Computer Science, Celestijnenlaan 200A, B-3001 Leuven, Belgium

This paper describes the parallel implementation of algorithms requiring run-time load redistribution with the aid of the parallel programming library LOCO. As a typical application, a 2D finite volume multiblock Euler/Navier-Stokes code with block-wise adaptive mesh refinement is discussed. The LOCO software handles the communication between blocks and the distribution of blocks among the processors, thereby performing automatic load balancing at run-time. The LOCO library is interfaced with both the native NX communication primitives on Intel iPSC hypercubes and the PVM software on workstation clusters. The parallel performance of the code on the Intel iPSC/860 and on a DEC Alpha workstation cluster is discussed. In particular the effects of mesh refinement on the load balance are investigated.

Keywords: COMPUTATIONAL FLUID DYNAMICS; PARALLEL SCIENTIFIC COMPUTING; MESH REFINEMENT; LOAD BALANCING; distributed memory computers

THE DESIGN OF A STANDARD MESSAGE-PASSING INTERFACE FOR distributed-memory CONCURRENT computers

PARALLEL COMPUTING, 1994, vol. 20, no. 4, pp. 657-673

Author: WALKER, DW. Mathematical Sciences Section, Oak Ridge National Laboratory, P.O. Box 2008, Bldg. 6012, Oak Ridge, TN 37831-6367, USA

This paper presents an overview of MPI, a proposed standard message passing interface for MIMD distributed memory concurrent computers. The design of MPI has been a collective effort involving researchers in the United States and Europe from many organizations and institutions. MPI includes point-to-point and collective communication routines, as well as support for process groups, communication contexts, and application topologies. While making use of new ideas where appropriate, the MPI standard is based largely on current practice.

Keywords: MESSAGE PASSING; distributed memory computers; STANDARDS; POINT-TO-POINT COMMUNICATION; COLLECTIVE COMMUNICATION; PROCESS GROUPS; COMMUNICATION CONTEXTS; APPLICATION TOPOLOGIES

Parallel solutions of compressible flows using overlapping and non-overlapping mesh partitioning strategies

PARALLEL COMPUTING, 1996, vol. 22, no. 7, pp. 943-968

Author: Lanteri, S. INRIA, 2004 Route des Lucioles, B.P. 93, 06902 Sophia-Antipolis Cedex, France

Defining a good strategy for the parallelisation of an unstructured mesh based solver is a challenge, particularly when one aims at reaching a high level of performance while maintaining portability of the source code between scalar, vector and parallel machines. In this paper, we present parallel solutions of realistic three-dimensional flows obtained on the Intel Paragon, the Cray T3D and the IBM SP2 MPPs (Massively Parallel Processors). The solver under consideration is a representative subset of an existing industrial code, N3S-MUSCL, which implements a mixed finite element/finite volume formulation on unstructured tetrahedral meshes. The adopted parallelisation strategy combines mesh partitioning techniques and a message-passing programming model. We compare in detail performance results obtained with parallel solution strategies based on overlapping and non-overlapping mesh partitions.

Keywords: Computational Fluid Dynamics; Euler/Navier-Stokes equations; unstructured meshes; mesh partitioning; distributed memory computers

LOAD BALANCING DATA-PARALLEL PROGRAMS ON distributed-memory computers

PARALLEL COMPUTING, 1993, vol. 19, no. 11, pp. 1199-1219

Authors: DEKEYSER, J; ROOSE, D. Computer Science Department, K.U. Leuven, Celestijnenlaan 200A, B-3001 Leuven, Belgium

In this paper a set of programming constructs for the implementation of data parallel algorithms on distributed memory parallel computers is proposed. The load balancing problem for data parallel programs is cast in a special form. Its relation to the general load balancing problem is analyzed. The applicability of these constructs is asserted for a number of grid-oriented numerical applications. A software tool provides run-time support for data parallel programs based on the proposed constructs. While the application, according to the data parallel programming paradigm, partitions the grid, the tool assigns the partitions to the processors, using built-in mapping algorithms. The approach is general enough to accommodate data parallel algorithms with varying communication structure and variable calculation requirements using pseudo-dynamic load balancing strategies.

Keywords: DATA PARALLELISM; LOAD BALANCING; distributed memory computers

AN IMPROVED SPECTRAL BISECTION ALGORITHM AND ITS APPLICATION TO DYNAMIC LOAD BALANCING

PARALLEL COMPUTING, 1995, vol. 21, no. 1, pp. 29-48

Authors: VANDRIESSCHE, R; ROOSE, D. Department of Computer Science, Katholieke Universiteit Leuven, Celestijnenlaan 200 A, B-3001 Leuven, Belgium

The efficient parallel execution of grid-oriented scientific calculations requires a partitioning of the grid that minimises both load imbalance and interprocessor communication. For unstructured static grids, good partitions are obtained with the recursive spectral bisection heuristic, applied to the interdependency graph of the grid. We will describe an alternative spectral bisection algorithm that yields better partitions than the standard algorithm, especially for interdependency graphs with a large variation in the weights of the edges. We will further describe how even in case of dynamically changing grids, grid-oriented problems can be formulated as graph partitioning problems for the purpose of load balancing. We will then partition these dynamically changing grids with the alternative spectral algorithm.

Keywords: distributed memory computers; DYNAMIC LOAD BALANCING; GRAPH PARTITIONING; (RECURSIVE) SPECTRAL BISECTION

IMPLEMENTATION OF A BOUNDARY ELEMENT METHOD ON distributed memory computers

PARALLEL COMPUTING, 1992, vol. 18, no. 12, pp. 1317-1324

Authors: DAOUDI, E; LOBRY, J. FAC POLYTECH MONS, BD DOLEZ 31, B-7000 MONS, BELGIUM

In this paper, we analyse and compare different parallel implementations of the Boundary Element Method on distributed memory computers. We deal with the computation of two-dimensional magnetostatic problems. The resulting linear system will be solved using Householder transformation and Gaussian elimination. Experimental results are obtained on a Meiko Computing Surface with 32 T800 transputers.

Keywords: BOUNDARY ELEMENT METHOD; GAUSSIAN ELIMINATION; HOUSEHOLDER REDUCTION; PARALLEL ALGORITHMS; distributed memory computers
