In this paper, we analyse and compare different parallel implementations of the Boundary Element Method on distributed memory computers. We deal with the computation of two-dimensional magnetostatic problems. The resulting linear system is solved using Householder transformations and Gaussian elimination. Experimental results are obtained on a Meiko Computing Surface with 32 T800 transputers.
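The abstract does not include code, but the kind of dense direct solve it mentions can be illustrated with a minimal sequential sketch of Gaussian elimination with partial pivoting. The small matrix and right-hand side below are placeholders, not the BEM system from the paper.

```c
#include <math.h>
#include <stdio.h>

#define N 3

/* Solve A x = b by Gaussian elimination with partial pivoting.
   Plain sequential sketch, not the paper's parallel BEM solver. */
static void gauss_solve(double a[N][N], double b[N], double x[N])
{
    for (int k = 0; k < N; ++k) {
        /* Partial pivoting: pick the row with the largest |a[i][k]|. */
        int p = k;
        for (int i = k + 1; i < N; ++i)
            if (fabs(a[i][k]) > fabs(a[p][k])) p = i;
        if (p != k) {
            for (int j = 0; j < N; ++j) { double t = a[k][j]; a[k][j] = a[p][j]; a[p][j] = t; }
            double t = b[k]; b[k] = b[p]; b[p] = t;
        }
        /* Eliminate column k below the pivot row. */
        for (int i = k + 1; i < N; ++i) {
            double m = a[i][k] / a[k][k];
            for (int j = k; j < N; ++j) a[i][j] -= m * a[k][j];
            b[i] -= m * b[k];
        }
    }
    /* Back substitution on the resulting upper triangular system. */
    for (int i = N - 1; i >= 0; --i) {
        double s = b[i];
        for (int j = i + 1; j < N; ++j) s -= a[i][j] * x[j];
        x[i] = s / a[i][i];
    }
}

int main(void)
{
    double a[N][N] = {{4, 1, 0}, {1, 4, 1}, {0, 1, 4}}; /* placeholder system */
    double b[N] = {1, 2, 3};
    double x[N];
    gauss_solve(a, b, x);
    for (int i = 0; i < N; ++i) printf("x[%d] = %f\n", i, x[i]);
    return 0;
}
```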
The parallel 'Deutschland-Modell' and its implementation on distributed-memory parallel computers using the message passing library PARMACS 6.0 is described. Performance results on a Cray T3D are given and the problem of dynamic load imbalances is addressed. (C) 1997 Elsevier Science B.V.
In this paper a set of programming constructs for the implementation of data parallel algorithms on distributed-memory parallel computers is proposed. The load balancing problem for data parallel programs is cast in a special form, and its relation to the general load balancing problem is analyzed. The applicability of these constructs is demonstrated for a number of grid-oriented numerical applications. A software tool provides run-time support for data parallel programs based on the proposed constructs. While the application, following the data parallel programming paradigm, partitions the grid, the tool assigns the partitions to the processors using built-in mapping algorithms. The approach is general enough to accommodate data parallel algorithms with varying communication structure and variable calculation requirements using pseudo-dynamic load balancing strategies.
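The mapping step described in this abstract, where a tool assigns application-defined partitions to processors, can be illustrated by a simple greedy heuristic: place each partition, in order of decreasing workload, on the currently least-loaded processor. This is only a hedged sketch of one possible mapping algorithm; the workloads and counts are invented for illustration.

```c
#include <stdio.h>

#define NPART 8
#define NPROC 3

/* Greedy mapping sketch: assign partitions (with given workloads) to the
   least-loaded processor, taking partitions in order of decreasing work.
   All numbers are illustrative. */
int main(void)
{
    double work[NPART] = {9.0, 7.5, 6.0, 5.5, 4.0, 3.0, 2.0, 1.5}; /* sorted descending */
    int owner[NPART];
    double load[NPROC] = {0.0};

    for (int k = 0; k < NPART; ++k) {
        int best = 0;
        for (int p = 1; p < NPROC; ++p)
            if (load[p] < load[best]) best = p;   /* least-loaded processor so far */
        owner[k] = best;
        load[best] += work[k];
    }

    for (int k = 0; k < NPART; ++k)
        printf("partition %d -> processor %d\n", k, owner[k]);
    for (int p = 0; p < NPROC; ++p)
        printf("processor %d load: %.1f\n", p, load[p]);
    return 0;
}
```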
Runtime Incremental Parallel Scheduling (RIPS) is an alternative to the commonly used dynamic scheduling. In this strategy, the system scheduling activity alternates with the underlying computation work. RIPS uses advanced parallel scheduling techniques to achieve low-overhead, high-quality load balancing and to adapt to irregular applications. This paper presents methods for scheduling a single job on a dedicated parallel machine.
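As a hedged, toy illustration of the alternating structure described above (not the actual RIPS implementation), the sketch below interleaves a rebalancing phase with a computation phase on two simulated processors until the job is done; all names and numbers are invented.

```c
#include <stdio.h>

/* Toy sketch: a scheduling phase evens out remaining tasks between two
   "processors", then a computation phase consumes a fixed quantum of work.
   Illustrative only; not the RIPS system. */
int main(void)
{
    int tasks[2] = {20, 4};   /* deliberately imbalanced initial work */
    int round = 0;

    while (tasks[0] + tasks[1] > 0) {
        /* Scheduling phase: redistribute the remaining work evenly. */
        int total = tasks[0] + tasks[1];
        tasks[0] = (total + 1) / 2;
        tasks[1] = total / 2;

        /* Computation phase: each processor works on at most a fixed quantum. */
        for (int p = 0; p < 2; ++p) {
            int done = tasks[p] < 5 ? tasks[p] : 5;
            tasks[p] -= done;
        }
        printf("round %d: remaining = %d + %d\n", ++round, tasks[0], tasks[1]);
    }
    return 0;
}
```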
The internal representation of numerical data and the speed of its manipulation through efficient utilisation of the central processing unit, memory, and communication links are essential to all high-performance scientific computation. Machine parameters, in particular, reveal the accuracy and error bounds of computation, which are required for performance tuning of codes. This paper reports the diagnosis of machine parameters, measurement of the computing power of several workstations, serial and parallel computers, and a component-wise test procedure for distributed memory computers. Hierarchical memory structure is illustrated by block copying and unrolling techniques. Locality of reference for cache reuse of data is amply demonstrated by fast Fourier transform codes. Cache and register blocking results in their optimum utilisation, with a consequent gain in throughput during vector-matrix operations. Implementation of these memory management techniques reduces the cache inefficiency loss, which is known to be proportional to the number of processors. Of the Linux clusters ANUP16, HPC22 and HPC64, it has been found from the measurement of intrinsic parameters and from an application benchmark (a multi-block Euler code test run) that ANUP16 is suitable for problems that exhibit fine-grained parallelism. The delivered performance of ANUP16 is of immense utility for developing high-end PC clusters like HPC64 and customised parallel computers, with the added advantages of speed and a high degree of parallelism.
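The cache- and register-blocking idea mentioned in this abstract can be sketched for a dense matrix-vector product: the inner loops sweep a column block small enough that the corresponding slice of the vector stays in cache, while the running sum stays in a register. The matrix size and block size below are arbitrary placeholders, not values from the paper.

```c
#include <stdio.h>

#define N 256   /* placeholder problem size */
#define B 32    /* placeholder block size; tuned to the cache in practice */

static double a[N][N], x[N], y[N];

/* Blocked matrix-vector product y = A*x: process B columns at a time so the
   matching slice of x is reused from cache across all rows of the block. */
static void matvec_blocked(void)
{
    for (int i = 0; i < N; ++i) y[i] = 0.0;
    for (int jb = 0; jb < N; jb += B)
        for (int i = 0; i < N; ++i) {
            double sum = y[i];              /* register-resident accumulator */
            for (int j = jb; j < jb + B; ++j)
                sum += a[i][j] * x[j];
            y[i] = sum;
        }
}

int main(void)
{
    for (int i = 0; i < N; ++i) {
        x[i] = 1.0;
        for (int j = 0; j < N; ++j) a[i][j] = (i == j) ? 2.0 : 0.0;
    }
    matvec_blocked();
    printf("y[0] = %f (expected 2.0)\n", y[0]);
    return 0;
}
```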
This paper describes the parallel implementation of algorithms requiring run-time load redistribution with the aid of the parallel programming library LOCO. As a typical application, a 2D finite volume multiblock Euler/Navier-Stokes code with block-wise adaptive mesh refinement is discussed. The LOCO software handles the communication between blocks and the distribution of blocks among the processors, thereby performing automatic load balancing at run-time. The LOCO library is interfaced with both the native NX communication primitives on Intel iPSC hypercubes and the PVM software on workstation clusters. The parallel performance of the code on the Intel iPSC/860 and on a DEC Alpha workstation cluster is discussed. In particular, the effects of mesh refinement on the load balance are investigated.
In this paper we study possibilities for the reduction of communication overhead introduced by inner products in the iterative solution methods CG and GMRES(m). The performance of these methods on massively parallel distributed-memory machines is often limited by the global communication required for the inner products. We investigate two ways of improvement. One is to assemble the results of a number of inner products collectively. The other is to create situations where communication can be overlapped with computation. The matrix-vector products may also introduce some communication overhead, but for many relevant problems this involves only communication with a few nearby processors, which is easily overlapped as well; so it may, but does not necessarily, further degrade the performance of the algorithm.
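One of the two improvements described above, overlapping the global reduction of an inner product with independent local computation, can be sketched with the MPI-3 non-blocking MPI_Iallreduce: the reduction is started early, local work proceeds, and the result is awaited only where it is needed. This is a hedged, minimal illustration, not the authors' CG/GMRES code.

```c
#include <mpi.h>
#include <stdio.h>

#define N 1000

/* Sketch: start a non-blocking global sum for a local dot product, overlap it
   with unrelated local work, then wait for the result only when it is needed. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double u[N], v[N], w[N];
    for (int i = 0; i < N; ++i) { u[i] = 1.0; v[i] = 2.0; w[i] = 0.0; }

    /* Local part of the inner product (u, v). */
    double local = 0.0, global = 0.0;
    for (int i = 0; i < N; ++i) local += u[i] * v[i];

    /* Start the global reduction without blocking. */
    MPI_Request req;
    MPI_Iallreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD, &req);

    /* Overlap: independent local work, e.g. part of a vector update. */
    for (int i = 0; i < N; ++i) w[i] = 0.5 * u[i] + v[i];

    /* The inner product is needed only now. */
    MPI_Wait(&req, MPI_STATUS_IGNORE);
    if (rank == 0) printf("global inner product = %f\n", global);

    MPI_Finalize();
    return 0;
}
```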
This paper presents an overview of MPI, a proposed standard message passing interface for MIMD distributed-memory concurrent computers. The design of MPI has been a collective effort involving researchers in the United States and Europe from many organizations and institutions. MPI includes point-to-point and collective communication routines, as well as support for process groups, communication contexts, and application topologies. While making use of new ideas where appropriate, the MPI standard is based largely on current practice.
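A minimal example of the two classes of routines mentioned above, point-to-point and collective communication, might look like the following; the message contents are arbitrary.

```c
#include <mpi.h>
#include <stdio.h>

/* Minimal MPI sketch: a point-to-point send/receive between ranks 0 and 1,
   followed by a collective sum over all ranks. Run with at least 2 processes. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Point-to-point: rank 0 sends an integer to rank 1. */
    if (rank == 0) {
        int msg = 42;
        MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int msg;
        MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", msg);
    }

    /* Collective: every rank contributes its rank number; rank 0 gets the sum. */
    int sum = 0;
    MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("sum of ranks 0..%d = %d\n", size - 1, sum);

    MPI_Finalize();
    return 0;
}
```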
A directive-based parallelization tool called the Scalable Modeling System (SMS) is described. The user inserts directives in the form of comments into existing Fortran code. SMS translates the code and directives into a parallel version that runs efficiently on shared and distributed-memory high-performance computing platforms including the SGI Origin, IBM SP2, Cray T3E, Sun, and Alpha and Intel clusters. Twenty directives are available to support operations including array re-declarations, inter-process communications, loop translations, and parallel I/O operations. SMS also provides tools to support incremental parallelization and debugging that significantly reduce code parallelization time from months to weeks of effort. SMS is intended for applications using regular structured grids that are solved using finite difference approximation or spectral methods. It has been used to parallelize 10 atmospheric and oceanic models, but the tool is sufficiently general that it can be applied to other structured-grid codes. Recent performance comparisons demonstrate that the Eta, Hybrid Coordinate Ocean, and Regional Ocean Modeling System models, parallelized using SMS, perform as well as or better than their OpenMP or Message Passing Interface counterparts. (C) 2003 Elsevier B.V. All rights reserved.
In this paper we consider the problem of solving 3D diffusion problems on distributed memory computers. We present a parallel algorithm that is suitable for up to 8 processors. A pipelining method is used to extend the algorithm to as many as 64 processors. A computational grid decomposition method is proposed for heterogeneous clusters of workstations that preserves load balance across the machines. Numerical results for two clusters of workstations are given.
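The heterogeneous decomposition idea above can be sketched as follows: grid layers along one axis are distributed in proportion to each workstation's measured relative speed, so faster machines receive proportionally larger sub-grids. The speeds and grid size below are invented for illustration; this is not the decomposition code from the paper.

```c
#include <stdio.h>

#define NPROC 4
#define NZ 100   /* placeholder number of grid layers along the split axis */

/* Sketch: split NZ grid layers among NPROC heterogeneous workstations in
   proportion to their relative speeds, handing rounding leftovers to the fastest. */
int main(void)
{
    double speed[NPROC] = {1.0, 1.0, 2.0, 4.0};   /* illustrative relative speeds */
    int layers[NPROC];

    double total = 0.0;
    for (int p = 0; p < NPROC; ++p) total += speed[p];

    int assigned = 0, fastest = 0;
    for (int p = 0; p < NPROC; ++p) {
        layers[p] = (int)(NZ * speed[p] / total);
        assigned += layers[p];
        if (speed[p] > speed[fastest]) fastest = p;
    }
    layers[fastest] += NZ - assigned;   /* give rounding leftovers to the fastest */

    for (int p = 0; p < NPROC; ++p)
        printf("workstation %d (speed %.1f): %d layers\n", p, speed[p], layers[p]);
    return 0;
}
```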