Search Results - Inner Mongolia University Library

MCGS: A modified conjugate gradient squared algorithm for nonsymmetric linear systems

Journal of Supercomputing, 1999, Vol. 14, No. 3, pp. 257-280

Authors: Maheswaran, M.; Webb, K. J.; Siegel, H. J. Affiliations: Univ Manitoba, Dept Comp Sci, Winnipeg, MB R3T 2N2, Canada; Purdue Univ, Sch Elect & Comp Engn, Parallel Proc Lab, W Lafayette, IN 47907, USA

The conjugate gradient squared (CGS) algorithm is a Krylov subspace algorithm that can be used to obtain fast solutions for linear systems (Ax=b) with complex, nonsymmetric, very large, and very sparse coefficient matrices (A). Using electromagnetic scattering problems as examples, a study of the performance and scalability of this algorithm on two MIMD machines is presented. A modified CGS (MCGS) algorithm, in which the synchronization overhead is effectively reduced by a factor of two, is proposed in this paper. This is achieved by changing the computation sequence in the CGS algorithm. Both experimental and theoretical analyses are performed to investigate the impact of this modification on the overall execution time. These analyses show that CGS is faster than MCGS for smaller numbers of processors and that MCGS outperforms CGS as the number of processors increases. Based on this observation, a set-of-algorithms approach is proposed, where either CGS or MCGS is selected depending on the dimension of the A matrix (N) and the number of processors (P). The set approach provides an algorithm that is more scalable than either the CGS or MCGS algorithm alone. The experiments performed on a 128-processor mesh Intel Paragon and on a 16-processor IBM SP2 with a multistage network indicate that MCGS is approximately 20% faster than CGS.
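For readers unfamiliar with the baseline method, the following is a minimal pure-Python sketch of the standard (unmodified) CGS iteration on a small dense system. It is not the paper's parallel implementation, and it does not reproduce the MCGS reordering, which the abstract does not spell out. The names `cgs`, `matvec`, and `dot`, along with the tolerance and iteration defaults, are assumptions made for illustration. The comments mark the inner products, each of which would typically require a global reduction on a distributed-memory machine; this is the kind of synchronization point the abstract refers to.

```python
def matvec(A, x):
    """Dense matrix-vector product; A is a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    """Inner product <u, v> with the first argument conjugated.

    In a parallel setting, each call like this usually implies a
    global reduction across processors (a synchronization point).
    """
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


def cgs(A, b, tol=1e-10, max_iter=100):
    """Illustrative CGS solver for A x = b (real or complex entries)."""
    n = len(b)
    x = [0j] * n
    r = list(b)          # residual b - A x for the zero initial guess
    r_hat = list(r)      # fixed "shadow" residual
    p = [0j] * n
    q = [0j] * n
    rho_old = 1.0
    b_norm = abs(dot(b, b)) ** 0.5 or 1.0

    for _ in range(max_iter):
        # Convergence check (an additional reduction in this sketch).
        if abs(dot(r, r)) ** 0.5 / b_norm < tol:
            break

        rho = dot(r_hat, r)                  # inner product 1
        if rho == 0:
            raise ArithmeticError("CGS breakdown: rho is zero")
        beta = rho / rho_old

        u = [ri + beta * qi for ri, qi in zip(r, q)]
        p = [ui + beta * (qi + beta * pi) for ui, qi, pi in zip(u, q, p)]

        v = matvec(A, p)
        sigma = dot(r_hat, v)                # inner product 2
        if sigma == 0:
            raise ArithmeticError("CGS breakdown: sigma is zero")
        alpha = rho / sigma

        q = [ui - alpha * vi for ui, vi in zip(u, v)]
        w = [ui + qi for ui, qi in zip(u, q)]

        x = [xi + alpha * wi for xi, wi in zip(x, w)]
        Aw = matvec(A, w)
        r = [ri - alpha * awi for ri, awi in zip(r, Aw)]
        rho_old = rho

    return x
```

As a usage example, `cgs([[4.0, 1.0], [2.0, 3.0]], [1.0, 2.0])` solves a small nonsymmetric system. A production solver would use sparse storage and preconditioning, neither of which is shown here.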

Keywords: algorithm scalability; conjugate gradient squared; modified conjugate gradient squared; Intel Paragon; IBM SP-2; MIMD; synchronization


Lattice QCD on the IBM scalable POWERParallel systems SP2

1995 ACM/IEEE Supercomputing Conference (SC 95)

Authors: Bernard, C.; DeTar, C.; Gottlieb, S.; Heller, U. M.; Hetrick, J.; Ishizuka, N.; Karkkainen, L.; Lantz, S. R.; Rummukainen, K.; Sugar, R.; Toussaint, D.; Wingate, M. Affiliation: Washington Univ, St. Louis, United States

ISBN: (print) 0897918622

A 512-node IBM Scalable POWERParallel Systems SP2 was installed at the Cornell Theory Center in October 1994. During the past couple of months we have been porting and optimizing code for carrying out lattice QCD calculations. Present performance is far from ideal, however, and optimization efforts are still under way. The rate-limiting step in our code involves a rather generic inversion of a large, sparse system, based on a partial differential equation in a multidimensional space. The insights we have gained so far may be useful in diagnosing performance in a wide class of applications. Copyright 1995 by the Association for Computing Machinery, Inc. (ACM).

Keywords: IBM SP2; massively parallel processing; lattice quantum chromodynamics; MPL; message passing; algorithm optimization; algorithm performance; algorithm scalability; partial differential equations; sparse matrix problems; Cornell Theory Center
