Search results - Inner Mongolia University Library

EFFICIENT TRIDIAGONAL SOLVERS ON MULTICOMPUTERS

IEEE TRANSACTIONS ON COMPUTERS, 1992, vol. 41, no. 3, pp. 286-296

Authors: SUN, XH; ZHANG, H; NI, LM. Affiliations: CLEMSON UNIV, DEPT MATH SCI, CLEMSON, SC 29634; MICHIGAN STATE UNIV, DEPT COMP SCI, ADV COMP SYST LAB, E LANSING, MI 48824

Three parallel algorithms, namely the parallel partition LU (PPT) algorithm, the parallel partition hybrid (PPH) algorithm, and the parallel diagonal dominant (PDD) algorithm, are proposed for solving tridiagonal linear systems on multicomputers. These algorithms are based on the divide-and-conquer parallel computation model. The PPT and PPH algorithms support both pivoting and nonpivoting. The PPT algorithm is good when the number of processors is small; otherwise, the PPH algorithm is better. When the system is diagonal dominant, the PDD algorithm is highly parallel and provides an approximate solution that equals the exact solution to within machine accuracy. Both the computation and the communication complexities of the three algorithms are presented. All three methods proposed in this paper have been implemented on a 64-node nCUBE-1 multicomputer. The analytic results closely match the results measured on the nCUBE-1 machine.

Keywords: COMMUNICATION COMPLEXITY; DIVIDE-AND-CONQUER; LU DECOMPOSITION; MATRIX PARTITIONING; PARALLEL NUMERICAL ALGORITHMS; MULTICOMPUTERS; TRIDIAGONAL SYSTEMS
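For orientation, the kind of system this record addresses can be solved serially with the standard Thomas algorithm (LU elimination without pivoting). The sketch below is that serial baseline only, not the PPT, PPH, or PDD algorithm from the paper; the function name `thomas_solve` and the array layout are assumptions made for illustration.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system serially, without pivoting.

    Hypothetical layout: lower[i], diag[i], upper[i] are the sub-diagonal,
    diagonal, and super-diagonal entries of row i. lower[0] and upper[-1]
    are ignored. Assumes no zero pivot arises, which holds for example
    when the matrix is strictly diagonally dominant.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side

    # Forward elimination: remove the sub-diagonal row by row.
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        pivot = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / pivot if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / pivot

    # Back substitution from the last unknown upwards.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


if __name__ == "__main__":
    # 4x4 system with diagonal 4 and off-diagonals -1.
    print(thomas_solve([0, -1, -1, -1], [4, 4, 4, 4], [-1, -1, -1, 0], [2, 4, 6, 13]))
```

The forward sweep is inherently sequential, because each row depends on the previous one. That dependency is what partition-based parallel methods have to work around.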


Efficient deterministic parallel simulation of 2D semiconductor devices based on WENO-Boltzmann schemes


COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING, 2009, vol. 198, no. 5-8, pp. 693-704

Authors: Mantas, Jose M.; Caceres, Maria J. Affiliations: Univ Granada, Dept Lenguajes & Sistemas Informat, ETS Ing Informat & Telecomunicac, E-18071 Granada, Spain; Univ Granada, Fac Ciencias, Dept Matemat Aplicada, Granada 18002, Spain

A flexible parallel deterministic solver of the Boltzmann-Poisson system for 2D semiconductor device simulation on computer clusters is presented. The simulator is obtained by parallelizing a previously proposed numerical scheme based on high order finite difference weighted essentially non-oscillatory (WENO) schemes. Although the underlying numerical scheme has important advantages over direct simulation Monte Carlo methods, it places very high demands on computing power. For this reason, the different calculation phases of the numerical scheme have been parallelized. The data subdomain that accounts for most of the computational workload has been suitably distributed among the processors, and several parallel design decisions have been taken in order to achieve good performance. Moreover, the resulting parallel application can be easily adjusted to simulate a wide range of devices and could be easily used by engineers without a mathematical background in the underlying numerical scheme. The parallel algorithm has been implemented in C++ augmented with calls to MPI functions and functions of optimized linear algebra libraries. Several experiments have been performed by simulating particular MOSFET and DG-MOSFET devices on an SMP cluster in order to show its efficiency. (C) 2008 Elsevier B.V. All rights reserved.

Keywords: Semiconductor simulation; parallel numerical algorithms; Finite difference weighted essentially non-oscillatory schemes; High performance cluster computing


Numerical solution of the expanding stellar atmosphere problem


JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS, 1999, vol. 109, no. 1-2, pp. 41-63

Authors: Hauschildt, PH; Baron, E. Affiliations: Univ Georgia, Dept Phys & Astron, Athens, GA 30602, USA; Univ Georgia, Ctr Simulat Phys, Athens, GA 30602, USA; Univ Oklahoma, Dept Phys & Astron, Norman, OK 73019, USA

In this paper we discuss numerical methods and algorithms for the solution of NLTE stellar atmosphere problems involving expanding atmospheres, such as those found in novae, supernovae and stellar winds. We show how a scheme of nested iterations can be used to reduce the high dimension of the problem to a number of problems with smaller dimensions. As examples of these sub-problems, we discuss the numerical solution of the radiative transfer equation for relativistically expanding media with spherical symmetry, the solution of the multi-level NLTE statistical equilibrium problem for extremely large model atoms, and our temperature correction procedure. Although modern iteration schemes are very efficient, parallel algorithms are essential in making large-scale calculations feasible; we therefore discuss some parallelization schemes that we have developed. (C) 1999 Elsevier Science B.V. All rights reserved.

Keywords: stellar atmospheres; radiative transfer; parallel numerical algorithms; NLTE


Globally convergent iterative numerical schemes for nonlinear variational image smoothing and segmentation on a multiprocessor machine


IEEE TRANSACTIONS ON IMAGE PROCESSING, 2001, vol. 10, no. 6, pp. 852-864

Authors: Heers, J; Schnörr, C; Stiehl, HS. Affiliations: LaVision GmbH, Gottingen, Germany; Univ Mannheim, CVGPR Grp, Mannheim, Germany; Univ Hamburg, Cognit Syst Grp, Hamburg, Germany

We investigate several iterative numerical schemes for nonlinear variational image smoothing and segmentation implemented in parallel. A general iterative framework subsuming these schemes is suggested, for which global convergence irrespective of the starting point can be shown. We characterize various edge-preserving regularization methods from the recent image processing literature involving auxiliary variables as special cases of this general framework. As a by-product, global convergence can be proven under conditions slightly weaker than those stated in the literature. Efficient Krylov subspace solvers for the linear parts of these schemes have been implemented on a multi-processor machine. The performance of these parallel implementations has been assessed, and empirical results concerning convergence rates and speed-up factors are reported.

Keywords: adaptive smoothing; auxiliary variables; images and PDEs; nonlinear regularization; parallel numerical algorithms; variational segmentation


COMMENTS ON SCHEDULING PARALLEL ITERATIVE METHODS ON MULTIPROCESSOR SYSTEMS-II


PARALLEL COMPUTING, 1989, vol. 11, no. 2, pp. 241-244

Author: EISENSTAT, SC. Affiliation: Department of Computer Science, Yale University, P.O. Box 2158, New Haven, CT 06520, U.S.A.

We present a new parallel implementation of the Gauss-Seidel iteration for solving systems of linear equations, improving the results presented in two recent papers.

Keywords: parallel iterative methods; parallel numerical algorithms; task graph


COMMENTS ON SCHEDULING PARALLEL ITERATIVE METHODS ON MULTIPROCESSOR SYSTEMS


PARALLEL COMPUTING, 1988, vol. 7, no. 2, pp. 253-255

Authors: ROBERT, Y; TRYSTRAM, D. Affiliation: CNRS, Laboratoire TIM3, INPG, 38031 Grenoble Cedex, France

In this note we improve results presented in the paper: N.M. Missirlis, Scheduling parallel iterative methods on multiprocessor systems, Parallel Computing 5 (1987) 295–302.

Keywords: parallel iterative methods; parallel numerical algorithms; task graph


Computational forces in the Linpack benchmark


JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2008, vol. 68, no. 9, pp. 1283-1290

Author: Numrich, Robert W. Affiliation: Univ Minnesota, Minnesota Supercomp Inst, Minneapolis, MN 55455, USA

Dimensional analysis reduces a complicated ten-parameter formula for the execution time of the Linpack benchmark to a simpler two-parameter formula. These two parameters are ratios of software forces and hardware forces that determine a self-similarity surface. Machines move along paths on this surface as the problem size and the number of processors change. Two machines scale the same way, that is, they move along the same path, if they have the same hardware forces. To design efficient algorithms, the programmer must produce software forces large enough to overcome the hardware forces. Modern machines have larger hardware forces than older machines and are harder to program. (C) 2008 Elsevier Inc. All rights reserved.

Keywords: scalability; dimensional analysis; performance analysis; computational intensity; computational force; parallel numerical algorithms; Linpack benchmark


Tests and tolerances for high-performance software-implemented fault detection


IEEE TRANSACTIONS ON COMPUTERS, 2003, vol. 52, no. 5, pp. 579-591

Authors: Turmon, M; Granat, R; Katz, DS; Lou, JZ. Affiliations: Jet Prop Lab, Data Understanding Syst Grp, Pasadena, CA 91109, USA; Jet Prop Lab, Parallel Applicat Technol Grp, Pasadena, CA 91109, USA

We describe and test a software approach to fault detection in common numerical algorithms. Such result checking or algorithm-based fault tolerance (ABFT) methods may be used, for example, to overcome single-event upsets in computational hardware or to detect errors in complex, high-efficiency implementations of the algorithms. Following earlier work, we use checksum methods to validate results returned by a numerical subroutine operating subject to unpredictable errors in data. We consider common matrix and Fourier algorithms which return results satisfying a necessary condition having a linear form; the checksum tests compliance with this condition. We discuss the theory and practice of setting numerical tolerances to separate errors caused by a fault from those inherent in finite-precision floating-point calculations. We concentrate on comprehensively defining and evaluating tests having various accuracy/computational burden tradeoffs, and we emphasize average-case algorithm behavior rather than using worst-case upper bounds on error.

Keywords: algorithm-based fault tolerance; result checking; error analysis; aerospace; parallel numerical algorithms
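The checksum idea in this abstract can be illustrated on a matrix-vector product: if y = A x, then the sum of the entries of y must equal the column sums of A applied to x, up to rounding. The sketch below is a minimal illustration of that linear condition. The function names and the tolerance rule (a `safety` factor times machine epsilon times the data magnitude) are assumptions chosen for the example and are not the tolerances derived in the paper.

```python
import sys


def matvec(a, x):
    """Plain matrix-vector product for a list-of-rows matrix."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def checksum_ok(a, x, y, safety=8.0):
    """Return True if y passes the linear checksum test for y = A x.

    The test compares sum(y) with (column sums of A) . x. The tolerance
    is an illustrative rounding allowance, not a rigorously derived bound.
    """
    n = len(x)
    column_sums = [sum(row[j] for row in a) for j in range(n)]
    expected = sum(s * x_j for s, x_j in zip(column_sums, x))
    observed = sum(y)
    # Scale the allowance by the magnitude of the data involved.
    scale = sum(abs(a_ij) * abs(x_j) for row in a for a_ij, x_j in zip(row, x))
    tolerance = safety * (len(a) + n) * sys.float_info.epsilon * scale
    return abs(observed - expected) <= tolerance


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    x = [0.5, -1.5]
    y = matvec(a, x)
    print(checksum_ok(a, x, y))   # result computed without a fault
    y[0] += 1e-3                  # simulate a corrupted entry
    print(checksum_ok(a, x, y))
```

The check costs one extra pass over A and x, which is cheap relative to recomputing the product. A tolerance that is too tight flags ordinary rounding as a fault, and one that is too loose misses small corruptions, which is the tradeoff the abstract refers to.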


SCHEDULING PARALLEL ITERATIVE METHODS ON MULTIPROCESSOR SYSTEMS


PARALLEL COMPUTING, 1987, vol. 5, no. 3, pp. 295-302

Author: MISSIRLIS, NM. Affiliation: Department of Mathematics, University of Athens, Athens, Greece

The paper describes the implementation of the Successive Overrelaxation (SOR) method on an asynchronous multiprocessor computer for solving large linear systems. The parallel algorithm is derived by dividing the serial SOR method into noninterfering tasks, which are then combined with an optimal schedule for a feasible number of processors. The important features of the algorithm are that it (i) achieves a speedup Sp ≈ O(N/3) and an efficiency Ep ≈ 2/3 using p = [N/2] processors, where N is the number of equations; (ii) contains a high level of inherent parallelism, while the convergence theory of the parallel SOR method is the same as that of its sequential counterpart; and (iii) may be modified to use block methods in order to minimise the overhead due to communication and synchronisation of the processors.

Keywords: parallel numerical algorithms; parallel iterative methods; parallel SOR method; large sparse linear systems
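As a reference point for this record, the serial SOR iteration that the paper parallelizes can be sketched as follows. This is the textbook sequential method on a dense list-of-rows matrix, not the task decomposition or schedule described in the paper. The function name `sor_solve`, its parameters, and the stopping rule are assumptions made for the example. Setting `omega=1.0` gives the Gauss-Seidel iteration.

```python
def sor_solve(a, b, omega=1.0, tol=1e-10, max_sweeps=10_000):
    """Serial SOR for a x = b; returns (solution, number of sweeps).

    Assumes nonzero diagonal entries. Convergence is guaranteed, for
    example, when a is symmetric positive definite and 0 < omega < 2.
    """
    n = len(b)
    x = [0.0] * n
    for sweep in range(1, max_sweeps + 1):
        largest_change = 0.0
        for i in range(n):
            # Uses already-updated x[j] for j < i, as in Gauss-Seidel.
            sigma = sum(a[i][j] * x[j] for j in range(n) if j != i)
            gauss_seidel_value = (b[i] - sigma) / a[i][i]
            # Relax between the old value and the Gauss-Seidel value.
            new_value = (1.0 - omega) * x[i] + omega * gauss_seidel_value
            largest_change = max(largest_change, abs(new_value - x[i]))
            x[i] = new_value
        if largest_change < tol:
            return x, sweep
    raise RuntimeError("SOR did not converge within max_sweeps")


if __name__ == "__main__":
    a = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
    b = [2.0, 4.0, 10.0]
    solution, sweeps = sor_solve(a, b, omega=1.1)
    print(solution, sweeps)
```

Each update of `x[i]` reads values written earlier in the same sweep, so the inner loop cannot simply be split across processors. Dividing the work into noninterfering tasks, as the abstract describes, is one way to expose parallelism despite that dependency.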


RESTRUCTURING SIMPLE FOR THE CHIP ARCHITECTURE


PARALLEL COMPUTING, 1986, vol. 3, no. 4, pp. 305-326

Authors: GANNON, D; PANETTA, J. Affiliation: PURDUE UNIV, DEPT COMP SCI, W LAFAYETTE, IN 47907

The SIMPLE program is a commonly used benchmark for testing new architectures designed for high speed scientific computation. As the name implies, the code is a simple example of a Lagrangian hydrodynamics application. In this paper we describe the SIMPLE benchmark in detail and discuss the way in which parallelism can be used to speed up execution. The focus of the work is a mapping of the algorithms to a configurable highly parallel (CHiP) computer being designed at the University of Washington.

Keywords: parallel numerical algorithms; Lagrangian hydrodynamics; parallelism; architecture; configurable networks; performance analysis; SIMPLE; CHiP
