In solving application problems, many large-scale nonlinear systems of equations result in sparse Jacobian matrices. Such nonlinear systems are called sparse nonlinear systems. The irregularity of the locations of the nonzero elements of a general sparse matrix makes it very difficult to map sparse matrix computations onto multiprocessors in a well-balanced manner for parallel processing. To overcome this difficulty, we define a new storage scheme for general sparse matrices in this paper. With the new storage scheme, we develop parallel algorithms to solve large-scale general sparse systems of equations by interval Newton/generalized bisection methods, which reliably find all numerical solutions within a given domain. In Section 1, we provide an introduction to the addressed problem and the interval Newton methods. In Section 2, some currently used storage schemes for sparse systems are reviewed. In Section 3, new index schemes to store general sparse matrices are reported. In Section 4, we present a parallel algorithm to evaluate a general sparse Jacobian matrix. In Section 5, we present a parallel algorithm to solve the corresponding interval linear system by the all-row preconditioned scheme. Conclusions and future work are discussed in Section 6.
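The abstract does not describe the new index scheme itself; as background for the storage schemes reviewed in Section 2, the following minimal sketch (Python, with illustrative function names) shows the widely used compressed sparse row (CSR) layout and a sparse matrix-vector product over it, the kind of kernel that a parallel Jacobian evaluation must balance across processors.

import numpy as np

def dense_to_csr(a):
    """Convert a dense matrix to CSR arrays (values, column indices, row pointers)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in a:
        for j, v in enumerate(row):
            if v != 0.0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return np.array(values), np.array(col_idx), np.array(row_ptr)

def csr_matvec(values, col_idx, row_ptr, x):
    """Sparse matrix-vector product y = A @ x using the CSR layout."""
    n = len(row_ptr) - 1
    y = np.zeros(n)
    for i in range(n):
        start, end = row_ptr[i], row_ptr[i + 1]
        y[i] = np.dot(values[start:end], x[col_idx[start:end]])
    return y

# Example: a small sparse Jacobian-like matrix
A = np.array([[4.0, 0.0, 1.0],
              [0.0, 3.0, 0.0],
              [2.0, 0.0, 5.0]])
vals, cols, ptrs = dense_to_csr(A)
print(csr_matvec(vals, cols, ptrs, np.array([1.0, 1.0, 1.0])))  # -> [5. 3. 7.]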
Distance transform is extensively used in image processing, for operations such as expanding, shrinking, thinning, and computing shape factors. There are many approximate Euclidean distance transform (EDT) algorithms in the literature, but computing the exact EDT with respect to the Euclidean distance metric is preferable in various application fields. Unless the digital image is very small, it is rather time consuming to find the exact EDT of an image, so it is important to improve the computing speed. In this paper we study the parallel computation of the exact EDT on two parallel architectures: the EREW PRAM model and the SIMD hypercube computer. A parallel algorithm is given that computes the exact EDT of all pixels with respect to the black pixels in an N × N black-and-white image. The running time is O(log² N) on both the EREW PRAM model and the hypercube computer with N × N processors.
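For reference, the brute-force serial sketch below (illustrative only) defines the quantity being computed: the exact Euclidean distance from every pixel to the nearest black pixel. It runs in O(N^4) time for an N × N image and stands in for neither the EREW PRAM nor the hypercube algorithm of the paper.

import numpy as np

def exact_edt(image):
    """Brute-force exact Euclidean distance transform of a binary image.

    For every pixel, returns the Euclidean distance to the nearest black
    pixel (value 1). This only defines the quantity the parallel algorithm
    computes, not the fast method itself.
    """
    rows, cols = image.shape
    black = np.argwhere(image == 1)          # coordinates of black pixels
    dist = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            d2 = (black[:, 0] - i) ** 2 + (black[:, 1] - j) ** 2
            dist[i, j] = np.sqrt(d2.min())
    return dist

img = np.zeros((4, 4), dtype=int)
img[1, 1] = 1                                # single black pixel
print(exact_edt(img).round(2))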
Many numerical simulations of natural phenomena are formulated as large tridiagonal and block tridiagonal linear systems. In this paper, an efficient parallel algorithm to solve a tridiagonal linear system is proposed. The algorithm, named the bi-recurrence algorithm, has an inherent parallelism that is suitable for parallel processing. Its time complexity is 8N - 4 for a tridiagonal linear system of order N, only slightly higher than that of the Gaussian elimination algorithm. For a parallel implementation with two processors, the time complexity is 4N - 1. Based on the bi-recurrence algorithm, a VLSI-oriented tridiagonal solver is designed with a 1-D linear systolic array architecture of three processing cells. The systolic tridiagonal solver finds the solution of a tridiagonal linear system in 3N + 6 units of time. A highly parallel systolic tridiagonal solver is also presented. The solver is characterized by high parallel computability, which originates in the divide-and-conquer strategy, and high cost performance, which originates in the systolic architecture. This solver finds the solution in 10(N/p) + 6p + 23 time units, where p is the number of partitions of the system.
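The bi-recurrence algorithm itself is not given in the abstract; the sketch below shows the standard serial Thomas algorithm (Gaussian elimination specialized to tridiagonal systems), the sequential baseline the abstract compares against. Function and variable names are illustrative.

import numpy as np

def thomas_solve(a, b, c, d):
    """Serial Thomas algorithm for a tridiagonal system.

    a: sub-diagonal (length n, a[0] unused)
    b: main diagonal (length n)
    c: super-diagonal (length n, c[-1] unused)
    d: right-hand side (length n)
    """
    n = len(d)
    cp, dp = np.empty(n), np.empty(n)
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):                     # forward elimination sweep
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):            # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# 4x4 example: main diagonal 2, off-diagonals -1
a = np.array([0.0, -1.0, -1.0, -1.0])
b = np.array([2.0, 2.0, 2.0, 2.0])
c = np.array([-1.0, -1.0, -1.0, 0.0])
d = np.array([1.0, 0.0, 0.0, 1.0])
print(thomas_solve(a, b, c, d))               # -> [1. 1. 1. 1.]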
This paper presents a parallel move generation scheme for a chess machine system aimed at reducing the number of move-generation cycles. The parallel system is composed of five move-generation modules that share the move-generation cycles to reduce the time needed to build a game tree. Simulation results show that the proposed parallel move-generation architecture takes about half the number of move-generation cycles to build a game tree identical to the one built by a sequential move-generation module.
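As a toy illustration of sharing move generation across parallel modules (this is not the paper's hardware design; the piece type, board representation, and round-robin split are assumptions), the sketch below distributes the squares to be processed over five generator "modules" and merges their pseudo-legal knight moves.

from itertools import chain

KNIGHT_OFFSETS = [(1, 2), (2, 1), (2, -1), (1, -2),
                  (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

def knight_moves(square):
    """Pseudo-legal knight moves from a (file, rank) square on an 8x8 board."""
    f, r = square
    return [(f + df, r + dr) for df, dr in KNIGHT_OFFSETS
            if 0 <= f + df < 8 and 0 <= r + dr < 8]

def parallel_move_generation(knight_squares, n_modules=5):
    """Round-robin the pieces over n_modules generator 'modules'; each module
    produces moves only for its share, and the results are merged."""
    shares = [knight_squares[m::n_modules] for m in range(n_modules)]
    per_module = [list(chain.from_iterable(knight_moves(sq) for sq in share))
                  for share in shares]
    return list(chain.from_iterable(per_module))

print(len(parallel_move_generation([(1, 0), (6, 0), (3, 3)])))   # -> 3 + 3 + 8 = 14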
Simulated annealing is a general-purpose optimization technique capable of finding an optimal or near-optimal solution in various applications. However, the long execution time required for a good-quality solution has been a major drawback in practice. Extensive studies have been carried out to develop parallel algorithms for simulated annealing. Most of them were not very successful, mainly because multiple processing elements (PEs) were required to follow a single Markov chain and, therefore, only limited parallelism was exploited. In this paper, we propose new parallel simulated annealing algorithms that allow multiple Markov chains to be traced simultaneously by PEs that may communicate with each other. We have considered both synchronous and asynchronous implementations of the algorithms. Their performance has been analyzed in detail and also verified by extensive experimental results. It has been shown that for graph partitioning the proposed parallel simulated annealing schemes can find a solution of equivalent (or even better) quality up to an order of magnitude faster than the conventional parallel schemes. Among the proposed schemes, the one where PEs exchange information dynamically (not with a fixed period) performs best.
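A minimal serial sketch of the multiple-Markov-chain idea follows; the toy continuous objective, the fixed exchange period, and the "restart all chains from the best state" rule are illustrative assumptions, not the paper's graph-partitioning scheme or its dynamic exchange criterion.

import math
import random

def anneal_multi_chain(objective, init_states, n_sweeps=200, exchange_every=20,
                       t0=1.0, alpha=0.95, step=0.5, seed=0):
    """Toy sketch of multiple-Markov-chain simulated annealing.

    Each chain runs independent Metropolis moves; every `exchange_every`
    sweeps all chains restart from the best state found so far
    (a simple stand-in for inter-PE communication).
    """
    rng = random.Random(seed)
    states = list(init_states)
    energies = [objective(s) for s in states]
    best_state, best_energy = min(zip(states, energies), key=lambda p: p[1])
    t = t0
    for sweep in range(1, n_sweeps + 1):
        for k in range(len(states)):
            cand = states[k] + rng.uniform(-step, step)        # propose a move
            de = objective(cand) - energies[k]
            if de < 0 or rng.random() < math.exp(-de / t):     # Metropolis test
                states[k], energies[k] = cand, energies[k] + de
                if energies[k] < best_energy:
                    best_state, best_energy = states[k], energies[k]
        if sweep % exchange_every == 0:                        # periodic exchange
            states = [best_state] * len(states)
            energies = [best_energy] * len(states)
        t *= alpha                                             # cool down
    return best_state, best_energy

f = lambda x: (x - 3.0) ** 2 + 0.5 * math.sin(5 * x)           # toy objective
print(anneal_multi_chain(f, init_states=[-5.0, 0.0, 5.0, 10.0]))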
A parallel version of preconditioned iterative techniques is developed for matrices arising from the panel boundary element method for three-dimensional simply connected domains with Dirichlet boundary conditions. Results obtained on an nCube-2 parallel computer show that preconditioned iterative methods are also very well suited to implementation on an MIMD computer in three-dimensional cases and that they are much more efficient than the usual direct solution techniques.
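The abstract does not name the specific preconditioner or iteration used; purely as an illustration of the structure of a preconditioned iterative solve (as opposed to a direct factorization), here is a minimal diagonally preconditioned (Jacobi-type) iteration on a small, strictly diagonally dominant system.

import numpy as np

def jacobi_preconditioned_iteration(a, b, tol=1e-10, max_iter=500):
    """Simplest diagonally preconditioned iteration x <- x + D^{-1}(b - A x).

    Converges for strictly diagonally dominant systems; shown only to
    illustrate a preconditioned iterative solve versus a direct one.
    """
    d_inv = 1.0 / np.diag(a)            # diagonal (Jacobi) preconditioner
    x = np.zeros_like(b)
    for k in range(max_iter):
        r = b - a @ x                   # residual
        if np.linalg.norm(r) < tol:
            return x, k
        x = x + d_inv * r               # preconditioned update
    return x, max_iter

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 5.0, 2.0],
              [0.0, 2.0, 6.0]])
b = np.array([5.0, 8.0, 8.0])
x, iters = jacobi_preconditioned_iteration(A, b)
print(x, iters)                         # x approx. [1, 1, 1]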
Discrete-time optimal control (DTOC) problems are potentially large, highly structured optimization problems. In such a problem there are N + 1 stages, or time steps; corresponding to time step i are a vector of control variables u(i) and a vector of state variables x(i). When N is large relative to the dimension of each u(i) and x(i), efficient methods for unconstrained DTOC problems have long been known. We propose a parallel method for unconstrained DTOC problems via stagewise decomposition: given J > 1 processors and breakpoints I(1) = 1 < I(2) < ... < I(J+1) = N + 1, J subproblems are formed by grouping stages 1 to I(2) - 1, I(2) to I(3) - 1, and so on. The subproblems are independent except for the separator variables x(I(2)), ..., x(I(J)), which are treated specially to retain the quadratic local convergence typical of serial DTOC algorithms. Computational examples using the Intel Paragon OSF/1 computer are given.
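A small sketch of the stagewise grouping follows (illustrative code, not from the paper): given the breakpoints I(1) = 1 < I(2) < ... < I(J+1) = N + 1, it lists the stages assigned to each of the J subproblems and the separator states x(I(2)), ..., x(I(J)).

def stagewise_partition(n_stages, breakpoints):
    """Group DTOC stages into subproblems by the breakpoints
    I(1) = 1 < I(2) < ... < I(J+1) = N + 1 (1-based stage indices).

    Returns the list of stage ranges per subproblem and the indices of the
    separator state variables x(I(2)), ..., x(I(J)).
    """
    assert breakpoints[0] == 1 and breakpoints[-1] == n_stages + 1
    subproblems = [list(range(breakpoints[j], breakpoints[j + 1]))
                   for j in range(len(breakpoints) - 1)]
    separators = breakpoints[1:-1]
    return subproblems, separators

# N = 9 stages split over J = 3 processors
subs, seps = stagewise_partition(9, [1, 4, 7, 10])
print(subs)   # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(seps)   # separator states: x(4), x(7)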
A unified parallel algorithm for grid adaptation by local refinement/coarsening is presented. It is designed to be independent of the grid type. This is achieved by employing a generic data template that can be configured to capture the data structures of any computational grid regardless of structure and dimensionality. Furthermore, the algorithm itself is specified in terms of generic parallel primitives that are completely independent of the underlying parallel architecture. The unified parallel algorithm is employed for dynamic adaptation of three-dimensional unstructured tetrahedral grids on a partitioned-memory, multiple-instruction multiple-data architecture. Performance results are presented for the Intel iPSC/860.
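The paper's generic data template is not specified in the abstract; the following sketch (illustrative field names and refinement rule) indicates the kind of grid-independent element record such a template might capture: node connectivity, refinement level, and parent/child links, which are enough to express local refinement and coarsening for any element type.

from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class GridElement:
    """Illustrative generic element record, configurable for any element type."""
    element_id: int
    node_ids: Tuple[int, ...]          # e.g. 3 nodes for a triangle, 4 for a tetrahedron
    level: int = 0                     # refinement level
    parent: int = -1                   # -1 marks a root element of the initial grid
    children: List[int] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children

def refine(grid: List[GridElement], elem_id: int, new_node_sets) -> None:
    """Mark an element as refined and append its children to the grid."""
    parent = grid[elem_id]
    for nodes in new_node_sets:
        child = GridElement(element_id=len(grid), node_ids=tuple(nodes),
                            level=parent.level + 1, parent=elem_id)
        parent.children.append(child.element_id)
        grid.append(child)

# A single tetrahedron refined into two illustrative children
grid = [GridElement(0, (0, 1, 2, 3))]
refine(grid, 0, [(0, 1, 2, 4), (1, 2, 3, 4)])
print([e.element_id for e in grid if e.is_leaf()])   # -> [1, 2]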
Permutation generation is an important problem in combinatorial computing. In this paper we present an optimal parallel algorithm to generate all N! permutations of N objects. The algorithm is designed to be executed on a very simple computation model: a linear array of N identical processors. Because of the simplicity and regularity of the processors, the model is very suitable for VLSI implementation. Another advantage of this design is that it generates all the permutations in minimal change order.
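The abstract does not reproduce the linear-array algorithm; as a point of reference, the sketch below implements the classic serial Steinhaus-Johnson-Trotter generator, which produces all N! permutations in minimal change order (each permutation differs from its predecessor by one adjacent transposition).

def sjt_permutations(n):
    """Serial Steinhaus-Johnson-Trotter generator of all permutations of 1..n
    in minimal change order. Shown as the classic sequential counterpart,
    not the paper's linear-array algorithm.
    """
    perm = list(range(1, n + 1))
    direction = [-1] * n                       # -1: looking left, +1: looking right
    yield tuple(perm)
    while True:
        # find the largest "mobile" element (one pointing at a smaller neighbor)
        mobile = -1
        for i in range(n):
            j = i + direction[i]
            if 0 <= j < n and perm[i] > perm[j] and (mobile == -1 or perm[i] > perm[mobile]):
                mobile = i
        if mobile == -1:
            return
        j = mobile + direction[mobile]
        perm[mobile], perm[j] = perm[j], perm[mobile]
        direction[mobile], direction[j] = direction[j], direction[mobile]
        # reverse the direction of every element larger than the one just moved
        for i in range(n):
            if perm[i] > perm[j]:
                direction[i] = -direction[i]
        yield tuple(perm)

print(list(sjt_permutations(3)))
# [(1, 2, 3), (1, 3, 2), (3, 1, 2), (3, 2, 1), (2, 3, 1), (2, 1, 3)]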
In this paper, we improve the bounds for computing a network decomposition distributively and deterministically. Our algorithm computes an (n^ε(n), n^ε(n))-decomposition in n^O(ε(n)) time, where ε(n) = 1/√(log n). As a corollary we obtain improved deterministic bounds for distributively computing several graph structures such as maximal independent sets and Δ-vertex colorings. We also show that the class of graphs G whose maximum degree is n^O(δ(n)), where δ(n) = 1/log log n, is complete for the task of computing a near-optimal decomposition, i.e., a (log n, log n)-decomposition, in polylog(n) time. This is a corollary of a more general characterization, which pinpoints the weak points of existing network decomposition algorithms. Completeness is meant in the following sense: if we have an algorithm A that computes a near-optimal decomposition in polylog(n) time for graphs in G, then we can compute a near-optimal decomposition in polylog(n) time for all graphs. (C) 1996 Academic Press, Inc.