Search results - Inner Mongolia University Library

5th International Conference on Numerical Methods and Application

Author: Lirkov, I (Bulgarian Acad Sci, Cent Lab Parallel Proc, BU-1113 Sofia, Bulgaria)

ISBN: (print) 3540006087

The numerical solution of 3D linear elasticity equations is considered. The problem is described by a coupled system of second-order elliptic partial differential equations. This system is discretized by trilinear parallelepipedal finite elements. The Preconditioned Conjugate Gradient iterative method is used to solve the large-scale linear algebraic systems arising from the Finite Element Method (FEM) discretization of the problem. As a first step, the displacement decomposition technique is applied to construct a preconditioner from the decoupled block-diagonal part of the original matrix. Circulant block-factorization is then used to precondition the resulting block-diagonal matrix. Both preconditioning techniques, displacement decomposition and circulant block-factorization, are highly parallelizable. A parallel algorithm is developed for the proposed preconditioner. The theoretical analysis of the execution time shows that the algorithm is highly efficient for coarse-grain parallel computer systems. A portable parallel FEM code based on MPI is developed. Numerical tests for real-life engineering problems in computational geomechanics are performed on a number of modern parallel computers: Cray T3E, Sunfire 6800, and a Beowulf cluster. The reported speed-up and parallel efficiency illustrate well the parallel features of the proposed method and its implementation.

Keywords: parallel algorithms; PCG method; preconditioner; circulant matrix; elasticity problem
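The Preconditioned Conjugate Gradient iteration named in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the author's code: a simple diagonal (Jacobi) preconditioner stands in for the displacement-decomposition and circulant block-factorization preconditioners, the 3x3 matrix is made up, and the names `pcg`, `matvec`, and `jacobi` are hypothetical.

```python
def pcg(matvec, apply_prec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A by preconditioned CG."""
    x = [0.0] * len(b)
    r = list(b)                      # residual r = b - A x, with x = 0
    z = apply_prec(r)                # preconditioned residual z = M^{-1} r
    p = list(z)                      # first search direction
    rz = sum(ri * zi for ri, zi in zip(r, z))
    for _ in range(max_iter):
        if sum(ri * ri for ri in r) ** 0.5 < tol:
            break                    # residual norm is small enough
        Ap = matvec(p)
        alpha = rz / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = apply_prec(r)
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Small symmetric positive definite system (assumed for illustration only).
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]


def matvec(v):
    # Dense matrix-vector product A v.
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in A]


def jacobi(r):
    # Stand-in preconditioner: divide each entry by the diagonal of A.
    return [ri / A[i][i] for i, ri in enumerate(r)]


x = pcg(matvec, jacobi, b)
residual = [bi - yi for bi, yi in zip(b, matvec(x))]
```

In a parallel setting, the matrix-vector product and the preconditioner application are the steps that would be distributed, which is why the abstract stresses that both preconditioning techniques parallelize well.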

Benchmarking explicit state parallel model checkers

PDMC 2003, Parallel and Distributed Model Checking (Satellite Workshop of CAV '03)

Authors: Jones, Mike; Mercer, Eric G.; Bao, Tonglaga; Kumar, Rahul; Lamborn, Peter (Verification and Validation Laboratory, Department of Computer Science, Brigham Young University, Provo, United States)

This paper presents a set of benchmarks and metrics for performance reporting in explicit state parallel model checking algorithms. The benchmarks are selected for controllability, and the metrics are chosen to measure speedup and communication overhead. The benchmarks and metrics are used to compare two parallel model checking algorithms: partition and random walk. Implementations of the partition algorithm using synchronous and asynchronous communication are used. Metrics are reported for each benchmark and algorithm for up to 128 workstations using a network of dynamically loaded workstations. Empirical results show that load balancing becomes an issue for more than 32 workstations in the partition algorithm and that random walk is a reasonable, low-overhead approach for finding errors in large models. The synchronous implementation is consistently faster than the asynchronous. The benchmarks, metrics and results given here are intended to be a starting point for a larger discussion of performance reporting in parallel explicit state model checking. © 2003 Published by Elsevier Science B.V.

Keywords: parallel algorithms

A randomized linear-work EREW PRAM algorithm to find a minimum spanning forest

ALGORITHMICA, 2003, Vol. 35, No. 3, pp. 257-268

Authors: Poon, CK; Ramachandran, V (City Univ Hong Kong, Dept Comp Sci, Kowloon, Hong Kong, Peoples R China; Univ Texas, Dept Comp Sci, Austin, TX 78712, USA)

We present a randomized EREW PRAM algorithm to find a minimum spanning forest in a weighted undirected graph. On an n-vertex graph the algorithm runs in o((log n)^(1+epsilon)) expected time for any epsilon > 0 and performs linear expected work. This is the first linear-work, polylog-time algorithm on the EREW PRAM for this problem. This also gives parallel algorithms that perform expected linear work on two general-purpose models of parallel computation: the QSM and the BSP.

Keywords: parallel algorithms; minimum spanning tree; EREW PRAM; design of algorithms; randomized algorithms

A parallel iterative decoding algorithm for zero-tail and tail-biting convolutional codes

Proceedings 2003 IEEE International Symposium on Information Theory (ISIT)

Authors: Matsushima, Toshiyasu; Matsushima, Tomoko K.; Hirasawa, Shigeichi (Waseda University, Shinjuku, Tokyo, Japan; Polytechnic University, Sagamihara, Japan)

A parallel propagation algorithm was applied to the decoding of convolutional codes. The performance of the algorithm was demonstrated by a numerical method similar to the density evaluation.

Keywords: parallel algorithms

Parallel model checking for LTL, CTL*, and L2μ

PDMC 2003, Parallel and Distributed Model Checking (Satellite Workshop of CAV '03)

Authors: Leucker, Martin; Somla, Rafal; Weber, Michael (IT Department, Uppsala University, Uppsala, Sweden; Lehrstuhl für Informatik II, RWTH Aachen, Aachen, Germany)

We describe a parallel model-checking algorithm for the fragment of the μ-calculus that allows one alternation of minimal and maximal fixed-point operators. This fragment is also known as L2μ. Since LTL and CTL* can be encoded in this fragment, we obtain parallel model checking algorithms for practically important temporal logics. Our solution is based on a characterization of this problem in terms of two player games. We exhibit the structure of their game graphs and show that we can iteratively work with game graphs that have the same special structure as the ones obtained for L1μ-formulae. Since good parallel algorithms for colouring game-graphs for L1μ-formulae exist, it is straightforward to implement this algorithm in parallel and good run-time results can be expected. © 2003 Published by Elsevier Science B.V.

Keywords: parallel algorithms

10 million unknowns: Is it that big?

IEEE ANTENNAS AND PROPAGATION MAGAZINE, 2003, Vol. 45, No. 2, pp. 43-58

Authors: Velamparambil, S; Chew, WC; Song, JM (Univ Illinois, Dept Elect & Comp Engn, Urbana, IL 61801, USA; Ansoft Corp, Boulder, CO 80303, USA; Iowa State Univ, Dept Elect & Comp Engn, Ames, IA 50011, USA)

At the Center for Computational Electromagnetics at the University of Illinois, we recently solved a very-large-scale electromagnetic scattering problem. We computed the bistatic radar cross-section of a full-size aircraft at 8 GHz, involving the solution of a dense matrix equation with nearly 10.2 million unknowns. We regarded this as the "ultimate test" of a massively parallel implementation of the Multilevel Fast Multipole Algorithm (MLFMA), called ScaleME. In this paper, we narrate the technical difficulties faced and the experience gained from a very informal point of view. We shall describe the various methods developed for surmounting each of the obstacles.

Keywords: electromagnetic scattering; radar cross sections; Fast Multipole Method; MLFMA; parallel algorithms; message passing; integral equations; matrix decomposition; matrix inversion

Efficient implementation of reduce-scatter in MPI

JOURNAL OF SYSTEMS ARCHITECTURE, 2003, Vol. 49, No. 3, pp. 89-108

Authors: Bernaschi, M; Iannello, G; Lauria, M (CNR, Ist Applicaz Calcolo, I-00161 Rome, Italy; Univ Naples Federico II, Dipartimento Informat & Sistemist, I-80125 Naples, Italy; Ohio State Univ, Dept Comp & Informat Sci, Columbus, OH 43210, USA)

We discuss the efficient implementation of a collective operation called reduce-scatter, which is defined in the MPI standard. The reduce-scatter is equivalent to the combination of a reduction on vectors of length n with a scatter of the resulting n-vector to all processors. We describe the implementation issues and the performance characterization of two recently proposed algorithms for the reduce-scatter that have been proven to be highly efficient in theory under the assumption of a fully connected parallel system. A performance comparison with existing mainstream implementations of the operation is presented, which confirms the practical advantage of the new algorithms. Experiments show that the two algorithms have different characteristics which make them complementary in providing a performance gain over standard algorithms. Our study has been carried out on two different platforms: an SP2 and a Myrinet-interconnected cluster of Pentium PRO machines. However, most of the results reported here are not specific to either MPI or the platforms used, and they hold in general for any message passing programming system. (C) 2003 Elsevier B.V. All rights reserved.

Keywords: parallel algorithms; collective communication primitives; performance characterization; MPI
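The definition given in this abstract (a reduction on length-n vectors followed by a scatter of the result) can be illustrated with a sequential simulation. This is a sketch of the semantics only, not either of the two algorithms the paper evaluates. The function name `reduce_scatter` and the near-even block split are assumptions made here; in MPI the caller specifies the block sizes.

```python
def reduce_scatter(send_buffers, op=lambda a, b: a + b):
    """Simulate reduce-scatter over p simulated processes.

    send_buffers[i] is the length-n vector contributed by process i.
    Entry i of the returned list is the block of the reduced vector
    that process i would receive.
    """
    p = len(send_buffers)
    n = len(send_buffers[0])
    # Step 1: element-wise reduction of the p input vectors.
    reduced = list(send_buffers[0])
    for vec in send_buffers[1:]:
        reduced = [op(a, b) for a, b in zip(reduced, vec)]
    # Step 2: scatter contiguous blocks; the first n % p ranks get one extra element.
    base, extra = divmod(n, p)
    blocks, start = [], 0
    for rank in range(p):
        size = base + (1 if rank < extra else 0)
        blocks.append(reduced[start:start + size])
        start += size
    return blocks


# Two simulated processes, each contributing a vector of length 4.
result = reduce_scatter([[1, 2, 3, 4], [10, 20, 30, 40]])
```

Efficient implementations avoid materializing the full reduced vector on any single process, which is the cost this naive two-step version pays.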

Developing SPMD applications with load balancing

PARALLEL COMPUTING, 2003, Vol. 29, No. 6, pp. 743-766

Authors: Plastino, A; Ribeiro, CC; Rodriguez, N (Pontificia Univ Catolica Rio de Janeiro, Dept Comp Sci, BR-22453900 Rio De Janeiro, Brazil; Univ Fed Fluminense, Dept Comp Sci, BR-24210240 Niteroi, RJ, Brazil)

The central contribution of this work is SAMBA (Single Application, Multiple Load Balancing), a framework for the development of parallel SPMD (single program, multiple data) applications with load balancing. This framework models the structure and the characteristics common to different SPMD applications and supports their development. SAMBA also contains a library of load balancing algorithms. This environment allows the developer to focus on the specific problem at hand. Special emphasis is given to the identification of appropriate load balancing strategies for each application. Three different case studies were used to validate the functionality of the framework: matrix multiplication, numerical integration, and a genetic algorithm. These applications illustrate its ease of use and the relevance of load balancing. They were chosen for the different load imbalance factors they present and for their different task creation mechanisms. The computational experiments reported for these case studies made it possible to validate SAMBA and to compare, without additional reprogramming costs, different load balancing strategies for each of them. The numerical results and the elapsed-time measurements show the importance of using an appropriate load balancing algorithm and the associated reductions that can be achieved in elapsed time. They also illustrate that the most suitable load balancing strategy may vary with the type of application and with the number of available processors. Besides supporting the development of SPMD applications, the load balancing facilities offered by SAMBA also play an important role in the development of efficient parallel implementations. (C) 2003 Elsevier Science B.V. All rights reserved.

Keywords: load balancing; SPMD; frameworks; data parallelism; parallel algorithms
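The abstract's point that the choice of load balancing strategy affects elapsed time can be seen in a toy comparison. The sketch below is not taken from SAMBA or its library. It contrasts two generic strategies (a static contiguous split and a dynamic "next free worker" assignment) on made-up task costs, and the names `static_makespan` and `dynamic_makespan` are hypothetical.

```python
import heapq


def static_makespan(costs, p):
    """Split the task list into p contiguous blocks of near-equal task count."""
    base, extra = divmod(len(costs), p)
    loads, start = [], 0
    for rank in range(p):
        size = base + (1 if rank < extra else 0)
        loads.append(sum(costs[start:start + size]))
        start += size
    return max(loads)          # elapsed time is set by the busiest worker


def dynamic_makespan(costs, p):
    """Hand each task, in order, to the worker that becomes free first."""
    finish = [0] * p           # min-heap of worker finish times
    heapq.heapify(finish)
    for c in costs:
        t = heapq.heappop(finish)
        heapq.heappush(finish, t + c)
    return max(finish)


# Unevenly sized tasks (assumed values): the expensive ones come first.
costs = [8, 7, 6, 5, 1, 1, 1, 1]
static_time = static_makespan(costs, 2)
dynamic_time = dynamic_makespan(costs, 2)
```

With these costs the static split leaves one worker with almost all the work, while the dynamic assignment reaches the ideal even division. With uniform costs the two strategies would tie, consistent with the abstract's remark that the best strategy depends on the application.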

Computation of AB² multiplication in GF(2^m) using low-complexity systolic architecture

IEE PROCEEDINGS-CIRCUITS DEVICES AND SYSTEMS, 2003, Vol. 150, No. 2, pp. 119-123

Authors: Kim, NY; Kim, HS; Yoo, KY (Kyungpook Natl Univ, Dept Comp Engn, Puk Gu, Taegu 702701, South Korea; Kyungil Univ, Dept Comp Engn, Kyungsan, Kyungsangbukdo, South Korea)

An AB² operation is known as an efficient basic operation for public key cryptosystems over GF(2^m), and various systolic arrays for performing AB² operations have already been proposed using a standard basis representation. However, these circuits have certain shortcomings for cryptographic application due to their high circuit complexity and long latency. Therefore, further research on an efficient AB² multiplication circuit is still needed. Accordingly, the authors present a new AB² algorithm and its systolic realisations in GF(2^m). First, a new algorithm is proposed based on the MSB-first scheme using a standard basis representation. Thereafter, bit-parallel and bit-serial systolic power multipliers are derived that exhibit a lower hardware complexity and smaller latency than conventional approaches. In addition, since the proposed architectures incorporate simplicity, regularity, modularity, and pipelinability, they are well suited to VLSI implementation and can be easily applied as a basic architecture for computing an inverse/division operation and in crypto-processor chip design.

Keywords: digital arithmetic; systolic arrays; parallel algorithms; public key cryptography; VLSI; digital signal processing chips; multiplying circuits; computational complexity; pipeline processing; Galois fields; AB² multiplication; low-complexity sys; systolic algorithms; circuit complexity; complexity classes; very large scale integration
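For readers unfamiliar with the operation, the following software model computes AB² in GF(2^m) with a standard (polynomial) basis, scanning the multiplier from its most significant bit. It is a sketch of the arithmetic only: it is not the authors' combined algorithm or a systolic design, it simply chains two ordinary multiplications, and the field GF(2^4) with irreducible polynomial x^4 + x + 1 is an assumption chosen for the example. The names `gf_mul` and `ab2` are hypothetical.

```python
def gf_mul(a, b, poly, m):
    """Multiply a * b in GF(2^m); field elements are m-bit integers.

    Bits of b are consumed MSB-first (Horner's rule): the accumulator is
    multiplied by x, reduced modulo poly, then a is added if the bit is set.
    """
    result = 0
    for i in range(m - 1, -1, -1):
        result <<= 1                   # multiply the accumulator by x
        if (result >> m) & 1:          # degree reached m: reduce modulo poly
            result ^= poly
        if (b >> i) & 1:
            result ^= a                # addition in GF(2^m) is XOR
    return result


def ab2(a, b, poly, m):
    """Compute a * b^2 as a squaring followed by a multiplication."""
    return gf_mul(a, gf_mul(b, b, poly, m), poly, m)


M = 4
POLY = 0b10011                         # x^4 + x + 1 (assumed example field)

x_cubed = ab2(0b0010, 0b0010, POLY, M)   # x * x^2
example = ab2(0b0011, 0b0111, POLY, M)   # (x + 1) * (x^2 + x + 1)^2
```

Working the second case by hand: (x^2 + x + 1)^2 = x^4 + x^2 + 1, which reduces to x^2 + x, and (x + 1)(x^2 + x) = x^3 + x.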

A parallel algorithm for approximate string matching

Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications

Authors: Kaplan, Kathleen; Burge III, Legand L.; Garuba, Moses (Howard University, 2300 Sixth St. NW, Washington, DC 20059, United States)

ISBN: (print) 1892512416

This paper solves the NP problem of DNA string matching using heuristics and parallelism. The current methods for approximate matching are merely different versions of dynamic programming. Dynamic programming is O(n^2), and does not consider one of the most important areas in technology: parallelism. The proposed algorithm uses parallelism to solve approximate matching. It has a best-case time complexity of O(n), and has better performance in practice than dynamic programming.

Keywords: parallel algorithms
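The O(n^2) dynamic-programming baseline that this abstract compares against can be sketched as follows. This is the classic edit-distance recurrence, shown only to make the quadratic cost concrete. It is not the parallel algorithm the paper proposes, and the abstract does not say which dynamic-programming variant the authors benchmarked, so the choice of Levenshtein distance is an assumption.

```python
def edit_distance(s, t):
    """Levenshtein distance using a table of (len(s) + 1) x (len(t) + 1) cells.

    Only the previous row is kept in memory, but every cell is still visited,
    so the running time is O(len(s) * len(t)).
    """
    prev = list(range(len(t) + 1))     # distances from the empty prefix of s
    for i, cs in enumerate(s, start=1):
        curr = [i]                     # deleting the first i characters of s
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


dna_distance = edit_distance("ACGT", "AGT")
```

Each cell depends on its left, upper, and upper-left neighbours, so cells on the same anti-diagonal are independent of one another.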
