Search results - Inner Mongolia University Library

Efficient implementation of reduce-scatter in MPI

JOURNAL OF SYSTEMS ARCHITECTURE, 2003, Vol. 49, No. 3, pp. 89-108

Authors: Bernaschi, M Iannello, G Lauria, M CNR Ist Applicaz Calcolo I-00161 Rome Italy Univ Naples Federico II Dipartimento Informat & Sistemist I-80125 Naples Italy Ohio State Univ Dept Comp & Informat Sci Columbus OH 43210 USA

We discuss the efficient implementation of a collective operation called reduce-scatter, which is defined in the MPI standard. The reduce-scatter is equivalent to the combination of a reduction on vectors of length n with a scatter of the resulting n-vector to all processors. We describe the implementation issues and the performance characterization of two recently proposed algorithms for the reduce-scatter that have been proven to be highly efficient in theory under the assumption of a fully connected parallel system. A performance comparison with existing mainstream implementations of the operation is presented, which confirms the practical advantage of the new algorithms. Experiments show that the two algorithms have different characteristics, which make them complementary in providing a performance gain over standard algorithms. Our study has been carried out on two different platforms: an SP2 and a Myrinet-interconnected cluster of Pentium PRO machines. However, most of the results reported here are not specific to either MPI or the platforms used, and they hold in general for any message passing programming system. (C) 2003 Elsevier B.V. All rights reserved.

Keywords: parallel algorithms collective communication primitives performance characterization MPI
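
The semantics described in the abstract above (an element-wise reduction over length-n vectors followed by a scatter of the result) can be illustrated with a small sequential simulation. This is only a sketch of what the operation computes, not either of the algorithms evaluated in the paper; the function name `reduce_scatter` and the near-even block split are assumptions made for illustration (MPI lets the caller specify the block sizes).

```python
from functools import reduce
from operator import add


def reduce_scatter(vectors, op=add):
    """Simulate reduce-scatter on one machine (hypothetical helper).

    vectors[r] is the length-n input vector held by process r.
    Returns a list whose entry r is the block of the reduced vector
    that process r would receive.
    """
    p = len(vectors)
    n = len(vectors[0])
    if any(len(v) != n for v in vectors):
        raise ValueError("all input vectors must have the same length")

    # Reduction step: combine the p input vectors element by element.
    reduced = [reduce(op, column) for column in zip(*vectors)]

    # Scatter step: split the reduced vector into p contiguous blocks.
    # Assumption: blocks are as even as possible, earlier ranks get the extras.
    base, extra = divmod(n, p)
    blocks, start = [], 0
    for rank in range(p):
        size = base + (1 if rank < extra else 0)
        blocks.append(reduced[start:start + size])
        start += size
    return blocks


if __name__ == "__main__":
    # Two simulated processes, vectors of length 4, sum reduction.
    print(reduce_scatter([[1, 2, 3, 4], [10, 20, 30, 40]]))
```

A real implementation would exchange partial blocks between processes instead of materializing the full reduced vector in one place; that communication schedule is the design question the paper studies.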

Developing SPMD applications with load balancing

PARALLEL COMPUTING, 2003, Vol. 29, No. 6, pp. 743-766

Authors: Plastino, A Ribeiro, CC Rodriguez, N Pontificia Univ Catolica Rio de Janeiro Dept Comp Sci BR-22453900 Rio De Janeiro Brazil Univ Fed Fluminense Dept Comp Sci BR-24210240 Niteroi RJ Brazil

The central contribution of this work is SAMBA (Single Application, Multiple Load Balancing), a framework for the development of parallel SPMD (single program, multiple data) applications with load balancing. This framework models the structure and the characteristics common to different SPMD applications and supports their development. SAMBA also contains a library of load balancing algorithms. This environment allows the developer to focus on the specific problem at hand. Special emphasis is given to the identification of appropriate load balancing strategies for each application. Three different case studies were used to validate the functionality of the framework: matrix multiplication, numerical integration, and a genetic algorithm. These applications illustrate its ease of use and the relevance of load balancing. They were chosen for the different load imbalance factors they present and for their different task creation mechanisms. The computational experiments reported for these case studies made possible the validation of SAMBA and the comparison, without additional reprogramming costs, of different load balancing strategies for each of them. The numerical results and the elapsed-time measurements show the importance of using an appropriate load balancing algorithm and the associated reductions that can be achieved in the elapsed times. They also illustrate that the most suitable load balancing strategy may vary with the type of application and with the number of available processors. Besides supporting the development of SPMD applications, the load balancing facilities offered by SAMBA also play an important role in the development of efficient parallel implementations. (C) 2003 Elsevier Science B.V. All rights reserved.

Keywords: load balancing SPMD frameworks data parallelism parallel algorithms

Computation of AB² multiplication in GF(2^m) using low-complexity systolic architecture

IEE PROCEEDINGS-CIRCUITS DEVICES AND SYSTEMS, 2003, Vol. 150, No. 2, pp. 119-123

Authors: Kim, NY Kim, HS Yoo, KY Kyungpook Natl Univ Dept Comp Engn Puk Gu Taegu 702701 South Korea Kyungil Univ Dept Comp Engn Kyungsan Kyungsangbukdo South Korea

An AB² operation is known as an efficient basic operation for public key cryptosystems over GF(2^m), and various systolic arrays for performing AB² operations have already been proposed using a standard basis representation. However, these circuits have certain shortcomings for cryptographic application due to their high circuit complexity and long latency. Therefore, further research on an efficient AB² multiplication circuit is still needed. Accordingly, the authors present a new AB² algorithm and its systolic realisations in GF(2^m). First, a new algorithm is proposed based on the MSB-first scheme using a standard basis representation. Thereafter, bit-parallel and bit-serial systolic power multipliers are derived that exhibit a lower hardware complexity and smaller latency than conventional approaches. In addition, since the proposed architectures incorporate simplicity, regularity, modularity, and pipelinability, they are well suited to VLSI implementation and can be easily applied as a basic architecture for computing an inverse/division operation and in crypto-processor chip design.

Keywords: digital arithmetic systolic arrays parallel algorithms public key cryptography VLSI digital signal processing chips multiplying circuits computational complexity pipeline processing Galois fields AB² multiplication low-complexity sys Galois fields Systolic algorithms parallel algorithms public key cryptography Digital arithmetic Pipeline processing circuit complexity digital signal processing chips complexity classes Very large scale integration Multiplying circuits
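
To make the AB² operation in the abstract above concrete, the following software sketch computes it over GF(2^m) in a standard (polynomial) basis using a textbook MSB-first shift-and-add multiplication. It is not the authors' combined algorithm or their systolic architecture; the function names `gf_mul_msb_first` and `ab2`, and the encoding of field elements as Python integers whose bits are polynomial coefficients, are assumptions made for illustration.

```python
def gf_mul_msb_first(a, b, m, g):
    """Multiply a and b in GF(2^m), scanning the bits of b MSB first.

    a, b: field elements as integers below 2**m (bit i = coefficient of x^i).
    g:    irreducible polynomial of degree m, including the x^m term.
    """
    p = 0
    for i in range(m - 1, -1, -1):
        p <<= 1                # multiply the partial product by x
        if (p >> m) & 1:       # degree reached m: reduce modulo g
            p ^= g
        if (b >> i) & 1:       # add a when the current bit of b is set
            p ^= a
    return p


def ab2(a, b, m, g):
    """Compute A * B^2 in GF(2^m) as two successive multiplications."""
    b_squared = gf_mul_msb_first(b, b, m, g)
    return gf_mul_msb_first(a, b_squared, m, g)


if __name__ == "__main__":
    # GF(2^4) with g(x) = x^4 + x + 1; A = x, B = x + 1.
    print(bin(ab2(0b0010, 0b0011, 4, 0b10011)))
```

Squaring B first and then multiplying costs two full multiplication passes; a dedicated AB² circuit such as the one the abstract describes aims to reduce that hardware cost and latency.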

Data-parallel polygonization

Authors: Hoel, Erik G. Samet, Hanan Department of Computer Science Inst. of Advanced Computer Studies University of Maryland College Park MD 20742 United States ESRI 380 New York Street Redlands CA 92373-8100 United States

Data-parallel algorithms are presented for polygonizing a collection of line segments represented by a data-parallel bucket PMR quadtree, a data-parallel R-tree, and a data-parallel R+-tree. Such an operation is useful in a geographic information system (GIS). A sample performance comparison of the three data-parallel structures for this operation is also given. © 2003 Elsevier B.V.

Keywords: parallel algorithms

A parallel algorithm for approximate string matching

Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications

Authors: Kaplan, Kathleen Burge III, Legand L. Garuba, Moses Howard University 2300 Sixth St. NW Washington DC 20059 United States

ISBN: (print) 1892512416

This paper solves the NP problem of DNA string matching using heuristics and parallelism. The current methods for approximate matching are merely different versions of dynamic programming. Dynamic programming is O(n^2), and does not consider one of the most important areas in technology: parallelism. The proposed algorithm uses parallelism to solve approximate matching. It has a best-case time complexity of O(n), and has better performance in practice than dynamic programming.

Keywords: parallel algorithms
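
For reference, the dynamic-programming baseline that the abstract above contrasts itself with can be sketched as the classical edit-distance (Levenshtein) recurrence, which fills a table of size proportional to the product of the two string lengths. This is the standard sequential technique, not the parallel heuristic the paper proposes; the function name `edit_distance` and the unit costs for insertion, deletion, and substitution are assumptions made for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b by dynamic programming.

    Runs in O(len(a) * len(b)) time and keeps only two rows of the table.
    """
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ca == cb else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

Each cell depends on its left, upper, and upper-left neighbours, so the row-by-row order shown here is inherently sequential; parallel variants typically have to reorganize this dependency pattern.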

An efficient approximation algorithm for finding a maximum clique using Hopfield network learning

NEURAL COMPUTATION, 2003, Vol. 15, No. 7, pp. 1605-1619

Authors: Wang, RL Tang, Z Cao, QP Toyama Univ Fac Engn Toyama 9308555 Japan Tateyama Syst Inst Toyama 930001 Japan

In this article, we present a solution to the maximum clique problem using a gradient-ascent learning algorithm of the Hopfield neural network. This method provides a near-optimum parallel algorithm for finding a maximum clique. To do this, we use the Hopfield neural network to generate a near-maximum clique and then modify weights in a gradient-ascent direction to allow the network to escape from the state of near-maximum clique to maximum clique or better. The proposed parallel algorithm is tested on two types of random graphs and some benchmark graphs from the Center for Discrete Mathematics and Theoretical Computer Science (DIMACS). The simulation results show that the proposed learning algorithm can find good solutions in reasonable computation time.

Keywords: parallel algorithms FINDINGS theoretical computer science Approximation algorithms Hopfield neural networks learning algorithms computational time Discrete mathematics

Parallel coupled thermomechanical simulation using hybrid domain decomposition

International Conference on Computational Science and its Applications, ICCSA 2003

Authors: Adamidis, Panagiotis A. Resch, Michael M. Allmandring 30 Stuttgart D-70550 Germany

ISBN: (print) 3540401555

This paper describes a new parallel algorithm for solving multiphysics problems. These kinds of problems are very demanding in terms of CPU time and memory space, which are typically not available on a single processor. Using domain decomposition techniques, it is possible to divide the original computational domain into subdomains, which need less memory and may be distributed onto the processors of a parallel computer. In this work, we introduce a hybrid domain decomposition, which uses nonoverlapping as well as overlapping partitions. The solution algorithm is a combination of the alternating Schwarz method and a block Gauss-Seidel scheme. This algorithm has been used to parallelize a finite element program able to calculate coupled thermomechanical problems, using a staggered solution strategy. Tests of the parallel algorithm with a realistic example show a rather good speedup. © Springer-Verlag Berlin Heidelberg 2003.

Keywords: parallel algorithms

An optimal parallel algorithm for c-vertex-ranking of trees

14th International Symposium on Algorithms and Computation, ISAAC 2003

Authors: Kashem, Md. Abul Rahman, M. Ziaur Department of Computer Science and Engineering Bangladesh University of Engineering and Technology Dhaka 1000 Bangladesh

ISBN: (print) 9783540206958

For a positive integer c, a c-vertex-ranking of a graph G = (V,E) is a labeling of the vertices of G with integers such that, for any label i, deletion of all vertices with labels > i leaves connected components, each having at most c vertices with label i. The c-vertex-ranking problem is to find a c-vertex-ranking of a given graph using the minimum number of ranks. In this paper we give an optimal parallel algorithm for solving the c-vertex-ranking problem on trees that takes O(log^2 n) parallel time using a linear number of operations on the EREW PRAM model. © Springer-Verlag Berlin Heidelberg 2003.

Keywords: parallel algorithms
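
The definition of a c-vertex-ranking in the abstract above can be checked directly for a given labeling. The sketch below is a sequential validity checker that follows the definition literally; it does not find an optimal ranking and is unrelated to the paper's PRAM algorithm. The function name `is_c_vertex_ranking` and the adjacency-dictionary graph representation are assumptions made for illustration.

```python
from collections import deque


def is_c_vertex_ranking(adj, labels, c):
    """Return True if `labels` is a c-vertex-ranking of the graph `adj`.

    adj:    dict mapping each vertex to an iterable of its neighbours.
    labels: dict mapping each vertex to its integer label (rank).
    """
    for i in set(labels.values()):
        # Delete every vertex whose label is greater than i.
        keep = {v for v in adj if labels[v] <= i}
        seen = set()
        for source in keep:
            if source in seen:
                continue
            # Breadth-first search over one remaining connected component,
            # counting how many of its vertices carry label i.
            seen.add(source)
            queue = deque([source])
            count = 0
            while queue:
                v = queue.popleft()
                if labels[v] == i:
                    count += 1
                for w in adj[v]:
                    if w in keep and w not in seen:
                        seen.add(w)
                        queue.append(w)
            if count > c:
                return False
    return True


if __name__ == "__main__":
    path = {0: [1], 1: [0, 2], 2: [1]}  # path on three vertices
    print(is_c_vertex_ranking(path, {0: 1, 1: 2, 2: 1}, c=1))
```

On the three-vertex path, ranking the middle vertex higher than the two ends is valid for c = 1, while giving all three vertices the same rank is valid only once c is at least 3.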

An optimal PRAM algorithm for a spanning tree on trapezoid graphs

Journal of Applied Mathematics and Computing, 2003, Vol. 12, No. 1-2, pp. 21-29

Authors: Bera, Debashis Pal, Madhumangal Pal, Tapan K. Dept. Appl. Math. Ocean. Comp. Prog. Vidyasagar University Midnapore-721 102 India

Let G be a graph with n vertices and m edges. The problem of constructing a spanning tree is to find a connected subgraph of G with n vertices and n - 1 edges. In this paper, we propose an O(log n) time parallel algor...

Keywords: Design and analysis of algorithms parallel algorithms Spanning tree Trapezoid graphs

A parallel algorithm for medial axis transformation

International Symposium on Parallel and Distributed Processing and Applications, ISPA 2003

Authors: Saha, Swagata Jana, Prasanta K. Department of Computer Science and Engineering Indian School of Mines Dhanbad 826004 India

ISBN: (print) 9783540376194

In this paper, we present a parallel algorithm for medial axis transformation, which is based on the parallel distance transformation presented in [1]. We first modify the parallel distance transformation as described in [1] and then apply this modified algorithm to develop a parallel algorithm for medial axis transformation. The algorithms are also simulated on two different input images. © Springer-Verlag Berlin Heidelberg 2003.

Keywords: parallel algorithms
