版权所有:内蒙古大学图书馆 技术提供:维普资讯• 智图
内蒙古自治区呼和浩特市赛罕区大学西街235号 邮编: 010021
作者机构:Department of Computer and Information Science New Jersey Institute of Technology Newark NJ USA
出 版 物:《IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS》 (IEEE Trans Parallel Distrib Syst)
年 卷 期:1993年第4卷第12期
页 面:1372-1381页
核心收录:
学科分类:0808[工学-电气工程] 08[工学] 0812[工学-计算机科学与技术(可授工学、理学学位)]
基 金:New Jersey Institute of Technology (SBR-421980)
主 题:CYCLIC SHIFT ODD-EVEN MERGE PARALLEL PREFIX PLUS-OR-MINUS-2(B)-ASCEND-DESCEND PLUS-OR-MINUS-2(B)-DESCEND 2(B) PERMUTATION PM2I INTERCONNECTION SIMD HYPERCUBE
摘 要:There is a wide class of parallel algorithms which require communications in the form of a 2(b) permutation and -2(b) permutation. (A 2(b) permutation of N elements, where N = 2(n) moves the element from position i to position i + 2(b) mod N. The inverse is called a -2(b) permutation.) We first examine the complexity of performing a 2(b) permutation of N elements on an N -PE SIMD hypercube. (The hypercube model assumes that all data movements are restricted to a single dimension at a time.) We prove that, given any mapping of the elements to processors, log N-b steps are needed to perform a 2(b) permutation. This lower bound suggests that if a parallel algorithm requires f(N) communication steps of type +/-2(b), it may require Omega(f(N) log N) steps on a SIMD hypercube. We identify an important class of parallel computations, called +/-2(b) descend, which includes Batcher s odd-even merge and many other algorithms. A computation in this class performs log N iterations on an N -element input A[O : N - 1]. For b = log N - 1,...,0, iteration b computes the new value of each A[i] as a function of A[i] A[i + 2(b)] and A[i - 2(b)]. We obtain an efficient O(log N) algorithm for this class on a SIMD hypercube. We discuss several problems which are solved elegantly using this general algorithm. We also study a related class of parallel computations called +/-2(b)-ascend. A computation in this class has a structure similar to +/-2(b)-descend, except that the iterations run in the increasing order of b. We develop a simple O(log(2) N/log log N) algorithm for this class on a SIMD hypercube, assuming Omega(log log N) space per PE. It remains open whether this class can be implemented on a SIMD hypercube in O(log N) time.