Performing BMMC permutations efficiently on distributed-memory multiprocessors with MPI

Authors: Cormen, TH; Clippinger, JC

Affiliation: Dartmouth College, Department of Computer Science, Hanover, NH 03755, USA

Publication: Algorithmica

Year, volume, issue: 1999, Vol. 24, No. 3-4

Pages: 349-370

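Subject classification: 08 [Engineering]; 0835 [Engineering: Software Engineering]; 0701 [Science: Mathematics]; 0812 [Engineering: Computer Science and Technology (degrees in Engineering or Science may be awarded)]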

Keywords: BMMC permutations; affine transformations; distributed-memory multiprocessors; MPI

Abstract: This paper presents an architecture-independent method for performing BMMC permutations on multiprocessors with distributed memory. All interprocessor communication uses the MPI function *** (). The number of elements and the number of processors must be powers of 2, with at least one element per processor, and there is no inherent upper bound on the ratio of elements per processor. Our method transmits only data, without transmitting any source or target indices, which conserves network bandwidth. When data is transmitted, the source and target processors implicitly agree on each other's identity and on the indices of the elements being transmitted. A C-callable implementation of our method is available from Netlib. The implementation allows preprocessing (which incurs a modest cost) to be factored out for multiple runs of the same permutation, even if on different data. Data may be laid out in any one of several ways: processor-major, processor-minor, or anything in between. Experimental results indicate that our method works well compared with several other candidate methods on three different platforms. In particular, the slower the interconnection network, the greater the relative advantage of our method.
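For readers unfamiliar with the term, a BMMC (bit-matrix-multiply/complement) permutation maps each n-bit source index x to the target index y = Ax XOR c, where A is an n x n bit matrix that is nonsingular over GF(2) and c is an n-bit complement vector. The following is a minimal single-process sketch of that index mapping only. It is not the paper's distributed MPI method, and the names `bmmc_target` and `bmmc_permute`, along with the row-bitmask encoding of A, are assumptions made for illustration.

```python
def bmmc_target(x, rows, c):
    """Return y = A x XOR c over GF(2).

    rows[i] is row i of A packed into an int: bit j of rows[i] is A[i][j].
    Bit 0 is the least significant bit of x, y, and c.
    (Hypothetical encoding chosen for this sketch.)
    """
    y = 0
    for i, row in enumerate(rows):
        # Dot product over GF(2) is the parity of the bitwise AND.
        parity = bin(row & x).count("1") & 1
        y |= parity << i
    return y ^ c


def bmmc_permute(data, rows, c):
    """Move data[x] to position bmmc_target(x, rows, c) for every x.

    len(data) must equal 2 ** len(rows). A singular matrix A does not
    define a permutation, so a collision raises ValueError.
    """
    n = len(data)
    if n != 1 << len(rows):
        raise ValueError("len(data) must be 2 ** len(rows)")
    out = [None] * n
    filled = [False] * n
    for x, value in enumerate(data):
        y = bmmc_target(x, rows, c)
        if filled[y]:
            raise ValueError("matrix is singular over GF(2)")
        out[y] = value
        filled[y] = True
    return out


if __name__ == "__main__":
    # Bit reversal on 3-bit indices: y0 = x2, y1 = x1, y2 = x0, c = 0.
    print(bmmc_permute(list(range(8)), [0b100, 0b010, 0b001], 0))
```

Bit reversal, as in the usage line, is one familiar member of the BMMC class; choosing the identity matrix with a nonzero c instead gives a pure complement permutation.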
