We present efficient algorithms for two all-to-all communication operations in message-passing systems: index (or all-to-all personalized communication) and concatenation (or all-to-all broadcast). We assume a model of a fully connected message-passing system, in which the performance of any point-to-point communication is independent of the sender-receiver pair. We also assume that each processor has k ≥ 1 ports, through which it can send and receive k messages in every communication round. The complexity measures we use are independent of the particular system topology and are based on the communication start-up time and on the communication bandwidth. In the index operation among n processors, each processor initially has n blocks of data, and the goal is to exchange the ith block of processor j with the jth block of processor i. We present a class of index algorithms that is designed for all values of n and that features a trade-off between the communication start-up time and the data transfer time. This class of algorithms includes two special cases: an algorithm that is optimal with respect to the start-up time, and an algorithm that is optimal with respect to the data transfer time. We also present experimental results showing the performance tunability of our index algorithms on the IBM SP-1 parallel system. In the concatenation operation among n processors, each processor initially has one block of data, and the goal is to concatenate the n blocks of data from the n processors and to make the concatenation result known to all the processors. We present a concatenation algorithm that is optimal, for most values of n, in the number of communication rounds and in the amount of data transferred.
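The two operations defined in this abstract can be illustrated with a small single-process simulation. This is only a sketch of the semantics under the k-port model, not the algorithms from the paper: the function names `index_direct` and `concatenate` are hypothetical, and the index schedule is a naive direct exchange in which every processor sends one block per port per round.

```python
def index_direct(blocks, k=1):
    """Simulate the index operation with a naive direct exchange.

    blocks[j][i] is the i-th block initially held by processor j.
    Returns (result, rounds), where result[i][j] == blocks[j][i].
    NOTE: illustrative schedule only, not the paper's algorithm.
    """
    n = len(blocks)
    result = [[None] * n for _ in range(n)]
    for j in range(n):
        result[j][j] = blocks[j][j]  # a processor's own block stays in place

    offsets = list(range(1, n))  # distances to the n - 1 other processors
    rounds = 0
    for start in range(0, len(offsets), k):
        # In one round, each processor uses up to k ports: it sends to
        # (src + d) mod n and receives from (src - d) mod n for k offsets d.
        for d in offsets[start:start + k]:
            for src in range(n):
                dst = (src + d) % n
                result[dst][src] = blocks[src][dst]
        rounds += 1
    return result, rounds


def concatenate(blocks):
    """Simulate the result of concatenation: every processor ends up
    holding all n blocks, in processor order."""
    n = len(blocks)
    return [list(blocks) for _ in range(n)]
```

With this schedule, the simulated exchange among n processors takes one round per group of k offsets; the class of algorithms in the paper trades the number of rounds against the volume of data sent, which this sketch does not attempt to reproduce.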
ISBN: 9781450351232 (print)
XcalableMP (XMP) is a Partitioned Global Address Space (PGAS) language that is defined by the XMP Specification Working Group of the PC Cluster Consortium. This paper describes the implementation and evaluation of the Fiber miniapp suite, which is primarily maintained by RIKEN Advanced Institute for Computational Science, on the basis of the local-view parallelization model using the coarray feature of XMP. In many cases, a coarray-based implementation can be obtained by replacing the original Message Passing Interface (MPI) functions with coarray assignment statements. Herein, we demonstrate a method to rewrite irregular applications into the coarray-based style. Evaluation on the K computer using the Omni XMP compiler, which we have been developing, shows that some XMP implementations perform comparably to their original implementations, while the others show performance degradation due to the large overhead of allocating dynamic coarrays at runtime.