We discuss parallelsortingalgorithms and their implementations suitable for cluster architectures in order to optimize cluster resources. We focus on the time spent in computation and the load balancing properties w...
详细信息
We discuss parallelsortingalgorithms and their implementations suitable for cluster architectures in order to optimize cluster resources. We focus on the time spent in computation and the load balancing properties when processors are running at different speeds, i.e. correlated by a multiplicative constant factor (our weak definition of heterogeneous platform). One scheme is under study: parallelsorting by sampling (either regular sampling technique introduced by Shi and Schaeffer [J. parallel Distrib. Comput. 14 (4) (1992) 361] or the over-partitioning scheme introduced by Li and Seveik [parallelsorting by over-partitioning, in: Proceedings of the Sixth Annual Symposium on parallelalgorithms and Architectures, ACM Press, New York, June 1994]). What is important in the paper is mainly the load balance factor and not necessary the execution time. It is clear that improved load balance leads to improved execution titre. The results presented in the paper demonstrate that load balancing for the case of computers with heterogeneous processing capacity is more challenging than for the homogeneous case. The survey, through the sorting case study, allow us to identify some algorithmic issues and software challenges to master heterogeneous cluster platforms in order to better utilize theta: data decomposition techniques, scheduling and load balancing methods. (C) 2002 Elsevier Science B.V. All rights reserved.
The paper considers the problem of parallel external sorting in the context of a form of heterogeneous clusters. We introduce two algorithms and we compare them to another one that we have previously developed Since m...
详细信息
ISBN:
(纸本)0769519199
The paper considers the problem of parallel external sorting in the context of a form of heterogeneous clusters. We introduce two algorithms and we compare them to another one that we have previously developed Since most common sort algorithms assume high-speed random access to all intermediate memory, they are unsuitable if the values to be sorted don't fit in main memory. This is the case for cluster computing platforms which are made of standard, cheap and scarce components. For that class of computing resources a good use of 110 operations compatible with the requirements of load balancing and computational complexity are the key, to success. We explore three techniques and show how they can be deployed for clusters with processor performances related by a multiplicative factor We validate the approaches in showing experimental results for the load balancing factor.
暂无评论