The construction of distributed algorithms for matrix computations on top of distributed data aggregation algorithms with randomized communication schedules is investigated. For this purpose, a new aggregation algorithm for summing or averaging distributed values, the push-flow algorithm, is developed; it is more resilient to failures than existing aggregation methods. It is illustrated that, on a hypercube topology, the algorithm asymptotically requires the same number of iterations as the optimal all-to-all reduction operation and that it scales well with the number of nodes. Orthogonalization is studied as a prototypical matrix computation task. A new fault-tolerant distributed orthogonalization method, rdmGS, is built on top of distributed data aggregation algorithms and can produce accurate results even in the presence of node failures. (C) 2013 Elsevier B.V. All rights reserved.
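
To make the idea of flow-based gossip averaging on a hypercube concrete, the following is a minimal single-process simulation. It is a sketch under stated assumptions, not the authors' implementation: the update rule (each node pushes half of its current estimate to a random neighbor as a flow, and the receiver stores the negated flow) is assumed here by analogy with push-sum, and the names `hypercube_neighbors`, `push_flow_average`, and `drop_prob` are hypothetical.

```python
import random


def hypercube_neighbors(node, dim):
    """Neighbors of `node` in a dim-dimensional hypercube (flip one bit)."""
    return [node ^ (1 << b) for b in range(dim)]


def push_flow_average(values, dim, rounds, drop_prob=0.0, seed=0):
    """Simulate flow-based gossip averaging (assumed update rule, see lead-in).

    Each node keeps one flow [value, weight] per neighbor. Its current
    estimate is its initial (value, 1.0) minus the sum of its flows.
    """
    rng = random.Random(seed)
    n = 1 << dim
    assert len(values) == n, "need exactly 2**dim initial values"

    # flow[i][j] = [value_flow, weight_flow] from node i towards node j
    flow = [{j: [0.0, 0.0] for j in hypercube_neighbors(i, dim)}
            for i in range(n)]

    def estimate(i):
        v = values[i] - sum(f[0] for f in flow[i].values())
        w = 1.0 - sum(f[1] for f in flow[i].values())
        return v, w

    for _ in range(rounds):
        for i in range(n):
            j = rng.choice(list(flow[i]))   # random neighbor
            v, w = estimate(i)
            flow[i][j][0] += v / 2          # push half of the estimate
            flow[i][j][1] += w / 2
            # Hypothetical loss model: with probability drop_prob the
            # message is lost and the receiver keeps its stale flow. The
            # sender transmits the whole flow, so the next delivered
            # message on this edge overwrites the stale entry.
            if rng.random() >= drop_prob:
                flow[j][i] = [-flow[i][j][0], -flow[i][j][1]]

    return [v / w for v, w in (estimate(i) for i in range(n))]


if __name__ == "__main__":
    initial = [float(i) for i in range(16)]   # 16 nodes, 4-dimensional cube
    print(push_flow_average(initial, dim=4, rounds=60)[:4])
```

Because the simulation stores accumulated flows rather than transferred mass, a receiver's state is fully determined by the last message delivered on each edge. This property is what the hypothetical `drop_prob` parameter is meant to exercise.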