We address in this paper the parallelization of a recursive algorithm for large scale triangular matrix inversion based on the 'Divide and Conquer' (D&C) paradigm. A set of different versions of an origina...
详细信息
We address in this paper the parallelization of a recursive algorithm for large scale triangular matrix inversion based on the 'Divide and Conquer' (D&C) paradigm. A set of different versions of an original sequential algorithm are first presented. A theoretical performance study permits to establish an accurate comparison between the designed algorithms. Afterwards, we develop in the second part of the paper, an optimal parallel avoiding-communication algorithm for a given number of available homogeneous and heterogeneous processors. To reach this target, we use a so called 'non equitable and incomplete' version of the D&C paradigm consisting in recursively decomposing the original problem into two sub-problems of non equal sizes, then decomposing only one sub-problem in the same previous manner. The theoretical study is validated by a series of experiments achieved on three target platforms, namely an 8-core shared memory machine, a distributed memory cluster and a heterogeneous CPU-GPU cluster. The obtained results permit to illustrate the interest of the contribution.
This paper studies the parallelization of a recursive algorithm for triangular matrix inversion (TMI), using the "divide and conquer" paradigm. For a (large scale) matrix of size n = m2(k) (m,k greater than ...
详细信息
This paper studies the parallelization of a recursive algorithm for triangular matrix inversion (TMI), using the "divide and conquer" paradigm. For a (large scale) matrix of size n = m2(k) (m,k greater than or equal to 1) and p = 2(q) (less than or equal to n/2) available processors, we first construct an adequate 2-phases task segmentation and inducing a balanced layered task graph. Then. we design a greedy scheduling leading to a cost optimal parallel algorithm, i.e. whose efficiency is equal to 1 for large n. The practical interest of the contribution is proven through an experimental study of two versions of the original algorithm on an IBM SP1 distributed memory multiprocessor. (C) 2001 Published by Elsevier Science B.V.
We address in this paper the parallelization of a recursive algorithm for triangular matrix inversion (TMI) based on the 'Divide and Conquer' (D&C) paradigm. A series of different versions of an original s...
详细信息
ISBN:
(纸本)9788360810484
We address in this paper the parallelization of a recursive algorithm for triangular matrix inversion (TMI) based on the 'Divide and Conquer' (D&C) paradigm. A series of different versions of an original sequential algorithm are first presented. A theoretical performance study permits to establish an accurate comparison between the designed algorithms. Afterwards, we develop an optimal parallel communication-free algorithm targeting a heterogeneous environment involving processors of different speeds. For this purpose, we use a non equitable and incomplete version of the D&C paradigm consisting in recursively decomposing the original TMI problem in two subproblems of non equal sizes, then decomposing only one subproblem and so on. The theoretical study is validated by a series of experiments achieved on two platforms, namely an 8-core shared memory machine and a distributed memory cluster of 16 nodes. The obtained results permit to illustrate the interest of the contribution.
暂无评论