An algorithm can be thought of as a set of indexed computations; if one computation uses data generated by another, this data dependence can be represented by the difference of their indexes (called a dependence vector). Many important algorithms are characterized by uniform data dependencies, i.e., the values of the dependence vectors are independent of the indexes of the computations. Linear schedules are a special class of schedules described by a linear mapping of computation indexes into time. This paper addresses the problem of identifying optimal linear schedules for uniform dependence algorithms so that their execution time is minimized. Procedures are proposed to solve this problem based on the mathematical solution of a nonlinear optimization problem. The complexity of these procedures is independent of the size of the algorithm; more precisely, it is exponential in the dimension of the algorithm's index set and therefore, for all practical purposes, very small, since algorithms of practical interest have index sets of low dimension. The results reported in this paper can be used to derive time-optimal systolic designs and can be applied in optimizing compilers to restructure programs at compile time so as to maximally exploit available parallelism.
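As a rough illustration of the optimization described in this abstract, the sketch below evaluates candidate linear schedules pi over a small rectangular index set and keeps the one with the fewest time steps. The dependence vectors, index-set bounds, and search range for pi are hypothetical, and the exhaustive search is for illustration only; the paper's procedures solve the problem analytically with complexity independent of the algorithm's size.

```python
from itertools import product

def schedule_time(pi, index_set, deps):
    """Number of time steps used by linear schedule pi over index_set,
    or None if pi does not respect every dependence vector."""
    if any(sum(p * d for p, d in zip(pi, dep)) <= 0 for dep in deps):
        return None  # some dependence would not advance in time
    times = [sum(p * i for p, i in zip(pi, idx)) for idx in index_set]
    return max(times) - min(times) + 1

# Hypothetical uniform dependence vectors and a 10x10 rectangular index set.
deps = [(1, 0), (0, 1), (1, -1)]
index_set = list(product(range(10), range(10)))

# Brute-force search over small integer schedule vectors (illustration only).
candidates = []
for pi in product(range(-3, 4), repeat=2):
    t = schedule_time(pi, index_set, deps)
    if t is not None:
        candidates.append((t, pi))

best_time, best_pi = min(candidates)
print("best linear schedule:", best_pi, "execution time (steps):", best_time)
```

For the example above the search selects pi = (2, 1): every dependence vector is mapped to a positive time increment, and no legal schedule in the search range spans fewer time steps over the 10x10 index set.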
ISBN (print): 0818675799
Parallelization of sequential programs focuses primarily on loop structures, since the array index variables in a loop usually exhibit data dependencies among them. When the data dependency relation is constant in distance, several compile-time parallelization methods have been proposed. When the dependency relation varies in distance, however, compile-time extraction of parallelism is considerably more complicated. We propose a generalized parallelism extraction scheme for nested loops. The method automatically converts a sequential loop into a nested parallel DOALL loop at compile time, and it is applicable whether the dependence distance is constant or varying. Our test results show that the proposed scheme is superior to conventional methods when a sufficiently large number of processors is available.
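For the constant-distance case only (not the generalized scheme this paper proposes), the sketch below shows why a loop such as a[i] = a[i - d] + 1 with constant dependence distance d can be split into d independent chains of iterations, each a candidate for a DOALL-style parallel task; the trip count N and distance d are made-up values.

```python
# Constant-distance dependence: iterations i and i - d conflict, so indices
# that are congruent modulo d form one sequential chain, and different
# chains carry no dependence between them and can run in parallel.
N, d = 16, 3  # hypothetical trip count and dependence distance

def parallel_chains(n, dist):
    """Group iteration indices into chains with no cross-chain dependence."""
    return [list(range(start, n, dist)) for start in range(dist)]

for chain_id, chain in enumerate(parallel_chains(N, d)):
    print(f"chain {chain_id} (independent of the others): {chain}")
```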