ISBN:
(Print) 9780897916714
Parallel tree contraction has been found to be a useful and quite powerful tool for the design of a wide class of efficient graph algorithms. We propose a corresponding technique for the parallel solution of incremental problems. As our computational model, we assume a variant of the CRCW PRAM where we can dynamically activate processors by a forking primitive. We consider a dynamic binary tree T of ≤ n nodes and unbounded depth. We describe a procedure, which we call the dynamic parallel tree contraction algorithm, which incrementally processes various parallel modification requests and queries: (1) parallel requests to add or delete leaves of T, or modify labels of internal nodes or leaves of T, and also (2) parallel tree contraction queries which require recomputing values at specified nodes. Each modification or query is with respect to a set of nodes U in T. The dynamic parallel tree contraction algorithm is a randomized algorithm that takes O(log(|U| log n)) expected parallel time using O(|U| log n / log(|U| log n)) processors. We give a large number of applications (with the same bounds), including: (a) maintaining the usual tree properties (such as number of ancestors, preorder, etc.), (b) Eulerian tour, (c) expression evaluation, (d) least common ancestor, and (e) canonical forms of trees. Previously, there were no known parallel algorithms for incrementally maintaining and solving such problems in parallel time less than Θ(log n). In deriving our incremental algorithms, we solve a key subproblem, namely a processor activation problem, within the same asymptotic bounds, which may be useful in the design of other parallel incremental algorithms. This algorithm uses an interesting persistent parallel data structure involving a non-trivial construction. In a subsequent paper, we apply our dynamic parallel tree contraction technique to various incremental graph problems: maintaining various properties (such as coloring, minimum covering set, maximum matching, etc.).
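As a concrete illustration of the expression-evaluation application, the following is a minimal sequential sketch of the "rake" rounds of tree contraction. The dictionary representation, node ids, and operator set are assumptions made for the example; the paper's algorithm performs the rakes of each round in parallel, adds a "compress" step to handle unbounded-depth trees, and maintains the contraction under dynamic leaf insertions and deletions.

```python
# Sequential simulation of tree-contraction "rake" rounds, specialized to
# arithmetic expression evaluation. Illustrative only: not the paper's
# dynamic parallel algorithm.

def contract(tree, root=0):
    """Repeatedly rake: in each round, every operator node whose two
    children are currently leaves is replaced by a leaf holding its value.
    Returns (value at root, number of rounds)."""
    tree = dict(tree)  # id -> ('leaf', value) or (op, left_id, right_id)
    ops = {'+': lambda a, b: a + b, '*': lambda a, b: a * b}
    rounds = 0
    while tree[root][0] != 'leaf':
        rakeable = [i for i, node in tree.items()
                    if node[0] in ops
                    and tree[node[1]][0] == 'leaf'
                    and tree[node[2]][0] == 'leaf']
        for i in rakeable:  # on a PRAM these rakes happen in parallel
            op, l, r = tree[i]
            tree[i] = ('leaf', ops[op](tree[l][1], tree[r][1]))
        rounds += 1
    return tree[root][1], rounds

# (2 + 3) * (4 + 5): nodes 3-6 are leaves; nodes 1 and 2 rake in round
# one, the root rakes in round two.
tree = {0: ('*', 1, 2), 1: ('+', 3, 4), 2: ('+', 5, 6),
        3: ('leaf', 2), 4: ('leaf', 3), 5: ('leaf', 4), 6: ('leaf', 5)}
print(contract(tree))  # (45, 2)
```

Because all rakeable nodes contract simultaneously, a balanced tree finishes in O(log n) rounds; the compress operation (omitted here) handles long chains.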
ISBN:
(Print) 9780897916714
Known polylog parallel algorithms for the solution of linear systems and related problems require computation of the characteristic polynomial or related forms, which are known to be highly unstable in practice. However, matrix factorizations of various types, bypassing computation of the characteristic polynomial, are used extensively in sequential numerical computations and are essential in many applications. This paper gives new parallel methods for various exact factorizations of several classes of matrices. We assume the input matrices are n × n with integer entries of size ≤ 2^(n^o(1)). We make no other assumption on the input. We assume the arithmetic PRAM model of parallel computation. Our main result is a reduction of the known parallel time bounds for these factorizations from O(log^3 n) to O(log^2 n). Our results are work efficient; we match the best known work bounds of parallel algorithms with polylog time bounds, and are within a log n factor of the work bounds for the best known sequential algorithms for the same problems. The exact factorizations we compute for symmetric positive definite matrices include: 1. recursive factorization sequences and trees, 2. LU factorizations, 3. QR factorizations, and 4. reduction to upper Hessenberg form. The classes of matrices for which we can efficiently compute these factorizations include: 1. dense matrices, in time O(log^2 n) with processor bound P(n) (the number of processors needed to multiply two n × n matrices in O(log n) time), 2. block diagonal matrices, in time O(log^2 b) with P(b)n/b processors, 3. sparse matrices which are s(n)-separable (recursive factorizations only), in time O(log^2 n) with P(s(n)) processors, where s(n) is of the form n^γ for 0 < γ < 1, and 4. banded matrices, in parallel time O((log n) log b) with P(b)n/b processors. These factorizations also provide us with similarly efficient algorithms for exact computation (given arbitrary rational matrices that need not be symmetric positive definite) of the following: 1. ...
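To make the kind of factorization being parallelized concrete, here is a sequential sketch of an exact LDL^T factorization of a symmetric positive definite matrix over the rationals, which avoids both the characteristic polynomial and the square roots of Cholesky. This is a stand-in for the factorizations above (pure Python with exact Fraction arithmetic), not the paper's PRAM algorithm.

```python
from fractions import Fraction

def ldlt(A):
    """Exact A = L D L^T for a symmetric positive definite matrix:
    L unit lower triangular, D diagonal, all entries rational."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    D = [Fraction(0)] * n
    for j in range(n):
        # Diagonal pivot: subtract contributions of earlier columns.
        D[j] = A[j][j] - sum(L[j][k] * L[j][k] * D[k] for k in range(j))
        for i in range(j + 1, n):
            L[i][j] = (A[i][j]
                       - sum(L[i][k] * L[j][k] * D[k] for k in range(j))) / D[j]
    return L, D

A = [[4, 2], [2, 3]]
L, D = ldlt(A)
print(L)  # [[1, 0], [1/2, 1]]
print(D)  # [4, 2]
```

Since every entry stays a rational number, the computation is exact, in the same spirit as the paper's exact factorizations; the parallel versions restructure this elimination into O(log^2 n)-time recursive block steps.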
In the 3rd Annual ACM Symposium on Parallel Algorithms and Architectures, pp. 216-228 (July 1991), we presented several new results in the theory of homogeneous multiprocessor scheduling. A directed acyclic graph (DAG) of tasks was to be scheduled. Tasks were assumed to be parallelizable: as more processors are applied to a task, the time taken to compute it decreases, yielding some speedup. Because of communication, synchronization and task scheduling overheads, this speedup increases less than linearly with the number of processors applied. The optimal scheduling problem is to determine the number of processors assigned to each task, and the task sequencing, to minimize the finishing time. Using optimal control theory, in the special case where the speedup function of each task is p^α (where p is the amount of processing power applied to the task), a closed-form solution for task graphs formed from parallel and series connections was derived. This paper considerably extends these techniques for arbitrary DAGs and applies them to matrix arithmetic compilation. The optimality conditions impose nonlinear constraints on the flow of processing power from predecessors to successors, and on the finishing times of siblings. This paper presents a fast algorithm for determining and solving these nonlinear equations. The algorithm utilizes the structure of the finishing time equations to efficiently run a conjugate gradient minimization leading to the optimal solution. The algorithm has been tested on a variety of DAGs. The results presented show that it is superior to alternative heuristic approaches.
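The p^α speedup model admits a simple closed form in the base case of two independent tasks run in parallel: with task i taking w_i / p_i^α time on p_i processors, equalizing the two finishing times is optimal, giving p1/p2 = (w1/w2)^(1/α). A small sketch of that split (assuming continuously divisible processing power, as in the optimal-control formulation; the function name and signature are illustrative, not from the paper):

```python
def split_two_parallel(w1, w2, P, alpha):
    """Optimal division of P processors between two independent parallel
    tasks, assuming task i with p processors finishes in w_i / p**alpha.
    Equalizing finishing times gives p1/p2 = (w1/w2)**(1/alpha)."""
    r1, r2 = w1 ** (1.0 / alpha), w2 ** (1.0 / alpha)
    p1 = P * r1 / (r1 + r2)
    return p1, P - p1

# With linear speedup (alpha = 1), processors split in proportion to work:
p1, p2 = split_two_parallel(8.0, 1.0, 3.0, 1.0)
print(p1, p2)  # 8/3 and 1/3 of a processor; both tasks finish at time 3
```

The paper's contribution is handling arbitrary DAGs, where these equalization conditions couple across series connections and siblings into the nonlinear system solved by conjugate gradient minimization.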
ISBN:
(Print) 0897915992
The proceedings contain 47 papers. The topics discussed include: fault-tolerant meshes with small degree; the verification of cache coherence protocols; fault diagnosis in a small constant number of parallel testing rounds; tight bounds on the size of fault-tolerant merging and sorting networks with destructive faults; on Gazit and Miller's parallel algorithm for planar separators: achieving greater efficiency through random sampling; parallel and output sensitive algorithms for combinatorial and linear algebra problems; efficient parallel shortest-paths in digraphs with a separator decomposition; components for computing and communications; highly efficient dictionary matching in parallel; optimal parallel two dimensional pattern matching; and parallel construction and query of suffix trees for two-dimensional matrices.
We introduce Autonomous SIMD (ASIMD) massively parallel architecture, then look at the flexibility, cost, and effectiveness of MIMD and ASIMD parallel systems. We show that ASIMD systems such as the MasPar MP-1 and MP...