This paper investigates the parallel time and processor complexities of several searching problems involving Monge and Monge-composite arrays. We present array-searching algorithms for concurrent-read-concurrent-write...
详细信息
ISBN:
(纸本)0897913701
This paper investigates the parallel time and processor complexities of several searching problems involving Monge and Monge-composite arrays. We present array-searching algorithms for concurrent-read-concurrent-write (CRCW) PRAMs, concurrent-read-exclusive-write (CREW) PRAMs, hypercubes, cube-connected-cycles, and shuffle-exchange networks. All these algorithms run in optimal time, and their processor-time products are all within an O(lg n) factor of the worst-case sequential bounds. Several applications of these algorithms are also given. Two applications improve previous results substantially, and the others provide novel parallelalgorithms for problems not previously considered.
The problem of representing a set U = {u1, ..., um} of read-write variables on an n-node distributed memory parallel computer is considered. It is shown that U can be represented among the n nodes of a variant of the ...
详细信息
ISBN:
(纸本)0897913701
The problem of representing a set U = {u1, ..., um} of read-write variables on an n-node distributed memory parallel computer is considered. It is shown that U can be represented among the n nodes of a variant of the mesh-of-trees using O((m/n)polylog(m/n)) storage per node such that any n-tuple of variables may be accessed in O(log n(log log n)2) time in the worst case for m polynomial in n.
In this paper we consider the problem of generating random permutations on small parallel machines. The machines that we have in mind are shared memory machines with a constant number of processors such as the Sequent...
详细信息
ISBN:
(纸本)0897913701
In this paper we consider the problem of generating random permutations on small parallel machines. The machines that we have in mind are shared memory machines with a constant number of processors such as the Sequent Symmetry. We describe a parallel implementation of the 'shuffling' algorithm for generating a random permutation. If the hardware operates in a fair manner, this algorithm generates a fully random permutation. However, if the machine resolves contention in a malicious manner, then the algorithm does not generate permutations uniformly. We give almost tight bounds on the degree that an adversary can reduce the randomness. We also discuss the cost of locking data in the algorithm and present a method of generating random permutations with substantially reduced locking cost.
We present a parallel randomized algorithm for finding if two planar graphs are isomorphic. Assuming that we have a tree of separators for each planar graph, our algorithm takes O(log(n)) time with P = (n1.5&middo...
详细信息
ISBN:
(纸本)0897913701
We present a parallel randomized algorithm for finding if two planar graphs are isomorphic. Assuming that we have a tree of separators for each planar graph, our algorithm takes O(log(n)) time with P = (n1.5·√log(n)) processors with probability to fail of 1/n or less, where n is the number of vertices. The algorithms needs 2·log(m)·log(n) + O(log(n)) random bits. The number of random bits can be decreased to O(log(n)) by increasing the processors number to n3/2+Ε. This algorithm significantly improves the previous results of n4 processors.
The authors address the apparently difficult problem of doing parallel transitive closure when the (directed) graph is sparse and/or, only single-source information is desired. O(e) work is their target for the single...
详细信息
ISBN:
(纸本)0897913701
The authors address the apparently difficult problem of doing parallel transitive closure when the (directed) graph is sparse and/or, only single-source information is desired. O(e) work is their target for the single-source problem. When the graph is sparse, then the all-pairs transitive closure problem can be solved by performing a depth-first search from each node, taking O(ne) time;that is their target for the all-pairs problem. The authors do not reach either target, except for the all-pairs case when e is fairly large. However, they make significant progress, in the sense that they have the first algorithms that simultaneously use time much less than linear and use work that is less than M(n).
Space-efficient Information Dispersal Algorithm (IDA) [11] is applied to parallel communication in the hypercube. Let N denote the size of the network. Our communication scheme runs in 2·log N + 1 time using cons...
详细信息
ISBN:
(纸本)0897913701
Space-efficient Information Dispersal Algorithm (IDA) [11] is applied to parallel communication in the hypercube. Let N denote the size of the network. Our communication scheme runs in 2·log N + 1 time using constant size buffers. Its probability of successful routing is at least 1 - N-2.419·log N+1.5, proving Rabin's conjecture. The same scheme also tolerates O(N) random link failures with high probability. The scheme runs within the said time bound without long delay. On-line and efficient wire testing and replacement on the hypercube can be realized if our fault-tolerant routing scheme is used. Let α denote the total number of links in the hypercube. It is shown that ≈ α/352 wires can be disabled simultaneously without disrupting the ongoing computation or degrading the routing performance much.
Multithreading has been proposed as an architectural strategy for tolerating latency in multiprocessors and, through limited empirical studies, shown to offer promise. This paper develops an analytical model of multit...
详细信息
ISBN:
(纸本)0897913701
Multithreading has been proposed as an architectural strategy for tolerating latency in multiprocessors and, through limited empirical studies, shown to offer promise. This paper develops an analytical model of multithreaded processor behavior based on a small set of architectural and program parameters. The model gives rise to a large Markov chain, which is solved to obtain a formula for processor efficiency in terms of the number of threads per processor, the remote reference rate, the latency, and the cost of switching between threads. It is shown that a multithreaded processor exhibits three operating regimes: linear (efficiency is proportional to the number of threads), transition, and saturation (efficiency depends only on the remote reference rate and switch cost). Formulae for regime boundaries are derived. The model is embellished to reflect cache degradation due to multithreading, using an analytical model of cache behavior, demonstrating that returns diminish as the number threads becomes large. Predictions from the embellished model correlate well with published empirical measurements. Prescriptive use of the model under various scenarios indicates that multithreading is effective, but the number of useful threads per processor is fairly small.
Let P be a simple rectilinear convex polygon of size O(n) inside which lie n pairwise disjoint rectangular rectilinear obstacles. We provide parallel techniques for computing rectilinear shortest paths that avoid the ...
详细信息
ISBN:
(纸本)0897913701
Let P be a simple rectilinear convex polygon of size O(n) inside which lie n pairwise disjoint rectangular rectilinear obstacles. We provide parallel techniques for computing rectilinear shortest paths that avoid the set of obstacles in P. Specifically, we compute descriptions of shortest paths in O(log2 n) time, with O(n2/log2n) processors in the CREWPRAM model if source and destination are on the boundary of P, with O(n2/log n) processors if the source is an obstacle vertex and the destination a vertex of P, and with O(n2) processors if both source and destination are obstacle vertices. The descriptions we compute enable one processor to obtain the path length of any query pair of vertices in constant time, or O(n/log n) processors to retrieve the shortest path itself in logarithmic time. If the two query points are arbitrary rather than vertices, then one processor takes O(log n) time (instead of constant time) for finding the path length. A number of other related shortest paths problems are solved. The techniques we use involve a fast computation of separator staircases and, most importantly, a scheme for partitioning the obstacles' boundaries into families in a way that ensures that the resulting path length matrices have a monotonicity property that is apparently absent before applying our partitioning scheme.
暂无评论