ISBN: 9780897917179 (print)
Consider the following NP-hard problems: given a graph G, find the minimum 2-edge-connected and 2-vertex-connected subgraphs spanning all vertices of G. The past few years have produced exciting sequential algorithms for approximating such minimum subgraphs [6, 7]. The approximation factors have been improved from 2 down to 5/4 and 3/2, respectively. Yet the techniques involved are all based on augmenting depth-first-search trees, and no similar progress has been carried over to the parallel context. This paper presents NC algorithms that achieve approximation factors of 3/2 + ε and 7/4 + ε, respectively, without computing depth-first-search trees.
We implemented and measured several methods to perform BMMC permutations on the MasPar MP-2. Our results indicate that, except for certain types of permutations or very high virtual processor ratios, the best method overall is the naive method but with virtual-processor numbers computed in Gray-code order. For some permutations, however, the naive method performs very poorly; the best method in these cases is an adaptation of the block BMMC algorithm for parallel disk systems in which the processor elements are treated as independent devices.
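As a rough illustration of the terms used above, the following Python sketch applies a BMMC (bit-matrix-multiply/complement) permutation, in which a source address x is sent to y = A·x ⊕ c over GF(2) for a nonsingular bit matrix A, while visiting the source addresses in binary-reflected Gray-code order. It is a sequential toy under stated assumptions, not the MasPar implementation the abstract measures: the function names and the packed-row representation of A are hypothetical choices made only for this example.

```python
def gray_code(i: int) -> int:
    """Return the i-th binary-reflected Gray code."""
    return i ^ (i >> 1)


def bmmc_target(x: int, rows: list[int], c: int) -> int:
    """Map source address x to y = A*x XOR c over GF(2).

    rows[i] packs row i of A into an integer: bit j of rows[i] is A[i][j].
    (Hypothetical representation chosen for this sketch.)
    """
    y = 0
    for i, row in enumerate(rows):
        # Bit i of y is the parity of the bitwise AND of row i with x.
        parity = bin(row & x).count("1") & 1
        y |= parity << i
    return y ^ c


def bmmc_permute(data: list, rows: list[int], c: int) -> list:
    """Apply the permutation to data, whose length must be 2**len(rows)."""
    n = len(data)
    out = [None] * n
    for k in range(n):
        src = gray_code(k)  # consecutive sources differ in exactly one bit
        out[bmmc_target(src, rows, c)] = data[src]
    return out


# 3-bit bit-reversal is a BMMC permutation: A is the anti-diagonal matrix.
bit_reversal_rows = [0b100, 0b010, 0b001]
result = bmmc_permute(list(range(8)), bit_reversal_rows, 0)
```

Because consecutive Gray codes differ in a single bit, the product A·x for the next address differs from the previous one by the XOR of a single column of A. That may be one reason Gray-code ordering is attractive, although the abstract does not spell out the mechanism.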
ISBN: 089791483X (print)
Using the ideas from the NC algorithm of Ben-Or and Tiwari [Journal of Complexity 6, 417-442, 1990], we develop a practical parallel algorithm that approximates the roots of a polynomial whose roots are all real. A new elementary proof of correctness is provided and the complexity of the algorithm is analyzed. A particular implementation of the algorithm that performs well in practice is described and its run-time behaviour is compared with the analytical predictions. Its performance is also compared with that of the root-finding algorithm in the PARI package.
ISBN: 9781581136616 (print)
We develop an algorithm for parallel disk sorting whose I/O cost approaches the lower bound and that guarantees almost perfect overlap between I/O and computation. Previous algorithms either have suboptimal I/O volume or cannot guarantee that I/O and computation can always be overlapped. We give an efficient implementation that can (at least) compete with the best practical implementations but gives additional performance guarantees. For the experiments we have configured a state-of-the-art machine that can sustain full-bandwidth I/O with eight disks and is very cost-effective.
ISBN: 9780897918091 (print)
We analyze universal routing protocols, that is, protocols that can be used for any communication pattern in any network, under a stochastic model of continuous message generation. In particular, we present two universal protocols, a store-and-forward and a wormhole routing protocol, and characterize their performance by the following three parameters: the maximum message generation rate for which the protocol is stable, the expected delay of a message from generation to service, and the time the protocol needs to recover from worst-case scenarios. Both protocols yield significant performance improvements over all previously known continuous routing protocols. In addition, we present adaptations of our protocols to continuous routing in node-symmetric networks, butterflies, and meshes.
ISBN: 9781450307437 (print)
As the gap between the cost of communication (i.e., data movement) and computation continues to grow, the importance of pursuing algorithms which minimize communication also increases. Toward this end, we seek asymptotic communication lower bounds for general memory models and classes of algorithms. Recent work [2] has established lower bounds for a wide set of linear algebra algorithms on a sequential machine and on a parallel machine with identical processors. This work extends these previous bounds to a heterogeneous model in which processors access data and perform floating point operations at differing speeds. We also present an algorithm for dense matrix multiplication which attains the lower bound.
ISBN: 9798400704161 (print)
Dynamic trees are a well-studied and fundamental building block of dynamic graph algorithms dating back to the seminal work of Sleator and Tarjan [STOC'81, (1981), pp. 114-122]. The problem is to maintain a tree subject to online edge insertions and deletions while answering queries about the tree, such as the heaviest weight on a path. In the parallel batch-dynamic setting, the goal is to process batches of edge updates work-efficiently in low (polylog n) span. Two work-efficient algorithms are known: batch-parallel Euler Tour Trees by Tseng et al. [ALENEX'19, (2019), pp. 92-106] and parallel Rake-Compress (RC) Trees by Acar et al. [ESA'20, (2020), pp. 2:1-2:23]. Both, however, are randomized and work-efficient only in expectation. Several downstream results that use these data structures (and indeed, to the best of our knowledge, all known work-efficient parallel batch-dynamic graph algorithms) are therefore also randomized. In this work, we give the first deterministic work-efficient solution to the problem. Our algorithm maintains a parallel RC-Tree on n vertices subject to batches of k edge updates deterministically in worst-case O(k log(1 + n/k)) work and O(log n log log k) span on the Common-CRCW PRAM. We also show how to improve the span of the randomized algorithm from O(log n log* n) to O(log n). Lastly, as a result of our new deterministic algorithm, we also derandomize several downstream results that make use of parallel batch-dynamic dynamic trees, for which the only efficient solutions were previously randomized.
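One way to read the work bound O(k log(1 + n/k)) stated above is to evaluate it at the two extreme batch sizes and to bound it in between. This is a short worked calculation, not taken from the paper; it assumes 1 ≤ k ≤ n and uses only the elementary inequality ln(1 + x) ≤ x.

```latex
\begin{align*}
k = 1:&\quad k\log\!\left(1+\tfrac{n}{k}\right) = \log(1+n) = \Theta(\log n),\\
k = n:&\quad k\log\!\left(1+\tfrac{n}{k}\right) = n\log 2 = \Theta(n),\\
1 \le k \le n:&\quad k\ln\!\left(1+\tfrac{n}{k}\right) \le k\cdot\tfrac{n}{k} = n
  \quad\text{and}\quad
  k\log\!\left(1+\tfrac{n}{k}\right) \le k\log(1+n).
\end{align*}
```

So a single update costs logarithmic work, matching sequential dynamic trees, while a batch that touches the whole tree costs only linear work rather than n log n.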
A family of deterministic asynchronous Write-All algorithms was studied to analyze the properties of the set of permutations proposed by Kanellakis and Shvartsman. The efficiency of the algorithms was measured in terms of work, accounting for all machine instructions executed by the processors. It was found that the analytical results covered only a subset of the possible adversarial patterns of asynchrony. The analysis suggested that the proposed method yielded a faster construction of the Write-All algorithms compared to other methods.