The distributed single-source shortest paths problem is one of the most fundamental and central problems in the message-passing distributed computing. Classical Bellman-Ford algorithm solves it in O(n) time, where n i...
详细信息
The distributed single-source shortest paths problem is one of the most fundamental and central problems in the message-passing distributed computing. Classical Bellman-Ford algorithm solves it in O(n) time, where n is the number of vertices in the input graph G. Peleg and Rubinovich [49] showed a lower bound of (Omega) over tilde (D + root n) for this problem, where D is the hop-diameter of G. Whether or not this problem can be solved in o(n) time when D is relatively small is a major open question. Despite intensive research [10, 17, 33, 41, 45] that yielded near-optimal algorithms for the approximate variant of this problem, no progress was reported for the original problem. In this article, we answer this question in the affirmative. We devise an algorithm that requires O((n log n)(5/6)) time, for D = O(root n logn), and O(D-1/3 . (n log n)(2/3)) time, for larger D. This running time is sublinear in n in almost the entire range of parameters, specifically, for D = o(n/log(2) n). We also generalize our result in two directions. One is when edges have bandwidth b >= 1, and the other is the s-sources shortest paths problem. For both problems, our algorithm provides bounds that improve upon the previous state-of-the-art in almost the entire range of parameters. In particular, we provide an all-pairs shortest paths algorithm that requires O(n(5/3) . log(2/3) n) time, even for b <= 1, for all values of D. We also devise the first algorithm with non-trivial complexity guarantees for computing exact shortest paths in the multipass semi-streaming model of computation. From the technical viewpoint, our distributed algorithm computes a hopset G '' of a skeleton graph G' of G without first computing G' itself. We then conduct a Bellman-Ford exploration in G'. G '', while computing the required edges of G' on the fly. As a result, our distributed algorithm computes exactly those edges of G' that it really needs, rather than computing approximately the entire G'.
We introduce TeraHAC, a (1+ε)-approximate hierarchical agglomerative clustering (HAC) algorithm which scales to trillion-edge graphs. Our algorithm is based on a new approach to computing (1+ε)-approximate HAC, whic...
详细信息
We introduce TeraHAC, a (1+ε)-approximate hierarchical agglomerative clustering (HAC) algorithm which scales to trillion-edge graphs. Our algorithm is based on a new approach to computing (1+ε)-approximate HAC, which is a novel combination of the nearest-neighbor chain algorithm and the notion of (1+ε)-approximate HAC. Our approach allows us to partition the graph among multiple machines and make significant progress in computing the clustering within each partition before any communication with other partitions is *** evaluate TeraHAC on a number of real-world and synthetic graphs of up to 8 trillion edges. We show that TeraHAC requires over 100x fewer rounds compared to previously known approaches for computing HAC. It is up to 8.3x faster than SCC, the state-of-the-art distributed algorithm for hierarchical clustering, while achieving 1.16x higher quality. In fact, TeraHAC essentially retains the quality of the celebrated HAC algorithm while significantly improving the running time.
This keynote talk will provide an overview of a recent line of work [Rozhoň and Ghaffari at STOC 2020; Ghaffari, Harris, and Kuhn at FOCS 2018; and Ghaffari, Kuhn, and Maus at STOC 2017], which presented the first ef...
详细信息
ISBN:
(纸本)9781450389334
This keynote talk will provide an overview of a recent line of work [Rozhoň and Ghaffari at STOC 2020; Ghaffari, Harris, and Kuhn at FOCS 2018; and Ghaffari, Kuhn, and Maus at STOC 2017], which presented the first efficient deterministic network decomposition algorithm as well as a general derandomization result for distributed graph algorithms. Informally, the derandomization result shows that any (locally-checkable) graph problem that admits an efficient randomized distributed algorithm also admits an efficient deterministic distributed algorithm. These results resolve several central and decades-old open problems in distributed graph algorithms.
The distributed minimum spanning tree (MST) problem is one of the most central and fundamental problems in distributed graph algorithms. Kutten and Peleg devised an algorithm with running time O(D + root n . log* n), ...
详细信息
The distributed minimum spanning tree (MST) problem is one of the most central and fundamental problems in distributed graph algorithms. Kutten and Peleg devised an algorithm with running time O(D + root n . log* n), where D is the hop diameter of the input n-vertexm-edge graph, and with message complexity O(m + n(3/2)). Peleg and Rubinovich showed that the running time of the algorithm of Kutten and Peleg is essentially tight and asked if one can achieve near-optimal running time together with near-optimal message complexity. In a recent breakthrough, Pandurangan et al. answered this question in the affirmative and devised a randomized algorithm with time (O) over tilde (D + root n) and message complexity (O) over tilde (m). They asked if such a simultaneous time- and message optimality can be achieved by a deterministic algorithm. In this article, building on the work of Pandurangan et al., we answer this question in the affirmative and devise a deterministic algorithm that computes MST in time O((D + root n) . logn) using O(m . logn + n logn . log* n) messages. The polylogarithmic factors in the time and message complexities of our algorithm are significantly smaller than the respective factors in the result of Pandurangan et al. In addition, our algorithm and its analysis are very simple and self-contained as opposed to rather complicated previous sublinear-time algorithms. Finally, we use our new algorithm to devise a randomized MST algorithm with running time (O) over tilde( mu(G, omega) + root n) and message complexity (O) over tilde(vertical bar E vertical bar), where mu-radius mu(G, omega) <= D is a graph parameter, which is typically much smaller than D. This improves a previous bound from Elkin.
暂无评论