The proceedings contain 155 papers. The topics discussed include: almost Chor-Goldreich sources and adversarial random walks;a new Berry-Esseen Theorem for expander walks;near-optimal derandomization of medium-width b...
ISBN:
(纸本)9781450399135
The proceedings contain 155 papers. The topics discussed include: almost Chor-Goldreich sources and adversarial random walks;a new Berry-Esseen Theorem for expander walks;near-optimal derandomization of medium-width branching programs;approximating iterated multiplication of stochastic matrices in small space;when Arthur has neither random coins nor time to spare: superfast derandomization of proof systems;exact phase transitions for stochastic block models and reconstruction on trees;parallel discrete sampling via continuous walks;sampling from convex sets with a cold start using multiscale decompositions;on regularity lemma and barriers in streaming and dynamic matching;optimal eigenvalue approximation via sketching;(noisy) gap cycle counting strikes back: random order streaming lower bounds for connected components and beyond;chaining, group leverage score overestimates, and fast spectral hypergraph sparsification;locally consistent decomposition of strings with applications to edit distance sketching;and sublinear time algorithms and complexity of approximate maximum matching.
The current paper provides preliminary statements of the panelists ahead of a panel discussion at the acm SPAA 2021 conference on the topic: algorithm-friendly architecture versus architecture-friendly algorithms. ...
详细信息
We present the first near-linear work and poly-logarithmic depth algorithm for computing a minimum cut in an undirected graph. Previous parallelalgorithms with poly-logarithmic depth required at least quadratic work ...
详细信息
We present the first near-linear work and poly-logarithmic depth algorithm for computing a minimum cut in an undirected graph. Previous parallelalgorithms with poly-logarithmic depth required at least quadratic work in the number of vertices. In a graph with n vertices and m edges, our randomized algorithm computes the minimum cut with high probability in O(m log(4) n) work and O(log(3) n) depth. This result is obtained by parallelizing a data structure that aggregates weights along paths in a tree, in addition exploiting the connection between minimum cuts and approximate maximum packings of spanning trees. In addition, our algorithm improves upon bounds on the number of cache misses incurred to compute a minimum cut.
Sequential algorithms for the Stable Matching Problem are often too slow in the context of some large scale applications like switch scheduling. parallelarchitectures can offer a notable reduction in complexity. We p...
详细信息
Motivated from parallel network mapping, we provide efficient query complexity and round complexity bounds for graph reconstruction using distance queries, including a bound that improves a previous sequential complex...
详细信息
We (nearly) settle the time complexity for computing vertex faulttolerant (VFT) spanners with optimal sparsity (up to polylogarithmic factors). VFT spanners are sparse subgraphs that preserve distance information, up ...
详细信息
ISBN:
(纸本)9781450392648
We (nearly) settle the time complexity for computing vertex faulttolerant (VFT) spanners with optimal sparsity (up to polylogarithmic factors). VFT spanners are sparse subgraphs that preserve distance information, up to a small multiplicative stretch, in the presence of vertex failures. These structures were introduced by [Chechik et al., STOC 2009] and have received a lot of attention since then. Recent work provided algorithms for computing VFT spanners with optimal sparsity but in exponential runtime. The first polynomial time algorithms for these structures have been given by [Bodwin, Dinitz and Robelle, SODA 2021]. Their algorithms, as all other prior algorithms, are greedy and thus inherently sequential. We provide algorithms for computing nearly optimal.. -VFT spanners for any n-vertex n-edge graph, with near optimal running time in several computational models: (i) A randomized sequential algorithm with a runtime of e.. (..) (i.e., independent in the number of faults..). The state-of-the-art time bound is e.. (.. 1-1/.. center dot..2+1/.. +.. 2..) by [Bodwin, Dinitz and Robelle, SODA 2021]. (ii) A distributed congest algorithm of e.. ( 1) rounds. Improving upon [Dinitz and Robelle, PODC 2020] that obtained FT spanners with near-optimal sparsity in e.. (.. 2) rounds. (iii) A PRAM (CRCW) algorithm with e.. (..) work and e.. (1) depth. Prior bounds implied by [Dinitz and Krauthgamer, PODC 2011] obtained sub-optimal FT spanners using e.. (.. 3..) work and e.. (.. 3) depth. An immediate corollary provides the first nearly-optimal PRAM algorithm for computing nearly optimal..-vertex connectivity certificates using polylogarithmic depth and near-linear work. This improves the state-of-the-art bounds of e.. (1) depth and.. (....) work, by [Karger and Motwani, STOC'93].
The proceedings contain 49 papers. The topics discussed include: fast stencil computations using fast Fourier transforms;low-span parallelalgorithms for the binary-forking model;provable advantages for graph algorith...
ISBN:
(纸本)9781450380706
The proceedings contain 49 papers. The topics discussed include: fast stencil computations using fast Fourier transforms;low-span parallelalgorithms for the binary-forking model;provable advantages for graph algorithms in spiking neural networks;algorithms for right-sizing heterogeneous data centers;efficient parallel self-adjusting computation;speed scaling with explorable uncertainty;efficient online weighted multi-level paging;paging and the address-translation problem;massively parallelalgorithms for distance approximation and spanners;efficient load-balancing through distributed token dropping;finding subgraphs in highly dynamic networks;near-optimal time-energy trade-offs for deterministic leader election;and efficient stepping algorithms and implementations for parallel shortest paths.
We prove that for every 3-player (3-prover) game, with binary questions and answers and value 0, such that the value of the n-fold parallel repetition of the game is at most n(-c). Along the way to proving this theor...
详细信息
ISBN:
(纸本)9781450392648
We prove that for every 3-player (3-prover) game, with binary questions and answers and value < 1, the value of the.. -fold parallel repetition of the game decays polynomially fast to 0. That is, for every such game, there exists a constant c > 0, such that the value of the n-fold parallel repetition of the game is at most n(-c). Along the way to proving this theorem, we prove two additional parallel repetition theorems for multiplayer (multiprover) games, that may be of independent interest: Playerwise Connected Games (with any number of players and any Alphabet size): We identify a large class of multiplayer games and prove that for every game with value < 1 in that class, the value of the n-fold parallel repetition of the game decays polynomially fast to 0. More precisely, our result applies for playerwise connected games, with any number of players and any alphabet size: For each player i, we define the graph G(i), whose vertices are the possible questions for that player and two questions x,x ' are connected by an edge if there exists a vector y of questions for all other players, such that both (x, y) and (x ' , y) are asked by the referee with non-zero probability. We say that the game is playerwise connected if for every i, the graphG(i) is connected. Our class of playerwise connected games is strictly larger than the class of connected games that was defined by Dinur, Harsha, Venkat and Yuen (ITCS 2017) and for which they proved exponential decay bounds on the value of parallel repetition. For playerwise connected games that are not connected, only inverse Ackermann decay bounds were previously known (Verbitsky 1996). Exponential Bounds for the Anti-Correlation Game: In the 3-player anti-correlation game, two out of three players are given 1 as input, and the remaining player is given 0. The two players whowere given 1 must produce different outputs in {0, 1}. We prove that the value of the n-fold parallel repetition of that game decays exponentially fast t
The maximum weighted matching (mwm) problem is one of the most well-studied combinatorial optimization problems in distributed graph algorithms. Despite a long development on the problem, and the recent progress of Fi...
详细信息
The major impediment to scaling concurrent data structures is memory contention when accessing shared data structure access-points, leading to thread serialisation, hindering parallelism. Aiming to address this challe...
详细信息
ISBN:
(纸本)9781450391467
The major impediment to scaling concurrent data structures is memory contention when accessing shared data structure access-points, leading to thread serialisation, hindering parallelism. Aiming to address this challenge, significant amount of work in the literature has proposed multi-access techniques that improve concurrent data structure parallelism. However, there is little work on analysing and modelling the execution behaviour of concurrent multi-access data structures especially in a shared memory setting. In this paper, we analyse and model the general execution behaviour of concurrent multi-access data structures in the shared memory setting. We study and analyse the behaviour of the two popular random access patterns: shared (Remote) and exclusive (Local) access, and the behaviour of the two most commonly used atomic primitives for designing lock-free data structures: Compare and Swap, and, Fetch and Add. We model the concurrent multi-accesses by splitting the thread execution procedure into five logical sessions: i) side-work, ii) access-point search iii) access-point acquisition, iv) access-point data acquisition and v) access-point data operation. We model the acquisition of an access-point, as a system of closed queuing networks with parallel servers, and data acquisition in terms of where the data is located within the memory system. We evaluate our model on a set of concurrent data structure designs including a counter, a stack and a FIFO queue. The evaluation is carried out on two state of the art multi-core processors: Intel Xeon Phi CPU 7290 with 72 physical cores and Intel Xeon E5-2695 with 14 physical cores. Our model is able to predict the throughput performance of the given concurrent data structures with 80% to 100% accuracy on both architectures.
暂无评论