We exhibit that, when given a classical Byzantine agreement protocol designed in the private-channel model, it is feasible to construct a quantum agreement protocol that can effectively handle a full-information adver...
详细信息
The proceedings contain 3 papers. The topics discussed include: a design framework for the simulation of distributed quantum computing;quantum mini-apps: a framework for developing and benchmarking quantum-HPC applica...
ISBN:
(纸本)9798400706431
The proceedings contain 3 papers. The topics discussed include: a design framework for the simulation of distributed quantum computing;quantum mini-apps: a framework for developing and benchmarking quantum-HPC applications;and demystifying HPC-quantum integration: it’s all about scheduling.
Processing large-scale graphs with billions to trillions of edges requires efficiently utilizing parallel systems. However, current graph processing engines do not scale well beyond a few tens of computing nodes becau...
详细信息
ISBN:
(纸本)9798400704352
Processing large-scale graphs with billions to trillions of edges requires efficiently utilizing parallel systems. However, current graph processing engines do not scale well beyond a few tens of computing nodes because they are oblivious to the communication cost variations across the interconnection hierarchy. We introduce GraphCube, a better approach to optimizing graph processing on large-scale parallel systems with complex interconnections. GraphCube features a new graph partitioning approach to achieve better load balancing and minimize communication overhead across multiple levels of the interconnection hierarchy. We evaluate GraphCube by applying it to fundamental graph operations performed on synthetic and real-world graph datasets. Our evaluation used up to 79,024 computing nodes and 1.2+ million processor cores. Our large-scale experiments show that GraphCube outperforms state-of-the-art parallel graph processing methods in throughput and scalability. Furthermore, GraphCube outperformed the top-ranked systems on the Graph 500 list.
In this paper we present a deterministic O( log logn)-round algorithm for the 2-ruling set problem in the Massively Parallel Computation (MPC) model with O-similar to (n) memory;this algorithm also runs in O( log logn...
详细信息
ISBN:
(纸本)9781450392624
In this paper we present a deterministic O( log logn)-round algorithm for the 2-ruling set problem in the Massively Parallel Computation (MPC) model with O-similar to (n) memory;this algorithm also runs in O( log logn) rounds in the Congested Clique model. This is exponentially faster than the fastest known deterministic 2-ruling set algorithm for these models, which is simply the Omicron(log Delta)-round deterministic Maximal Independent Set (MIS) algorithm due to Czumaj, Davies, and Parter (SPAA 2020). Our result is obtained by derandomizing the 2-ruling set algorithm of Kothapalli and Pemmaraju (FSTTCS 2012).
The proceedings contain 3 papers. The topics discussed include: mapping parallel matrix multiplication in GotoBLAS2 to the AMD versal ACAP for deep learning;exploring post quantum cryptography with quantum key distrib...
ISBN:
(纸本)9798400706448
The proceedings contain 3 papers. The topics discussed include: mapping parallel matrix multiplication in GotoBLAS2 to the AMD versal ACAP for deep learning;exploring post quantum cryptography with quantum key distribution for sustainable mobile network architecture design;and energy efficiency: a lattice Boltzmann study.
Programming asynchronous distributed systems is a challenging task in which consistency is often achieved by use of expensive coordination protocols like Paxos and 2PC. The CALM theorem, first conjectured by Hellerste...
详细信息
ISBN:
(纸本)9798400701276
Programming asynchronous distributed systems is a challenging task in which consistency is often achieved by use of expensive coordination protocols like Paxos and 2PC. The CALM theorem, first conjectured by Hellerstein, is one of the first results to challenge this practice by stating that a problem can have a consistent, coordination-free distributed implementation if (and only if) the problem is monotonic. This result was proven for queries and shown to extend beyond monotonic (yet monotonic-like) queries for data systems having specific knowledge about the partitioning of data over the network. In this article, we extend the latter results in several ways. We consider problems that can be modeled as mappings from distributed instances to distributed instances, enabling insights into a much broader range of problems than queries. Furthermore, our model can express arbitrary system configurations, allowing us to reason about the expressiveness of any particular distributed system and thereby revealing a nuanced gradient of problems with increasing coordination-needs. Finally, we apply our model to a recent question about the expressiveness of coordination-free queries, raised by Hellerstein and Alvaro.
We present an algorithm for distributed networks to efficiently find a small vertex cut in the CONGEST model. Given a positive integer k, our algorithm can, with high probability, either find.. vertices whose removal ...
详细信息
ISBN:
(纸本)9781450399135
We present an algorithm for distributed networks to efficiently find a small vertex cut in the CONGEST model. Given a positive integer k, our algorithm can, with high probability, either find.. vertices whose removal disconnects the network or return that such.. vertices do not exist. Our algorithm takes kappa(3) (O+ v..) rounds, where.. is the number of vertices in the network and.. denotes the network's diameter. This implies kappa(3) (O + v..) round complexity whenever kappa(3) = polylog(n). Prior to our result, a bound of (O) over tilde is known only when kappa = 1, 2 [Parter, Petruschka DISC'22]. For kappa(3) = 3, this bound can be obtained only by an.. (log..)-approximation algorithm [CensorHillel, Ghaffari, Kuhn PODC'14], and the only known exact algorithm takes.. O((kappa Delta D)(O kappa))) rounds, where. is the maximum degree [Parter DISC'19]. Our result answers an open problem by Nanongkai, Saranurak, and Yingchareonthawornchai [STOC'19].
In this article, we propose a distributed medium access control (MAC) protocol for the unmanned aerial vehicle (UAV) network with directional antennas. It uses a distributed learning algorithm to control each node'...
详细信息
ISBN:
(纸本)9798400701559
In this article, we propose a distributed medium access control (MAC) protocol for the unmanned aerial vehicle (UAV) network with directional antennas. It uses a distributed learning algorithm to control each node's communication parameters (such as sending rate) according to the observations of only 1-hop neighbors. Because of the asynchronous and decentralized nature of multi-agent deep distributed reinforcement learning (MADDRL), we adopt Multi Agent Deep Deterministic Policy Gradient (MADDPG) [3] to generate discrete actions for the MAC layer protocol. Under the individual and distributed observations of the environment, each node/agent can update their own states and rewards locally. Moreover, two major communication quality evaluation metrics - throughput (THR) and delay (DEL) are used to show the effectiveness and scalability of the distributed algorithm. From the results, the protocol using MADDPG shows more advantages than the general MAC schemes, including higher end-to-end THR, lower queuing DEL, and lower communication message exchange.
The proceedings contain 44 papers. The topics discussed include: FastFold: optimizing AlphaFold training and inference on GPU clusters;liger: interleaving intra- and inter-operator parallelism for distributed large mo...
ISBN:
(纸本)9798400704352
The proceedings contain 44 papers. The topics discussed include: FastFold: optimizing AlphaFold training and inference on GPU clusters;liger: interleaving intra- and inter-operator parallelism for distributed large model inference;optimizing collective communications with error-bounded lossy compression for GPU clusters;OsirisBFT: say no to task replication for scalable byzantine fault tolerant analytics;RELAX: durable data structures with swift recovery;a row decomposition-based approach for sparse matrix multiplication on GPUs;Tetris: accelerating sparse convolution by exploiting memory reuse on GPU;scaling up transactions with slower clocks;towards scalable unstructured mesh computations on shared memory many-cores;AGAThA: fast and efficient GPU acceleration of guided sequence alignment for long read mapping;and shared memory-contention-aware concurrent DNN execution for diversely heterogeneous system-on-chips.
Broadcast is a ubiquitous distributedcomputing problem that underpins many other system tasks. In static, connected networks, it was recently shown that broadcast is solvable without any node memory and only constant...
详细信息
暂无评论