ISBN (print): 9781595939890
The proceedings contain 68 papers. The topics discussed include: distributed computation of the mode; sublogarithmic distributed MIS algorithm for sparse graphs using Nash-Williams decomposition; a log-star distributed maximal independent set algorithm for growth-bounded graphs; a jamming-resistant MAC protocol for single-hop wireless networks; failure detectors in loosely named systems; every problem has a weakest failure detector; sharing is harder than agreeing; collaborative enforcement of firewall policies in virtual private networks; secure communication over radio channels; on the tradeoff between network connectivity, phase complexity and communication complexity of reliable communication tolerating mixed adversary; distributed algorithms for ultrasparse spanners and linear size skeletons; and efficient distributed approximation algorithms via probabilistic tree embeddings.
Bounded timestamping systems (Israeli and Li in Proceedings of the 28th annual IEEE Symposium on Foundations of Computer Science (FOCS), pp 371-382, 1987; Dolev and Shavit in SIAM J Comput 26(2):418-455, 1997) allow a temporal ordering of events in executions of concurrent algorithms. They are a fundamental and well-studied building block used in many shared-memory algorithms (Haldar and Vitányi in J ACM 49(1):101-126, 2002; Afek et al. in ACM Trans Program Lang Syst 16:939-953, 1994; Abrahamson in Proceedings of the 7th ACM Symposium on Principles of Distributed Computing (PODC), pp 291-302, 1988; Bashari and Woelfel in Proceedings of the 40th ACM Symposium on Principles of Distributed Computing (PODC), pp 545-555, 2021). A concurrent bounded timestamping system keeps track of m timestamps, where m is usually greater than or equal to the number of processes in the system, n. A process may, at any point, obtain a new timestamp, and later determine a total order of all processes' most recent timestamps. Known bounded timestamping algorithms (Dolev and Shavit in SIAM J Comput 26(2):418-455, 1997; Dwork and Waarts in J ACM 46(5):633-666, 1999; Dwork et al. in SIAM J Comput 28(5):1848-1874, 1999; Gawlick et al. in Theory of Computing and Systems (ISTCS), pp 171-183, 1992; Israeli and Pinhasov in Distributed Algorithms, pp 95-109, 1992; Haldar and Vitányi in J ACM 49(1):101-126, 2002) do not scale well in the number of processes, as getting a new timestamp takes at least Ω(n) steps. Moreover, a lower bound by Israeli and Li (Proceedings of the 28th annual IEEE Symposium on Foundations of Computer Science (FOCS), pp 371-382, 1987) implies that timestamps need to be represented by Ω(m) bits.
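For intuition, the interface such a system exposes can be sketched with a trivially unbounded implementation backed by a shared counter (a toy model for illustration only, not one of the bounded constructions cited above; bounded algorithms must recycle labels, which is where the Ω(n) step and Ω(m) size costs arise):

    import itertools

    class UnboundedTimestamps:
        """Toy timestamping object for n processes: take a new timestamp,
        then totally order every process's most recent one. Labels grow
        without bound, which is exactly what bounded systems avoid."""

        def __init__(self, n):
            self.counter = itertools.count(1)  # fresh, ever-growing labels
            self.latest = [0] * n              # latest[i] = newest label of process i

        def get_timestamp(self, pid):
            label = next(self.counter)         # O(1) here; bounded systems pay Omega(n)
            self.latest[pid] = label
            return label

        def scan_order(self):
            # Total order of all processes' most recent timestamps.
            return sorted(range(len(self.latest)), key=lambda i: self.latest[i])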
ISBN (print): 9798400714436
The carbon and water footprint of large-scale computing systems poses serious environmental sustainability risks. In this study, we discover that, unfortunately, carbon and water sustainability are at odds with each other: optimizing one alone hurts the other. To address this tension, we introduce WaterWise, a novel job scheduler for parallel workloads that intelligently co-optimizes carbon and water footprint to improve the sustainability of geographically distributed data centers.
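The co-optimization idea can be illustrated with a toy placement rule (a hypothetical sketch, not WaterWise's actual policy; the site data and the weight alpha are invented for illustration):

    def pick_site(sites, alpha=0.5):
        """Choose the datacenter minimizing a weighted sum of normalized
        carbon and water intensity; alpha trades one footprint against the
        other, since optimizing either alone hurts the other."""
        max_c = max(s["carbon"] for s in sites)
        max_w = max(s["water"] for s in sites)
        score = lambda s: alpha * s["carbon"] / max_c + (1 - alpha) * s["water"] / max_w
        return min(sites, key=score)

    # Hypothetical grid-intensity numbers, for illustration only.
    sites = [{"name": "us-east", "carbon": 420.0, "water": 1.8},
             {"name": "eu-north", "carbon": 110.0, "water": 3.1}]
    print(pick_site(sites)["name"])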
ISBN (print): 9798400714436
Deterministic parallelism is a key building block for distributed and fault-tolerant systems that offers substantial performance benefits while guaranteeing determinism. By studying existing deterministically parallel systems (DPS), we identify certain design pitfalls, such as batched execution and inefficient runtime synchronization, that preclude them from meeting the demands of μs-scale and high-throughput distributed systems deployed in modern datacenters. We present DORADD, a deterministically parallel runtime with low latency and high throughput, designed for modern datacenter services. DORADD introduces a hybrid scheduling scheme that effectively decouples request dispatching from execution. It employs a single dispatcher to deterministically construct a dynamic dependency graph of incoming requests, and worker pools that can independently execute requests in a work-conserving and synchronization-free manner. Furthermore, DORADD overcomes the single-dispatcher throughput bottleneck via core pipelining. We use DORADD to build an in-memory database and compare it with Caracal, the current state-of-the-art deterministic database, via the YCSB and TPC-C benchmarks. Our evaluation shows up to 2.5x better throughput and more than 150x and 300x better tail latency in non-contended and contended cases, respectively. We also compare DORADD with Caladan, the state-of-the-art non-deterministic remote procedure call (RPC) scheduler, and demonstrate that determinism in DORADD does not incur any performance overhead.
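The dispatch/execution split can be sketched in miniature as follows (a simplified single-threaded model, assuming each request declares the keys it touches; the real runtime executes the graph with synchronization-free worker pools and pipelined dispatchers):

    from collections import defaultdict

    def dispatch(requests):
        """Deterministically build a dependency graph: request j depends on
        the most recent earlier request that touched any of the same keys."""
        last = {}                          # key -> index of last request touching it
        deps = defaultdict(set)
        for j, req in enumerate(requests):
            for key in req["keys"]:
                if key in last:
                    deps[j].add(last[key])
                last[key] = j
        return deps

    def execute(requests, deps):
        """Run any request whose dependencies have finished (simulated here
        with a simple topological sweep)."""
        done, order = set(), []
        while len(done) < len(requests):
            for j in range(len(requests)):
                if j not in done and deps[j] <= done:
                    done.add(j)
                    order.append(j)
        return order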
ISBN (print): 9798400714436
Training large language models (LLMs) has become increasingly expensive due to the rapid expansion in model size. Pipeline parallelism (PP) is a widely used distributed training technique. However, as LLMs with larger contexts become prevalent and memory optimization techniques advance, traditional PP methods encounter greater communication challenges due to the increased size of activations and their gradients. To address this issue, we introduce weight-pipeline parallelism (WeiPipe), which transitions from an activation-passing pipeline to a weight-passing pipeline. WeiPipe reduces communication costs and achieves more balanced utilization by transmitting only weights and their gradients between workers in a pipeline manner. WeiPipe does not rely on collective communication primitives, thus ensuring scalability. We present four variations of WeiPipe parallelism, including WeiPipe-Interleave, which emphasizes communication efficiency, and WeiPipe-zero-bubble, which explores the potential for minimal bubble ratios. Our implementation of WeiPipe-Interleave, performed on up to 32 GPUs and tested in various model configurations, including large-context LLM training, demonstrates a significant improvement in throughput compared with state-of-the-art pipeline parallelism and fully sharded data parallelism on different underlying infrastructures, including NVLink within a cluster with Ethernet across clusters, and PCIe within a cluster with Ethernet across clusters. Additionally, WeiPipe shows greater scalability in communication-constrained scenarios compared with state-of-the-art strategies.
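The contrast with activation passing can be sketched with a toy forward pass in which stage weights, rather than activations, move one hop per step (a hypothetical illustration; the actual WeiPipe variants also circulate weight gradients and overlap communication with compute):

    def weipipe_forward(microbatches, stages):
        """Toy weight-passing forward pass: worker w keeps microbatch w's
        activations local; the weights of stage s reach worker w at step
        w + s, so the only traffic is stage weights moving hop by hop."""
        n, S = len(microbatches), len(stages)
        acts = list(microbatches)
        for t in range(n + S - 1):         # pipeline fill and drain
            for w in range(n):
                s = t - w                  # stage currently held by worker w
                if 0 <= s < S:
                    acts[w] = stages[s](acts[w])
        return acts

    # Each microbatch passes through both stages in order: prints [4, 6, 8].
    print(weipipe_forward([1, 2, 3], [lambda x: x + 1, lambda x: x * 2]))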
ISBN (print): 9798400714436
There are several strategies to parallelize graph neural network (GNN) training over multiple GPUs. We observe that there is no consistent winner (i.e., one with the shortest running time), and the optimal strategy depends on the graph dataset, GNN model, training algorithm, and hardware configuration. As such, we design the APT system to automatically select efficient parallelization strategies for GNN training tasks. To this end, we analyze the trade-offs of the strategies and design simple yet effective cost models to compare their execution time and facilitate strategy selection. Moreover, we propose a general abstraction of the strategies, which allows us to implement a unified execution engine that can be configured to run different strategies. Our experiments show that APT usually chooses the optimal or a close-to-optimal strategy, and the training time can be reduced by over 2x compared with always using a single strategy. APT is open-source at https://***/kaihaoma/APT.
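Cost-model-driven selection of this kind can be sketched as follows (a hypothetical illustration; the strategy names, cost terms, and numbers are invented and are not APT's actual cost models):

    def choose_strategy(task, cost_models):
        """Pick the parallelization strategy with the lowest estimated
        epoch time under simple analytical cost models."""
        est = {name: model(task) for name, model in cost_models.items()}
        return min(est, key=est.get), est

    # Invented models: a compute term plus a communication term.
    cost_models = {
        "graph_parallel": lambda t: t["compute"] / t["gpus"] + t["halo_bytes"] / t["bw"],
        "data_parallel":  lambda t: t["compute"] / t["gpus"] + t["feature_bytes"] / t["bw"],
    }
    task = {"compute": 1e12, "gpus": 8, "halo_bytes": 4e9, "feature_bytes": 9e9, "bw": 1e10}
    print(choose_strategy(task, cost_models)[0])   # -> "graph_parallel"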
ISBN (print): 9798400714436
Online GNN inference has been widely explored by applications such as online recommendation and financial fraud detection systems, where even minor delays can result in significant financial impact. Real-time dynamic graph sampling enables online GNN inference to reflect the latest updates in real-world graphs. However, online GNN inference typically demands millisecond-level latency Service Level Objectives (SLOs) as its performance guarantee, which poses great challenges for existing dynamic graph sampling approaches based on graph databases. The issues mainly arise from two aspects: long tail latency due to imbalanced, data-dependent sampling, and large communication overhead incurred by distributed sampling. To address these issues, we propose Helios, an efficient distributed dynamic graph sampling service that meets stringent latency SLOs. The key ideas of Helios are 1) pre-sampling the dynamic graph in an event-driven manner, and 2) maintaining a query-aware sample cache to build complete K-hop sampling results locally for inference requests. Experiments on multiple datasets show that Helios achieves up to 67x higher serving throughput and up to 32x lower P99 query latency compared to baselines.
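Event-driven pre-sampling of this kind can be sketched with per-vertex reservoirs (a toy single-node model; the class name and fanout parameter are illustrative, and the real service shards this state and layers a query-aware cache on top):

    import random
    from collections import defaultdict

    class PreSampler:
        """On every edge event, keep a fixed-size uniform reservoir of
        neighbors per vertex, so a K-hop sample is assembled locally at
        query time instead of being sampled on demand."""

        def __init__(self, fanout=10):
            self.fanout = fanout
            self.reservoir = defaultdict(list)
            self.seen = defaultdict(int)

        def on_edge(self, u, v):           # called for each graph update event
            self.seen[u] += 1
            r = self.reservoir[u]
            if len(r) < self.fanout:
                r.append(v)
            else:                          # classic reservoir sampling step
                j = random.randrange(self.seen[u])
                if j < self.fanout:
                    r[j] = v

        def k_hop(self, root, k):
            """Assemble a K-hop sample from the pre-computed reservoirs."""
            frontier, sample = [root], {root}
            for _ in range(k):
                frontier = [v for u in frontier for v in self.reservoir[u]]
                sample.update(frontier)
            return sample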
This conference proceedings contains 27 papers. The main subjects are shared-memory multiprocessors, achievement of causal consistency, semantics of distributed services, storage management, message-passing systems, clock synchronization algorithms, high-speed network control, analysis of communication protocols, reasoning about probabilistic algorithms, and totally asynchronous systems.
ISBN (print): 0897911431
This conference proceedings contains 27 papers. The main topics discussed are: distributed computer systems, computer network protocols, distributed database systems, and some aspects of automata theory. Some general applications of concurrent programming, as well as deadlock detection and resolution, are also presented.