This paper presents a new algorithm implementing the Omega failure detector in the crash-recovery model. Contrary to previously proposed algorithms, this algorithm does not rely on the use of stable storage and is communication-efficient, i.e., eventually only one process (the elected leader) keeps sending messages. The algorithm relies on a nondecreasing local clock associated with each process. Since stable storage is not used to keep the identity of the leader in order to read it upon recovery, unstable processes, i.e., those that crash and recover infinitely often, output a special ⊥ (bottom) value upon recovery, and then agree with correct processes on the leader after receiving a first message from it. (C) 2009 Elsevier B.V. All rights reserved.
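The recovery behavior described in the abstract can be sketched as follows. This is a minimal illustrative model, not the paper's algorithm: the `Process` class, `BOTTOM` constant, and the timeout-based demotion rule are all assumptions; only the "output ⊥ until the first leader message arrives" behavior comes from the abstract.

```python
BOTTOM = None  # stands for the special "no leader yet" output (⊥)

class Process:
    """Toy model of one process's Omega output (illustrative names only)."""

    def __init__(self, pid, timeout=2.0):
        self.pid = pid
        self.timeout = timeout
        self.leader = BOTTOM       # upon (re)start: output ⊥
        self.last_heartbeat = {}   # candidate -> local-clock time of last message

    def on_leader_message(self, sender, now):
        # The first message from the (eventual) leader fixes our output,
        # bringing us into agreement with the correct processes.
        self.last_heartbeat[sender] = now
        self.leader = sender

    def check(self, now):
        # Demote a leader whose heartbeats stopped arriving.
        if self.leader is not BOTTOM:
            if now - self.last_heartbeat[self.leader] > self.timeout:
                self.leader = BOTTOM

p = Process(pid=3)
assert p.leader is BOTTOM          # recovered process outputs ⊥ ...
p.on_leader_message(sender=1, now=0.0)
assert p.leader == 1               # ... and adopts the leader on first contact
p.check(now=5.0)                   # heartbeats stopped: back to ⊥
assert p.leader is BOTTOM
```

Note that communication efficiency comes from only the elected leader sending these heartbeats; the sketch shows only the receiver side.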
The proliferation of massive datasets has led to significant interest in distributed algorithms for solving large-scale machine learning problems. However, the communication overhead is a major bottleneck that hampers the scalability of distributed machine learning systems. In this paper, we design two communication-efficient algorithms for distributed learning. The first one is named EF-SIGNGD, in which we use the 1-bit (sign-based) gradient quantization method to save the communication cost. Moreover, the error feedback technique, i.e., incorporating the error made by the compression operator into the next step, is employed for the convergence guarantee. The second algorithm is called LE-SIGNGD, in which we introduce a well-designed lazy gradient aggregation rule to EF-SIGNGD that can detect gradients with small changes and reuse the outdated ones. LE-SIGNGD saves communication costs in both transmitted bits and communication rounds. Moreover, we show that LE-SIGNGD is convergent under some mild assumptions. The effectiveness of the two proposed algorithms is demonstrated through experiments on both real and synthetic data.
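The error-feedback rule mentioned above can be sketched in a few lines. This is a generic illustration of 1-bit compression with error feedback, assuming a sign compressor rescaled by the mean magnitude; the function name, step size, and scaling choice are illustrative, not taken from the paper.

```python
import numpy as np

def ef_sign_step(grad, error, lr=0.1):
    """One error-feedback step with 1-bit (sign) compression.

    Illustrative sketch: the transmitted message is sign(p) scaled by the
    mean magnitude of p, where p = lr * grad + error; the error made by
    the compressor is carried forward into the next step.
    """
    p = lr * grad + error                         # error-corrected update
    compressed = np.mean(np.abs(p)) * np.sign(p)  # 1-bit message (+ one scalar)
    new_error = p - compressed                    # feedback for the next round
    return compressed, new_error

grad = np.array([0.3, -2.0, 0.05])
update, err = ef_sign_step(grad, error=np.zeros(3))
# The residual is exactly what the compressor lost this round:
assert np.allclose(update + err, 0.1 * grad)
```

Because the residual `err` is added back in the next call, no gradient information is permanently discarded, which is what underpins the convergence guarantee.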
This work addresses the leader election problem in partially synchronous distributed systems where processes can crash and recover. More precisely, it focuses on implementing the Omega failure detector class, which provides a leader election functionality, in the crash-recovery failure model. The concepts of communication efficiency and near-efficiency for an algorithm implementing Omega are defined. Depending on whether or not stable storage is used, the property satisfied by unstable processes, i.e., those that crash and recover infinitely often, varies. Two algorithms implementing Omega are presented. In the first algorithm, which is communication-efficient and uses stable storage, unstable processes eventually and permanently agree on the leader with correct processes. In the second algorithm, which is near-communication-efficient and does not use stable storage, processes start their execution with no leader in order to avoid disagreement among unstable processes, which then agree on the leader with correct processes after receiving a first message from the leader. (C) 2011 Elsevier Inc. All rights reserved.
Multi-UAV passive localization via received signal strength (RSS) is extremely important for wide applications such as rescue and battlefield combat. However, the energy consumption of UAVs is a key issue in such UAV-enabled applications, and the communication overhead typically plays an important role in the energy consumption. To address this problem, we design two distributed methods for this multi-UAV system that achieve considerable performance under low communication overhead. Firstly, a distributed majorize-minimization (DMM) method is proposed. To accelerate its convergence, a tight upper bound of the original objective function is derived. Furthermore, a distributed estimation scheme using the Fisher information matrix (DEF) is presented, requiring only one round of communication between edge UAVs and the central UAV. Simulation results show that the proposed DMM outperforms the existing distributed iterative methods in terms of root mean square error (RMSE) under low communication overhead. Moreover, the most communication-efficient scheme, DEF with local search estimation, performs much better than the proposed DMM in terms of RMSE, but has a higher computational complexity.
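The majorize-minimization principle behind DMM can be illustrated on a scalar toy problem. This is a generic MM demo, not the paper's distributed DMM or its RSS objective: we minimize f(x) = Σ|x − aᵢ| by repeatedly minimizing a quadratic surrogate that upper-bounds f and touches it at the current iterate, so the objective never increases.

```python
import numpy as np

def mm_median(a, x0, iters=200, eps=1e-9):
    """Illustrative MM: minimize f(x) = sum_i |x - a_i| by repeatedly
    minimizing the quadratic majorizer
        g(x | x_t) = sum_i (x - a_i)^2 / (2|x_t - a_i|) + const,
    which equals f at x_t and upper-bounds it elsewhere."""
    x = x0
    for _ in range(iters):
        w = 1.0 / np.maximum(np.abs(x - a), eps)  # majorizer weights
        x = np.sum(w * a) / np.sum(w)             # closed-form minimizer of g
    return x

a = np.array([1.0, 2.0, 10.0])
f = lambda x: np.sum(np.abs(x - a))
x_star = mm_median(a, x0=5.0)
assert f(x_star) <= f(5.0)          # MM descent: objective never increases
assert abs(x_star - 2.0) < 0.1      # approaches the minimizer (the median)
```

The "tight upper bound" mentioned in the abstract plays the role of g here: the tighter the surrogate hugs the objective, the fewer MM iterations (and hence communication rounds) are needed.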
ISBN (print): 9781728168760
There has been surprisingly little work on algorithms for sorting strings on distributed-memory parallel machines. We develop efficient algorithms for this problem based on the multi-way merging principle. These algorithms inspect only the characters that are needed to determine the sorting order. Moreover, communication volume is reduced by also communicating (roughly) only those characters and by communicating repetitions of the same prefixes only once. Experiments on up to 1280 cores reveal that these algorithms are often more than five times faster than previous algorithms.
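The multi-way merging principle can be sketched with the standard library: each processor contributes a locally sorted run of strings, and a k-way merge produces the global order. This sketch omits what makes the paper's algorithms efficient, namely using common-prefix (LCP) information to avoid re-inspecting shared characters and to compress communicated prefixes.

```python
import heapq

# Three locally sorted runs, as if produced by three processors.
runs = [
    ["apple", "apricot", "banana"],
    ["ant", "apex", "cherry"],
    ["aardvark", "zebra"],
]

# Plain k-way merge of the sorted runs into the global order.
merged = list(heapq.merge(*runs))
assert merged == sorted(sum(runs, []))
```

In the plain merge above, comparing "apple" and "apricot" re-reads the shared prefix "ap" on every comparison; the character-skipping techniques in the paper exist precisely to avoid that repeated work.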
ISBN (print): 9798400704161
String sorting is an important part of tasks such as building index data structures. Unfortunately, current string sorting algorithms do not scale to massively parallel distributed-memory machines, since they either have latency (at least) proportional to the number of processors p, or communicate the data a large number of times (at least logarithmic). We present practical and efficient algorithms for distributed-memory string sorting that scale to large p. Similar to state-of-the-art sorters for atomic objects, the algorithms have latency of about p^(1/k) when allowing the data to be communicated k times. Experiments show good scaling behavior on a wide range of inputs on up to 49,152 cores. We achieve speedups of up to 5 over the current state-of-the-art distributed string sorting algorithms.
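The latency/rounds trade-off stated above is easy to make concrete numerically: with p processors and k data exchanges, the per-level fan-out (and hence the latency term) drops to about p^(1/k).

```python
p = 49_152  # core count used in the experiments

# Allowing the data to be communicated k times reduces the
# latency term from p down to about p**(1/k).
for k in (1, 2, 3):
    fanout = round(p ** (1 / k))
    print(f"k={k}: latency term ~ p^(1/{k}) = {fanout}")
```

For p = 49,152 this gives roughly 49,152, 222, and 37 for k = 1, 2, and 3, which is why even two or three communication passes already make the latency term negligible at this scale.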
ISBN:
(纸本)9781665497473
Federated learning (FL) enables training models at different sites and updating the weights from the training instead of transferring data to a central location and training as in classical machine learning. The FL capability is especially important to domains such as biomedicine and smart grid, where data may not be shared freely or stored at a central location because of policy regulations. Thanks to the capability of learning from decentralized datasets, FL is now a rapidly growing research field, and numerous FL frameworks have been developed. In this work we introduce APPFL, the Argonne Privacy-Preserving Federated Learning framework. APPFL allows users to leverage implemented privacy-preserving algorithms, implement new algorithms, and simulate and deploy various FL algorithms with privacy-preserving techniques. The modular framework enables users to customize the components for algorithms, privacy, communication protocols, neural network models, and user data. We also present a new communication-efficient algorithm based on an inexact alternating direction method of multipliers. The algorithm requires significantly less communication between the server and the clients than does the current state of the art. We demonstrate the computational capabilities of APPFL, including differentially private FL on various test datasets and its scalability, by using multiple algorithms and datasets on different computing environments.
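The flavor of an inexact ADMM round for federated learning can be sketched on a toy problem. This is a generic consensus-ADMM sketch under simplifying assumptions (quadratic local losses, a few gradient steps as the inexact local solve), not APPFL's actual algorithm; all names and constants are illustrative.

```python
import numpy as np

# Toy federated problem: client i holds a private target c_i with local
# loss f_i(w) = ||w - c_i||^2; the global optimum is the mean of the c_i.
targets = [np.array([1.0, 0.0]), np.array([3.0, 2.0]), np.array([2.0, -2.0])]

rho, lr = 1.0, 0.1
w = np.zeros(2)                               # global (server) model
duals = [np.zeros(2) for _ in targets]        # one dual variable per client

for _ in range(200):                          # communication rounds
    local_models = []
    for i, c in enumerate(targets):
        z = w.copy()
        for _ in range(5):                    # inexact local solve: a few GD steps
            grad = 2 * (z - c) + duals[i] + rho * (z - w)
            z -= lr * grad
        local_models.append(z)
    # Server aggregates local models plus scaled duals; duals then ascend.
    w = np.mean([z + u / rho for z, u in zip(local_models, duals)], axis=0)
    for i, z in enumerate(local_models):
        duals[i] += rho * (z - w)

# Converges to the global optimum, the mean of the client targets: (2, 0).
assert np.allclose(w, [2.0, 0.0], atol=1e-3)
```

The communication saving comes from the inner loop: each client refines its model locally for several steps per round, so far fewer server round-trips are needed than with one gradient exchange per step.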