Electric power computing resources are distributed across provincial data centers and transmission and transformation equipment in their respective regions, and the scheduling efficiency of these computing resources is limited by...
ISBN (Print): 9798350307924
In distributed training, deep neural networks (DNNs) are launched over multiple workers concurrently and aggregate their local updates at each step in bulk-synchronous parallel (BSP) training. However, BSP does not scale out linearly due to the high communication cost of aggregation. To mitigate this overhead, alternatives like Federated Averaging (FedAvg) and Stale-Synchronous Parallel (SSP) either reduce the synchronization frequency or eliminate it altogether, usually at the cost of lower final accuracy. In this paper, we present SelSync, a practical, low-overhead method for DNN training that dynamically chooses to incur or avoid communication at each step, either calling the aggregation op or applying local updates, based on their significance. We propose various optimizations as part of SelSync to improve convergence in the context of semi-synchronous training. Our system converges to the same or better accuracy than BSP while reducing training time by up to 14x.
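To make the selective-synchronization idea concrete, here is a minimal single-process Python sketch that alternates between an aggregated (all-reduce-style) step and purely local steps depending on how significant the current updates are. The significance measure (a gradient-to-parameter norm ratio), the threshold, and the toy quadratic gradient are illustrative assumptions, not the exact criterion SelSync uses.

```python
# Sketch: significance-gated synchronization in the spirit of SelSync.
# The metric and threshold below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
num_workers, dim, lr, threshold = 4, 1000, 0.1, 0.05

# Each worker holds its own copy of the parameters.
params = [rng.normal(size=dim) for _ in range(num_workers)]

def local_gradient(w):
    # Placeholder gradient of a quadratic loss plus noise, standing in for
    # a real mini-batch gradient.
    return w + rng.normal(scale=0.1, size=w.shape)

for step in range(100):
    grads = [local_gradient(w) for w in params]
    # Significance: relative size of the update w.r.t. the current parameters.
    significance = np.mean([np.linalg.norm(g) / (np.linalg.norm(w) + 1e-12)
                            for g, w in zip(grads, params)])
    if significance > threshold:
        # "Synchronous" step: aggregate (all-reduce average), then apply once.
        avg_grad = np.mean(grads, axis=0)
        avg_param = np.mean(params, axis=0) - lr * avg_grad
        params = [avg_param.copy() for _ in range(num_workers)]
    else:
        # "Local" step: skip communication, apply each worker's own update.
        params = [w - lr * g for w, g in zip(params, grads)]

print("final mean parameter norm:", np.linalg.norm(np.mean(params, axis=0)))
```

In a real deployment the branch taken when the threshold is exceeded would be a collective operation (e.g. an all-reduce across workers) rather than an in-memory average.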
ISBN (Print): 9798350369458; 9798350369441
The widespread utilization of Internet of Things (IoT) devices has resulted in an exponential increase in data at the Internet's edges. This trend, combined with the rapid growth of machine learning (ML) applications, necessitates the execution of learning tasks across the entire spectrum of computing resources - from the device, to the edge, to the cloud. This paper investigates the execution of machine learning algorithms within the edge-cloud continuum, focusing on their implications from a distributed computing perspective. We explore the integration of traditional ML algorithms, leveraging edge computing benefits such as low-latency processing and privacy preservation, along with cloud computing capabilities offering virtually limitless computational and storage resources. Our analysis offers insights into optimizing the execution of machine learning applications by decomposing them into smaller components and distributing these across processing nodes in edge-cloud architectures. By utilizing the Apache Spark framework, we define an efficient task allocation solution for distributing ML tasks across edge and cloud layers. Experiments on a clustering application in an edge-cloud setup confirm the effectiveness of our solution compared to highly centralized alternatives, in which cloud resources are extensively used for handling large volumes of data from IoT devices.
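The following PySpark sketch illustrates, under simplifying assumptions, how a clustering task might be split between an "edge" stage that pre-clusters a small local sample and a "cloud" stage that runs the full job. The synthetic data, sample fraction, and two-stage split are illustrative and do not reproduce the paper's actual task-allocation solution.

```python
# Sketch: edge-stage pre-clustering on a sample vs. cloud-stage full clustering.
from pyspark.sql import SparkSession
from pyspark.ml.clustering import KMeans
from pyspark.ml.feature import VectorAssembler

spark = SparkSession.builder.appName("edge-cloud-clustering").getOrCreate()

# Stand-in for IoT sensor readings arriving at the edge.
df = spark.createDataFrame(
    [(float(i % 10), float((i * 7) % 13)) for i in range(10_000)],
    ["x", "y"],
)
features = VectorAssembler(inputCols=["x", "y"], outputCol="features").transform(df)

# "Edge" stage: cluster a small sample locally to obtain cheap provisional centers.
edge_sample = features.sample(fraction=0.05, seed=42)
edge_model = KMeans(k=5, seed=1).fit(edge_sample)

# "Cloud" stage: refine on the full dataset with ample compute and storage.
cloud_model = KMeans(k=5, seed=1, maxIter=50).fit(features)

print("edge centers:", edge_model.clusterCenters())
print("cloud centers:", cloud_model.clusterCenters())
spark.stop()
```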
ISBN (Print): 9798350377477; 9798350377460
To enhance the high-reliability operation capability of reconfigurable battery energy storage systems, a distributed cooperative control method for reconfigurable battery energy storage based on consensus algorithms is proposed in this paper, which can improve the frequency and power characteristics of the system. Firstly, the mathematical model of virtual synchronous generator (VSG) control, the structure of the reconfigurable battery energy storage system, and the parallel-VSG distributed architecture are introduced. Secondly, combined with the mathematical model of VSG control, an overall scheme for frequency recovery and reasonable distribution of power output in the parallel VSG system is designed. Then, according to the structural characteristics of reconfigurable battery energy storage, a battery pack reconfiguration strategy is proposed, which can meet the power output demand through the free combination of battery packs. Finally, a simulation model is built in the Matlab/Simulink environment, and the simulation results verify the correctness and effectiveness of the proposed method.
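As a rough numerical illustration of consensus-based cooperation among parallel units, the sketch below drives the per-capacity power ratios of a few units to agreement over a ring communication topology. The topology, gains, and capacities are assumptions for illustration; the paper's actual VSG controller and frequency-recovery scheme are not reproduced.

```python
# Sketch: first-order consensus driving parallel units to share power in
# proportion to their capacities. Topology and parameters are illustrative.
import numpy as np

capacities = np.array([1.0, 2.0, 1.5, 0.5])   # per-unit ratings of 4 units
power = np.array([0.9, 0.9, 0.9, 0.9])        # initial (unfair) power outputs
epsilon = 0.2                                  # consensus step size

# Ring topology: each unit exchanges information with its two neighbours.
n = len(power)
neighbours = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

total = power.sum()                            # total demand to preserve
ratio = power / capacities                     # quantity driven to consensus

for _ in range(200):
    new_ratio = ratio.copy()
    for i in range(n):
        # Standard consensus update on the per-capacity power ratio.
        new_ratio[i] += epsilon * sum(ratio[j] - ratio[i] for j in neighbours[i])
    ratio = new_ratio

power = ratio * capacities
power *= total / power.sum()                   # keep total output unchanged
print("power shares:", np.round(power, 3))     # approx. proportional to capacities
```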
ISBN (Print): 9798350364613; 9798350364606
Many parallel and distributed computing research results are obtained in simulation, using simulators that mimic real-world executions on some target system. Each such simulator is configured by picking values for parameters that define the behavior of the underlying simulation models it implements. The main concern for a simulator is accuracy: simulated behaviors should be as close as possible to those observed in the real-world target system. This requires that values for each of the simulator's parameters be carefully picked, or "calibrated," based on ground-truth real-world executions. Examining the current state of the art shows that simulator calibration, at least in the field of parallel and distributed computing, is often undocumented (and thus perhaps often not performed) and, when documented, is described as a labor-intensive, manual process. In this work we evaluate the benefit of automating simulator calibration using simple algorithms. Specifically, we use a real-world case study from the field of High Energy Physics and compare automated calibration to calibration performed by a domain scientist. Our main finding is that automated calibration is on par with or significantly outperforms the calibration performed by the domain scientist. Furthermore, automated calibration makes it straightforward to operate desirable trade-offs between simulation accuracy and simulation speed.
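A minimal sketch of what such a simple automated calibration algorithm can look like is given below: a random search over two parameters of a toy simulator, minimizing the mean relative error against a handful of ground-truth runtimes. The toy simulation model, parameter ranges, and error metric are illustrative assumptions, not the case study's setup.

```python
# Sketch: automated simulator calibration via random search.
import random

ground_truth = [(100, 12.0), (200, 23.5), (400, 48.0)]  # (workload size, measured runtime)

def simulate(size, latency, bandwidth):
    # Toy simulation model: fixed per-task latency plus bandwidth-bound transfer.
    return size * latency + size / bandwidth

def calibration_error(latency, bandwidth):
    # Mean relative error of simulated vs. real runtimes over the ground truth.
    errs = [abs(simulate(s, latency, bandwidth) - t) / t for s, t in ground_truth]
    return sum(errs) / len(errs)

random.seed(0)
best = (float("inf"), None)
for _ in range(10_000):
    latency = random.uniform(0.0, 1.0)       # candidate per-task latency (s)
    bandwidth = random.uniform(1.0, 100.0)   # candidate bandwidth (units/s)
    err = calibration_error(latency, bandwidth)
    if err < best[0]:
        best = (err, (latency, bandwidth))

print(f"best error {best[0]:.3%} with params {best[1]}")
```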
Large power transformers are typically oil-immersed; the insulating oil provides both heat dissipation and insulation. By analyzing the gas composition in the oil, transformer failures can be pre...
The rapid integration of digital technologies into power grids has transformed traditional grids into smart grids, increasing interdependence between cyber and physical components. This interconnection introduces vuln...
ISBN (Print): 9798350366105; 9798350366099
Energy data, especially power data, has the advantages of covering a wide range of industries, high value density, and good accuracy; relying on energy big data, all kinds of terminal production and life activities can be broadly characterized. However, the privacy computing platforms of various vendors often adopted different technical architectures and algorithm protocols during their early implementation, so the implementations of the various platforms differ greatly, which hinders direct interconnection between the heterogeneous privacy computing platforms of different organizations. Aiming at the interoperability needs and problems of energy big data privacy computing platforms with different technical architectures, this paper decouples the management plane and data plane of the privacy computing platform and, in accordance with the principle of business priority, proposes a new privacy computing interconnection model of "unified business collaboration component + flexibly configured algorithm engine" suitable for energy big data. Through the design of the east-west and north-south interface framework of the privacy computing platform and the security access principle of the algorithm engine, whole-process control, cross-platform interconnection, and internal and external network penetration modeling and analysis of energy big data collaboration with external parties are realized, meeting the key data security protection requirements of the State Grid, such as distributed computing and storage of data inside and outside the platform. This is of great significance for enabling energy big data to empower external institutions such as governments, banks, and operators.
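The sketch below illustrates, with hypothetical class and method names, the decoupling the model calls for: a management-plane collaboration component that orchestrates jobs independently of which data-plane algorithm engine each participant has configured. It is not an actual platform API.

```python
# Sketch: "unified business collaboration component + flexibly configured
# algorithm engine". All names here are hypothetical illustrations.
from abc import ABC, abstractmethod

class AlgorithmEngine(ABC):
    """Data-plane engine: executes a privacy-computing protocol on local data."""
    @abstractmethod
    def run(self, task: dict) -> dict: ...

class MPCEngine(AlgorithmEngine):
    def run(self, task: dict) -> dict:
        # Placeholder for a secure multi-party computation protocol.
        return {"task": task["name"], "engine": "mpc", "status": "done"}

class FederatedEngine(AlgorithmEngine):
    def run(self, task: dict) -> dict:
        # Placeholder for a federated-style aggregation protocol.
        return {"task": task["name"], "engine": "federated", "status": "done"}

class CollaborationComponent:
    """Management-plane component: unified job orchestration across platforms,
    independent of which algorithm engine each participant has configured."""
    def __init__(self, engines: dict[str, AlgorithmEngine]):
        self.engines = engines

    def submit(self, task: dict) -> dict:
        engine = self.engines[task["protocol"]]   # flexible engine selection
        return engine.run(task)

hub = CollaborationComponent({"mpc": MPCEngine(), "federated": FederatedEngine()})
print(hub.submit({"name": "joint-load-forecast", "protocol": "federated"}))
```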
ISBN (Digital): 9798350352917
ISBN (Print): 9798350352924; 9798350352917
Influence maximization (IM) is the problem of finding the k most influential nodes in a graph. We propose distributed-memory parallel algorithms for the two main kernels of a state-of-the-art implementation of one IM algorithm, influence maximization via martingales (IMM). The baseline relies on a bulk-synchronous parallel approach and uses replication to reduce communication and achieve approximate load balance, at the cost of synchronization and high memory requirements. By contrast, our method fully distributes the data, thereby improving memory scalability, and uses fine-grained asynchronous parallelism to improve network utilization and amortize the cost of the additional communication. We show our design and implementation can achieve up to 29.6x speedup over the MPI-based state of the art on synthetic and real-world network graphs. Moreover, ours is the first implementation that can run IMM to find influencers in the 'twitter' graph (41M nodes and 1.4B edges) in 200 seconds using 8K CPU cores of the NERSC Perlmutter supercomputer.
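For readers unfamiliar with IMM, the following single-node Python sketch shows its two kernels, reverse-reachable (RR) set sampling and greedy seed selection, on a toy graph; the distributed, asynchronous implementation described above partitions this work across MPI ranks. The graph, edge probability, and RR-set count are illustrative assumptions.

```python
# Sketch: the two IMM kernels (RR-set sampling, greedy coverage) on a toy graph.
import random
from collections import defaultdict

random.seed(0)
# Toy directed graph as reverse adjacency: node -> list of in-neighbours.
in_nbrs = {0: [1, 2], 1: [2, 3], 2: [3], 3: [], 4: [0, 3]}
p = 0.5              # independent-cascade edge probability
num_rr_sets = 2000
k = 2                # number of seeds to select

def sample_rr_set():
    # Backward BFS from a random root, keeping each in-edge with probability p.
    root = random.choice(list(in_nbrs))
    visited, frontier = {root}, [root]
    while frontier:
        nxt = []
        for v in frontier:
            for u in in_nbrs[v]:
                if u not in visited and random.random() < p:
                    visited.add(u)
                    nxt.append(u)
        frontier = nxt
    return visited

rr_sets = [sample_rr_set() for _ in range(num_rr_sets)]

# Greedy kernel: repeatedly pick the node covering the most uncovered RR sets.
covered, seeds = set(), []
for _ in range(k):
    counts = defaultdict(int)
    for idx, rr in enumerate(rr_sets):
        if idx not in covered:
            for v in rr:
                counts[v] += 1
    if not counts:       # every RR set already covered
        break
    best = max(counts, key=counts.get)
    seeds.append(best)
    covered |= {idx for idx, rr in enumerate(rr_sets) if best in rr}

print("selected seeds:", seeds)
```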
ISBN (Digital): 9798350352917
ISBN (Print): 9798350352924; 9798350352917
Multiplying two sparse matrices (SpGEMM) is a common computational primitive used in many areas including graph algorithms, bioinformatics, algebraic multigrid solvers, and randomized sketching. Distributed-memory parallel algorithms for SpGEMM have mainly focused on sparsity-oblivious approaches that use 2D and 3D partitioning. Sparsity-aware 1D algorithms can theoretically reduce communication by not fetching nonzeros of the sparse matrices that do not participate in the multiplication. Here, we present a distributed-memory 1D SpGEMM algorithm and implementation. It uses MPI RDMA operations to mitigate the cost of packing/unpacking submatrices for communication, and it uses a block fetching strategy to avoid excessive fine-grained messaging. Our results show that our 1D implementation outperforms state-of-the-art 2D and 3D implementations within CombBLAS for many configurations, inputs, and use cases, while remaining conceptually simpler.
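The sparsity-aware 1D idea can be illustrated with the following single-process SciPy sketch, in which each simulated owner of a block of rows of A "fetches" only the rows of B that its nonzero column indices actually reference. The MPI RDMA transport and the block-fetching policy of the actual implementation are not modelled here.

```python
# Sketch: sparsity-aware 1D SpGEMM, simulated on one process.
import numpy as np
import scipy.sparse as sp

n, nprocs = 1000, 4
A = sp.random(n, n, density=0.002, format="csr", random_state=1)
B = sp.random(n, n, density=0.002, format="csr", random_state=2)

row_blocks = np.array_split(np.arange(n), nprocs)   # 1D row partition of A (and C)
C_blocks = []
for rows in row_blocks:
    A_local = A[rows, :]                        # rows of A owned by this "process"
    needed = np.unique(A_local.indices)         # rows of B its nonzeros reference
    # "Fetch" only the needed rows of B. Columns of A_local outside `needed`
    # are entirely zero, so restricting both factors preserves the product.
    C_blocks.append(A_local[:, needed] @ B[needed, :])
    print(f"rows of B fetched: {needed.size} / {n}")

C = sp.vstack(C_blocks, format="csr")
assert np.allclose((A @ B).toarray(), C.toarray())  # matches the unpartitioned product
```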