The virtual power plant is an important part of the new power system. Its IoT components need to collect and process measurement data from energy equipment in real time, which inevitably relies on stream computing. This paper i...
ISBN (Print): 9798350333886
This paper aims to present a multi-tiered approach to designing learning experiences in HPC for undergraduate students that significantly reinforce comprehension of CS topics while working with new concepts in parallel and distributed computing. The paper will detail the experience of students working on the design, construction, and testing of a computing cluster, including budgeting, hardware purchase and setup, software installation and configuration, interconnection networks, communication, benchmarking, and running parallel code using MPI and OpenMP. The case study of building a relatively low-cost, small-scale computing cluster, which can be used as a template for CS senior projects or independent studies, also yielded an opportunity to involve students in the creation of teaching tools for parallel computing at many levels of the CS curriculum.
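As a rough illustration of the kind of parallel code students might run on such a cluster, here is a minimal mpi4py sketch that approximates pi by splitting a numerical integration across MPI ranks. The script name, interval count, and decomposition are illustrative assumptions, not material from the paper.

```python
# pi_mpi.py - illustrative sketch only, not code from the paper.
# Approximates pi by distributing a midpoint-rule integration of
# 4/(1+x^2) over [0,1] across the MPI ranks, then reducing the sums.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

n = 10_000_000                        # total number of intervals (assumed)
h = 1.0 / n
local_sum = 0.0
for i in range(rank, n, size):        # each rank takes a strided slice of intervals
    x = h * (i + 0.5)
    local_sum += 4.0 / (1.0 + x * x)

pi = comm.reduce(local_sum * h, op=MPI.SUM, root=0)
if rank == 0:
    print(f"pi ~= {pi:.12f}")
```

A run such as `mpirun -np 4 python pi_mpi.py` would exercise the interconnect and MPI setup the students benchmark.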
This paper proposes a Q-learning-based task allocation approach for wireless coded distributed computing systems with heterogeneous worker nodes. Task allocation in such systems is challenging due to the heterogeneity...
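The entry is truncated and does not give the paper's state, action, or reward definitions; the snippet below is only a generic sketch of the tabular Q-learning update that such an allocator could build on, with all encodings and constants assumed for illustration.

```python
import numpy as np

# Generic tabular Q-learning update; states/actions/rewards are placeholders,
# not the formulation used in the paper.
n_states, n_actions = 16, 4          # e.g. load levels x candidate workers (assumed)
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.1   # learning rate, discount, exploration (assumed)
rng = np.random.default_rng(0)

def choose_action(s):
    # Epsilon-greedy: occasionally explore, otherwise pick the best-known worker.
    if rng.random() < eps:
        return int(rng.integers(n_actions))
    return int(np.argmax(Q[s]))

def update(s, a, reward, s_next):
    # Standard Q-learning temporal-difference update.
    td_target = reward + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])

# One illustrative interaction with a simulated environment transition.
s = 3
a = choose_action(s)
update(s, a, reward=1.0, s_next=5)
```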
ISBN (Print): 9783031396977; 9783031396984
Looking closely at the Top500 list of high-performance computers (HPC) in the world, it becomes clear that computing power is not the only number that has been growing in the last three decades. The amount of power required to operate such massive computing machines has been steadily increasing, earning HPC users a higher than usual carbon footprint. While the problem is well known in academia, the exact energy requirements of hardware and software, and how to optimize them, are hard to quantify. To tackle this issue, we need tools to understand the software and its relationship with power consumption in today's high-performance computers. With that in mind, we present perun, a Python package and command line interface to measure energy consumption based on hardware performance counters and selected physical measurement sensors. This enables accurate energy measurements on various scales of computing, from a single laptop to an MPI-distributed HPC application. We include an analysis of the discrepancies between these sensor readings and hardware performance counters, with a particular focus on the power draw of usually overlooked non-compute components such as memory. One of our major insights is their significant share of the total energy consumption. We also analyzed the runtime and energy overhead that perun generates when monitoring common HPC applications and found it to be minimal. Finally, an analysis of the accuracy of different measurement methodologies when applied at large scale is presented.
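The abstract does not show perun's API, so the sketch below is only a generic illustration of the kind of hardware counter such a tool builds on: reading the Linux powercap (RAPL) package-energy counter around a code region. The sysfs path and its availability depend on the machine, and real measurements must also handle counter wrap-around and the non-compute components (e.g. DRAM) the paper highlights.

```python
import time

# Intel RAPL package-0 energy counter on Linux (availability is machine-dependent).
RAPL_FILE = "/sys/class/powercap/intel-rapl:0/energy_uj"

def read_energy_uj():
    # Cumulative package energy in microjoules; wraps at max_energy_range_uj.
    with open(RAPL_FILE) as f:
        return int(f.read())

e0, t0 = read_energy_uj(), time.time()
# ... run the code region to be measured ...
e1, t1 = read_energy_uj(), time.time()

joules = (e1 - e0) / 1e6             # ignoring counter wrap-around for brevity
print(f"energy: {joules:.2f} J, avg power: {joules / (t1 - t0):.2f} W")
```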
ISBN (Print): 9783031669644; 9783031669651
Microseismic event detection is critical for monitoring subsurface activities, including hydraulic fracturing, enhanced oil recovery, geological storage of carbon dioxide or natural gas, and reservoir characterization, to guarantee safe and efficient energy extraction. This presents significant challenges because it generates large amounts of data. Although machine learning techniques are increasingly being integrated into fiber-optic distributed acoustic sensor (DAS) systems to enhance their intelligent recognition capabilities, open problems remain: previous studies have observed long computation times and overfitting, which call for more extensive investigation. This study proposes a novel approach that uses DAS data to improve the precision, robustness to overfitting, and interpretability of microseismic event detection. The approach utilizes a specifically designed neural network architecture, and the deep learning approach is highly effective for the real-time management of the substantial amounts of data recorded by DAS equipment. A three-phase research methodology is proposed. The contribution of this research is a spiking neural network architecture for microseismic detection that is expected to bring advances in microseismic monitoring.
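The abstract does not describe the proposed network in detail; as a minimal sketch of the leaky integrate-and-fire dynamics that spiking neural networks are typically built from, consider the following, where the layer size, time constants, and input drive are all assumptions, not values from the paper.

```python
import numpy as np

# Minimal leaky integrate-and-fire (LIF) layer update; illustrative only,
# not the architecture proposed in the paper.
def lif_step(v, input_current, tau=20.0, v_thresh=1.0, v_reset=0.0, dt=1.0):
    # Leaky integration of the membrane potential toward the input current.
    v = v + (dt / tau) * (-v + input_current)
    spikes = v >= v_thresh            # neurons crossing threshold emit a spike
    v = np.where(spikes, v_reset, v)  # reset the neurons that fired
    return v, spikes.astype(float)

rng = np.random.default_rng(0)
v = np.zeros(128)                     # membrane potentials of 128 neurons (assumed size)
for t in range(100):                  # 100 time steps of a DAS-derived input stream
    drive = rng.random(128) * 1.5     # placeholder input current
    v, s = lif_step(v, drive)
```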
ISBN (Print): 9789819708338; 9789819708345
With the development of deep learning, DNN models have become more complex. Large-scale model parameters enhance the level of AI by improving the accuracy of DNN models. However, they also present more severe challenges to the hardware training platform, because training a large model requires a lot of computing and memory resources, which can easily exceed the capacity of a single accelerator. In addition, with the increasing demand for DNN model accuracy in academia and industry, the number of training iterations is also skyrocketing. Against this background, more accelerators are integrated on a hierarchical platform to conduct distributed training. In distributed training platforms, the computation of the DNN model and the communication of the intermediate parameters are handled by different hardware modules, so their degree of parallelism profoundly affects the training speed. In this work, based on the widely used hierarchical Torus-Ring training platform and the Ring All-Reduce collective communication algorithm, we improve the speed of distributed training by optimizing the parallelism of communication and computation. Specifically, based on an analysis of the distributed training process, we schedule the computation and communication so that they execute simultaneously as much as possible. Finally, for data parallelism and model parallelism, we reduce the communication exposure time and the computation exposure time, respectively. Compared with previous work, the training speed (measured over 5 training iterations) of the Resnet50 model and the Transformer model is increased by 23.77%-25.64% and 11.66%-12.83%, respectively.
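As a plain-Python sketch of the Ring All-Reduce pattern the paper builds on (a reduce-scatter phase followed by an all-gather phase over N workers), the simulation below runs the ring sequentially in a single process, which is an illustrative simplification rather than the paper's hardware-level scheduling.

```python
import numpy as np

def ring_all_reduce(grads):
    """Single-process simulation of Ring All-Reduce (illustrative only).

    grads: list of equal-length 1-D arrays, one per simulated worker.
    Returns a list in which every worker holds the element-wise sum.
    """
    n = len(grads)
    chunks = [np.array_split(g.astype(float), n) for g in grads]

    # Reduce-scatter: after n-1 steps, worker i holds the full sum of chunk (i+1) % n.
    for step in range(n - 1):
        for i in range(n):
            idx = (i - step) % n       # chunk that worker i forwards this step
            dst = (i + 1) % n
            chunks[dst][idx] = chunks[dst][idx] + chunks[i][idx]

    # All-gather: circulate the completed chunks until every worker has all of them.
    for step in range(n - 1):
        for i in range(n):
            idx = (i + 1 - step) % n
            dst = (i + 1) % n
            chunks[dst][idx] = chunks[i][idx].copy()

    return [np.concatenate(c) for c in chunks]

# Usage: 4 simulated workers, each with a local "gradient" of length 8.
workers = [np.full(8, w, dtype=float) for w in range(4)]
print(ring_all_reduce(workers)[0])     # every entry equals 0 + 1 + 2 + 3 = 6
```

Because each worker only exchanges one chunk per step, the per-step traffic is constant in N, which is what makes overlapping these transfers with ongoing computation worthwhile.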
SDODI, which stands for Sustainable Development Optimization using Distributed Intelligence, advances sustainable development. SDODI is the first to examine environmental, social, and economic challenges together usin...
With the evolution of the IoT era and the changes it implies for computing, distributed systems have resurfaced as Cloud computing, Grid computing, Utility computing, Cluster computing, Edge computing and Fog Compu...
Parallel/distributed particle filters estimate the states of dynamic systems by using Bayesian inference and stochastic sampling techniques with multiple processing units (PUs). The sampling procedure and the resam...
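The entry is truncated, but a minimal sketch of the sequential importance sampling plus resampling loop that parallel/distributed particle filters partition across PUs is shown below; the scalar random-walk model, noise levels, and particle count are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def systematic_resample(weights):
    # Systematic resampling: one uniform draw, evenly spaced positions.
    n = len(weights)
    positions = (rng.random() + np.arange(n)) / n
    cumulative = np.cumsum(weights)
    cumulative[-1] = 1.0               # guard against floating-point round-off
    return np.searchsorted(cumulative, positions)

# Bootstrap particle filter for a scalar random-walk state with noisy observations.
n_particles = 1000
particles = rng.normal(0.0, 1.0, n_particles)
weights = np.full(n_particles, 1.0 / n_particles)

observations = np.cumsum(rng.normal(0, 0.1, 50)) + rng.normal(0, 0.5, 50)
for y in observations:
    particles += rng.normal(0.0, 0.1, n_particles)           # propagate (prior)
    weights *= np.exp(-0.5 * ((y - particles) / 0.5) ** 2)    # Gaussian likelihood
    weights /= weights.sum()
    idx = systematic_resample(weights)                        # resample
    particles = particles[idx]
    weights.fill(1.0 / n_particles)

print("state estimate:", particles.mean())
```

The resampling step is the part that is hardest to parallelize, since it couples all particles through the cumulative weights, which is why distributed designs focus on it.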
Sparse LU factorization is a critical kernel in scientific computing and engineering applications. Reordering a sparse matrix to obtain a better nonzero pattern can accelerate LU factorization. Traditionally it's diff...
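The entry is truncated, but the effect it describes (fill-reducing reordering speeding up sparse LU) can be illustrated with SciPy's SuperLU wrapper; the random test matrix below is an arbitrary example, not one from the paper.

```python
from scipy.sparse import random as sparse_random, identity
from scipy.sparse.linalg import splu

# Arbitrary sparse test matrix, made diagonally dominant so the factorization is stable.
n = 2000
A = (sparse_random(n, n, density=0.002, random_state=0) + 10 * identity(n)).tocsc()

# Factor with and without a fill-reducing column ordering and compare the fill-in.
for ordering in ("NATURAL", "COLAMD"):
    lu = splu(A, permc_spec=ordering)
    fill = lu.L.nnz + lu.U.nnz
    print(f"{ordering:8s}: nnz(L) + nnz(U) = {fill}")
```

Fewer nonzeros in L and U under the fill-reducing ordering translate directly into less arithmetic and memory traffic during the factorization.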