ISBN: (print) 9798350390438; 9798350390421
Distribution networks are currently hosting a major share of the energy transition and require the development of new management mechanisms. Tuning these mechanisms requires numerous simulations to approach the optimal performance set by optimal power flow (OPF) results. In radial networks, this problem admits specific resolution methods. However, these algorithms are still hampered by prohibitive computing times for large-scale problems, which is typically the case for a distribution network. Specific implementations are therefore crucial to make the most of heterogeneous CPU-GPU computing architectures. This contribution focuses on two distributed OPF algorithms in radial networks. A speed-up by a factor of 3 is observed, particularly on the largest systems. Nevertheless, in several case studies, GPU methods failed to converge due to the algorithms' lack of numerical stability.
ISBN: (digital) 9789819708628
ISBN: (print) 9789819708611; 9789819708628
MapReduce is a programming framework designed for processing and analyzing large volumes of data in a distributed computing environment. Despite its capabilities, it faces challenges due to silent data corruption during task execution, which can yield inaccurate results. Ensuring fault tolerance in the MapReduce framework while minimizing communication overhead presents considerable challenges. This study presents CDCFT (Coded Distributed Computing Fault Tolerance), a novel approach to fault tolerance within the MapReduce paradigm, combining the strengths of TMR (Triple Modular Redundancy) and CDC (Coded Distributed Computing). By leveraging task-level TMR and voting mechanisms, CDCFT robustly defends against silent data corruption. To further reduce overhead, CDCFT employs intra-group broadcasts for relaying intermediate messages and combines finely tuned node grouping with a strategic data and task allocation procedure. Through rigorous theoretical analysis, we establish that CDCFT's communication overhead during the Shuffle Stage is notably lower than that of traditional CDC methods that rely on triple modular redundancy. Experimental results showcase the efficacy of CDCFT, indicating a substantial reduction in overall communication overhead and execution time compared to conventional fault-tolerant methods.
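The task-level TMR and voting idea mentioned in this abstract can be illustrated with a minimal sketch. This is not CDCFT itself: the coded shuffle, node grouping, and data allocation are omitted, and the names `tmr_vote`, `run_replicated`, and the word-count task are hypothetical, chosen only for illustration.

```python
from collections import Counter
from typing import Callable, Hashable, List, Optional


def tmr_vote(results: List[Hashable]) -> Optional[Hashable]:
    """Return the strict-majority value among replica outputs, or None if there is none."""
    value, count = Counter(results).most_common(1)[0]
    return value if count * 2 > len(results) else None


def run_replicated(replicas: List[Callable[[str], int]], chunk: str) -> Optional[int]:
    """Run the same task on every replica and vote on the outputs."""
    outputs = [replica(chunk) for replica in replicas]
    return tmr_vote(outputs)


def count_words(chunk: str) -> int:
    """Hypothetical map task: count whitespace-separated words."""
    return len(chunk.split())


def corrupted_count_words(chunk: str) -> int:
    """Same task with a simulated silent data corruption (off-by-one result)."""
    return count_words(chunk) + 1


# Two healthy replicas outvote the corrupted one.
voted = run_replicated([count_words, corrupted_count_words, count_words], "a b c")
```

With three replicas, a single corrupted output is masked by the vote; if all three disagree, `tmr_vote` returns `None` so the caller can re-execute the task.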
In order to understand cardiac arrhythmia, computer models for electrophysiology are essential. In the EuroHPC MicroCARD project, we adapt the current models and leverage modern computing resources to model diseased h...
ISBN: (print) 9798350393545
Accurate and efficient transient analysis of power grids (PGs) poses a major computational challenge in modern integrated circuit design. In this work, we propose to leverage public cloud computing for PG transient analysis while preserving security. A multi-level distributed parallel LU factorization and forward/backward substitution approach based on nested dissection is then proposed to guarantee accuracy and robustness. Experimental results show that the proposed algorithm can achieve an average 2.06X speedup over NICSLU and 2.85X over a conventional parallel approach based on domain decomposition. It also exhibits good scalability, with up to 6.0X parallel speedup on large-scale PGs with 4 cloud computer nodes.
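The factor-once, solve-many pattern behind LU factorization with forward/backward substitution can be sketched as follows. This is a single-node, dense illustration only, not the multi-level distributed method of the abstract: nested dissection and the distribution across nodes are omitted, pivoting is skipped under the assumption of nonzero pivots (for example a diagonally dominant matrix), and all function names and the small test matrix are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def lu_decompose(a: Matrix) -> Tuple[Matrix, Matrix]:
    """Doolittle LU factorization without pivoting: a = L @ U, with unit diagonal in L."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    upper = [[0.0] * n for _ in range(n)]
    for i in range(n):
        lower[i][i] = 1.0
        for j in range(i, n):  # row i of U
            upper[i][j] = a[i][j] - sum(lower[i][k] * upper[k][j] for k in range(i))
        for j in range(i + 1, n):  # column i of L
            s = sum(lower[j][k] * upper[k][i] for k in range(i))
            lower[j][i] = (a[j][i] - s) / upper[i][i]
    return lower, upper


def lu_solve(lower: Matrix, upper: Matrix, b: List[float]) -> List[float]:
    """Solve L U x = b by forward substitution (L y = b), then backward substitution (U x = y)."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(lower[i][k] * y[k] for k in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(upper[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / upper[i][i]
    return x


# Small diagonally dominant system standing in for a grid matrix (illustrative values).
A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
L, U = lu_decompose(A)

# Factor once, then reuse the factors for several right-hand sides,
# as a fixed-step transient analysis would at each time step.
x1 = lu_solve(L, U, [2.0, 4.0, 10.0])
x2 = lu_solve(L, U, [4.0, -1.0, 0.0])
```

Reusing `L` and `U` across right-hand sides is what makes the substitution phase the recurring cost once the factorization is done.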
With the demand for sufficient, high-quality power on the rise, distributed generation (DG) is found to be a good alternative. DGs combined with clean energy sources such as wind, hydro, solar, and hybrid systems will have the p...
ISBN: (print) 9783031506833; 9783031506840
Click logs collect user interactions with information retrieval systems (e.g., search engines). Clicks therefore become implicit feedback for such systems, and are further used to train click models, which in turn improve the quality of search and recommendation results. Click models based on expectation maximization (EM) are known to be effective and robust against various biases. Training EM-based models is challenging due to the size of click logs, and can take many hours when using sequential tools like PyClick. Alternatives, such as ParClick, employ parallelism and show significant speedup. However, ParClick only works on single-node multi-core systems. To further scale up and out, in this work we introduce MassiveClicks, the first massively parallel, distributed, multi-GPU framework for EM-based click model training. MassiveClicks relies on efficient GPU kernels, balanced data-partitioning policies, and distributed computing to improve the performance of EM-based model training, outperforming ParClick by orders of magnitude when using GPUs and/or multiple nodes. Additionally, the framework supports heterogeneous GPU architectures and variable numbers of GPUs per node, and allows for multi-node multi-core CPU-based training when no GPUs are available.
ISBN: (print) 9798331509118
The proceedings contain 79 papers. The topics discussed include: designing a card game for computer science instructors to evaluate students’ parallel and distributed computing knowledge; a hands-on approach to teaching parallel and heterogeneous computing; efficient feature extraction for vision transformer model using a custom CNN accelerator; object detection for autonomous vehicles in adverse weather and varying lighting conditions using a hybrid YOLO approach; predictive modeling of performance variability in HPC applications; simulation-driven design of large-scale systems architecture; performance optimization on CXL products using in-house modeling and simulation toolchain; and hide mastermind using an intermediate connection on social network.
ISBN: (print) 9798350351606; 9798350351590
This paper explores integrating blockchain technology into multi-agent systems (MAS) to enhance distributed node resource optimization. Key challenges addressed include task decision-making, task allocation, and resource scheduling, with a focus on minimizing energy consumption and latency. Blockchain ensures secure, efficient coordination among nodes, mitigating issues like data privacy leaks and system failures. The study also leverages federated learning for secure decentralized machine learning model training. Simulation results demonstrate the enhanced performance, security, and scalability of MAS with blockchain, paving the way for more efficient distributed computing environments.
ISBN: (print) 9798400701092; 9798350376630
Mosaic Flow is a novel domain decomposition method designed to scale physics-informed neural PDE solvers to large domains. Its unique approach leverages pre-trained networks on small domains to solve partial differential equations on large domains purely through inference, resulting in high reusability. This paper presents an end-to-end parallelization of Mosaic Flow, combining data parallel training and domain parallelism for inference on large-scale problems. By optimizing the network architecture and data parallel training, we significantly reduce the training time for learning the Laplacian operator to minutes on 32 GPUs. Moreover, our distributed domain decomposition algorithm enables scalable inferences for solving the Laplace equation on domains 4096x larger than the training domain, demonstrating strong scaling while maintaining accuracy on 32 GPUs. The reusability of Mosaic Flow, combined with the improved performance achieved through the distributed-memory algorithms, makes it a promising tool for modeling complex physical phenomena and accelerating scientific discovery.
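The general idea of solving on a large domain by repeatedly applying a small-domain solver can be sketched with a classical overlapping Schwarz iteration for the 1D Laplace equation. This is an assumption-laden stand-in, not the Mosaic Flow algorithm: the abstract does not specify the iteration, the exact subdomain solve below (linear interpolation) replaces the pre-trained network, the sweep is sequential rather than distributed, and all names and the window layout are hypothetical.

```python
from typing import List, Tuple


def solve_subdomain(u: List[float], left: int, right: int) -> None:
    """Exact 1D Laplace solve on [left, right] given its two boundary values.

    In 1D the solution of u'' = 0 is the straight line between the end values.
    A learned small-domain solver would take the place of this function.
    """
    span = right - left
    for i in range(left + 1, right):
        u[i] = u[left] + (u[right] - u[left]) * (i - left) / span


def schwarz_laplace_1d(
    n_points: int,
    windows: List[Tuple[int, int]],
    u_left: float,
    u_right: float,
    sweeps: int = 60,
) -> List[float]:
    """Sweep the overlapping windows repeatedly until the global solution settles."""
    u = [0.0] * n_points
    u[0], u[-1] = u_left, u_right  # global Dirichlet boundary conditions
    for _ in range(sweeps):
        for left, right in windows:
            solve_subdomain(u, left, right)
    return u


# Three overlapping windows covering grid indices 0..16 (illustrative layout).
windows = [(0, 8), (4, 12), (8, 16)]
u = schwarz_laplace_1d(17, windows, u_left=0.0, u_right=1.0)
```

Each window only ever sees its own boundary values, so the same small solver is reused everywhere; the overlap is what lets information propagate between windows from sweep to sweep.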
Predicting the performance of parallel applications at scale is a challenging problem. We have developed a performance prediction model for structured grid-based scientific applications for High Performance Computing ...