ISBN:
(print) 9781665454452
Today's supercomputers offer massive computation resources to execute a large number of user jobs. Effectively managing such large-scale hardware parallelism and workloads is essential for supercomputers. However, existing HPC resource management (RM) systems fail to capitalize on this hardware parallelism because they follow a centralized design that dates back decades. They scale poorly and perform inefficiently on today's supercomputers, and the problem will worsen in exascale computing. We present ESlurm, a better RM for supercomputers. In a departure from existing HPC RMs, ESlurm implements a distributed communication structure. It employs a new communication tree strategy and uses job runtime estimation to improve communication and job scheduling efficiency. ESlurm is deployed in production on a real supercomputer. We evaluate ESlurm on up to 20K nodes. Compared to state-of-the-art RM solutions, ESlurm exhibits better scalability, significantly reducing the resource usage of master nodes and improving data transfer and job scheduling efficiency by a large margin.
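To make the contrast between a centralized and a tree-structured communication pattern concrete, the following sketch counts how many messages the master node has to send when it contacts every node directly versus when messages are relayed through a k-ary tree. It is an illustration only: the fan-out of 8, the node names, and the functions `build_tree` and `broadcast` are assumptions made for this example and do not describe ESlurm's actual tree strategy.

```python
from collections import deque

def build_tree(nodes, fanout):
    """Arrange nodes into a k-ary relay tree; returns {parent: [children]}."""
    children = {n: [] for n in nodes}
    for i, node in enumerate(nodes):
        if i == 0:
            continue  # nodes[0] is the root and has no parent
        parent = nodes[(i - 1) // fanout]
        children[parent].append(node)
    return children

def broadcast(children, root):
    """Simulate a relayed broadcast.

    Returns (messages sent per node, number of relay hops to the farthest node).
    """
    sent = {n: 0 for n in children}
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            sent[node] += 1
            depth[child] = depth[node] + 1
            queue.append(child)
    return sent, max(depth.values())

# Hypothetical cluster: one master plus 1000 compute nodes.
nodes = ["master"] + [f"node{i}" for i in range(1, 1001)]

# Centralized: a fan-out as large as the cluster makes the master contact everyone.
flat_sent, flat_depth = broadcast(build_tree(nodes, fanout=len(nodes)), "master")

# Tree: each node forwards to at most 8 children.
tree_sent, tree_depth = broadcast(build_tree(nodes, fanout=8), "master")

print("centralized: master sends", flat_sent["master"], "messages, hops:", flat_depth)
print("tree:        master sends", tree_sent["master"], "messages, hops:", tree_depth)
```

The total number of messages is the same in both cases; the tree only shifts the sending load from the master to intermediate nodes at the cost of a few extra relay hops.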
Checkpointing is a fault-tolerance technique for various computing systems. In checkpointing, the state of a process is saved periodically during execution so that applications can restart from that point ...
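The periodic save-and-restart idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the scheme of the paper listed here: the JSON file format, the function names `save_checkpoint`, `load_checkpoint`, and `run`, and the simulated crash are all hypothetical.

```python
import json
import os
import tempfile

def save_checkpoint(path, state):
    """Write the state atomically: dump to a temp file, then rename over the old one."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)  # atomic rename, so a crash never leaves a half-written file

def load_checkpoint(path):
    """Return the last saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0}

def run(path, n_steps, interval, fail_at=None):
    """Sum 0..n_steps-1, saving a checkpoint every `interval` completed steps."""
    state = load_checkpoint(path)
    for step in range(state["step"], n_steps):
        if step == fail_at:
            raise RuntimeError("simulated crash")
        state["total"] += step
        state["step"] = step + 1
        if state["step"] % interval == 0:
            save_checkpoint(path, state)
    return state["total"]

# Demo: crash during step 57, then restart from the last checkpoint.
ckpt = os.path.join(tempfile.mkdtemp(), "state.json")
try:
    run(ckpt, n_steps=100, interval=10, fail_at=57)
except RuntimeError:
    pass
resumed_from = load_checkpoint(ckpt)["step"]
result = run(ckpt, n_steps=100, interval=10)
print("resumed from step", resumed_from, "final result", result)
```

The work done between the last checkpoint and the crash is repeated after the restart, which is the usual trade-off between checkpoint frequency and recomputation.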
ISBN:
(print) 9783030485122
The proceedings contain 55 papers. The special focus in this conference is on cloud computing. The topics include: Rendering of Three-Dimensional Cloud Based on Cloud Computing; Distributed Stochastic Alternating Direction Method of Multipliers for Big Data Classification; Personalized Recommendation Algorithm Considering Time Sensitivity; Cloud-Based Master Data Platform for Smart Manufacturing Process; A Semi-supervised Classification Method for Hyperspectral Images by Triple Classifiers with Data Editing and Deep Learning; A Survey of Image Super Resolution Based on CNN; Design and Development of an Intelligent Semantic Recommendation System for Websites; A Lightweight Neural Network Combining Dilated Convolution and Depthwise Separable Convolution; Resource Allocation Algorithms of Vehicle Networks with Stackelberg Game; Research on Coordination Control Theory of Greenhouse Cluster Based on Cloud Computing; A Multi-objective Computation Offloading Method in Multi-cloudlet Environment; Anomalous Taxi Route Detection System Based on Cloud Services; Collaborative Recommendation Method Based on Knowledge Graph for Cloud Services; Efficient Multi-user Computation Scheduling Strategy Based on Clustering for Mobile-Edge Computing; Grazing Trajectory Statistics and Visualization Platform Based on Cloud GIS; Cloud-Based AGV Control System; A Parallel Drone Image Mosaic Method Based on Apache Spark; CycleSafe: Safe Route Planning for Urban Cyclists; Prediction of Future Appearances via Convolutional Recurrent Neural Networks Based on Image Time Series in Cloud Computing; Video Knowledge Discovery Based on Convolutional Neural Network; Time-Varying Water Quality Analysis with Semantical Mining Technology; A Survey of QoS Optimization and Energy Saving in Cloud, Edge and IoT; Data-Driven Fast Real-Time Flood Forecasting Model for Processing Concept Drift; Intelligent System Security Event Description Method.
ISBN:
(print) 9781728166773
The HPC community shows a keen interest in creating diversity in the CPU ecosystem. The advent of Arm-based processors provides an alternative to the existing HPC ecosystem, which is primarily dominated by x86 processors. In this paper, we port an Asynchronous Many-Task runtime system based on the ParalleX model, i.e., High Performance ParalleX (HPX), and evaluate it on the Arm ecosystem with a suite of benchmarks. We wrote these benchmarks with an emphasis on vectorization and distributed scaling. We present the performance results on a variety of Arm processors and compare them with those of their x86 brethren from Intel. We show that the results obtained are as good as or better than those of their x86 brethren. Finally, we also discuss a few drawbacks of the present Arm ecosystem.
ISBN:
(digital) 9783030494322
ISBN:
(print) 9783030494315; 9783030494322
Cloud resources can be dynamically provisioned according to application-specific requirements and are paid for on a per-use basis. This gives rise to a new concept for parallel processing: elastic parallel computations. However, it is still an open research question to what extent parallel applications can benefit from elastic scaling, which requires resource adaptation at runtime and corresponding coordination mechanisms. In this work, we analyze how to address these system-level challenges in the context of developing and operating elastic parallel tree search applications. Based on our findings, we discuss the design and implementation of TASKWORK, a cloud-aware runtime system specifically designed for elastic parallel tree search, which enables the implementation of elastic applications by means of higher-level development frameworks. We show how to implement an elastic parallel branch-and-bound application based on an exemplary development framework and report on our experimental evaluation, which also considers several benchmarks for parallel tree search.
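For readers unfamiliar with branch-and-bound, the sketch below solves a small 0/1 knapsack instance sequentially using an explicit pool of tree-search tasks. Such a task pool is the kind of structure a parallel runtime could split across workers, but nothing here reflects TASKWORK's API: the function `knapsack_bnb`, the task tuple layout, and the problem instance are assumptions chosen for illustration.

```python
def knapsack_bnb(values, weights, capacity):
    """Branch-and-bound for 0/1 knapsack (weights are assumed to be positive).

    Returns (best value found, number of search-tree nodes explored).
    """
    # Sort by value density so the greedy fractional bound is a valid upper bound.
    items = sorted(zip(values, weights), key=lambda vw: vw[0] / vw[1], reverse=True)

    def bound(idx, value, room):
        # Optimistic estimate: fill the remaining room greedily,
        # allowing a fraction of the first item that does not fit.
        for v, w in items[idx:]:
            if w <= room:
                value += v
                room -= w
            else:
                return value + v * room / w
        return value

    best = 0
    explored = 0
    # Each task is (next item index, value so far, remaining capacity).
    pool = [(0, 0, capacity)]
    while pool:
        idx, value, room = pool.pop()
        explored += 1
        best = max(best, value)
        if idx == len(items) or bound(idx, value, room) <= best:
            continue  # prune: this subtree cannot beat the incumbent
        v, w = items[idx]
        pool.append((idx + 1, value, room))              # branch: skip the item
        if w <= room:
            pool.append((idx + 1, value + v, room - w))  # branch: take the item
    return best, explored

best, explored = knapsack_bnb([60, 100, 120], [10, 20, 30], capacity=50)
print("best value:", best, "nodes explored:", explored)
```

Because subtrees are pruned based on the current best solution, the amount of work is not known in advance, which is one reason tree search is an interesting case for elastic scaling.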
Kyber, an IND-CCA-secure key encapsulation mechanism (KEM) based on the MLWE problem, has been shortlisted for the third round evaluation of the NIST Post-Quantum Cryptography Standardization. In this paper, we explor...
ISBN:
(digital) 9781728165820
ISBN:
(print) 9781728165820
Python is increasingly gaining importance as a programming language, especially in data science and in scientific and parallel programming. It is faster and easier to learn than classical programming languages such as C. However, usability often comes at the cost of performance, and applications written in Python are considered to be much slower than applications written in C or FORTRAN. Furthermore, Python does not allow the use of GPUs other than through pre-compiled libraries. However, the Numba package promises performance similar to C code for compute-intensive parts of a Python application, and it supports CUDA, which allows the use of GPUs inside a Python application. In this paper we compare the performance of Numba-CUDA and C-CUDA for different kinds of applications. For compute-intensive benchmarks, the Numba version reaches only between 50% and 85% of the performance of the C-CUDA version, except for the reduction operation, where the Numba version outperforms C-CUDA. Analyzing the PTX code and CUDA performance counters revealed that index calculation is one limiting factor in Numba. Another problem is the type inference for single-precision computations, as some values are computed in double precision. By optimizing this within the Numba package, the performance of Numba improves. However, C-CUDA applications still outperform the Numba versions. Further analysis with the CloverLeaf Mini App shows that Numba performance decreases further for applications with multiple different compute kernels. The non-GPU part slows down these applications because of the slow Python interpreter, which leads to worse GPU utilization.
Multi-Agent Systems (MAS) are naturally good candidates for large-scale parallel simulations. However, implementing MAS simulations for distributed memory architectures, such as High Performance Computing clusters, is still complex for non-experts. In this article we present the principle of a Dynamic Distributed Graph structure that enables the native distribution of MAS simulations. Most distribution-related issues, such as dynamic load balancing, time synchronization, and data migration across processes, can be completely automated and abstracted away from the user, who can safely design distribution-independent MAS models. The major interest of our contribution is the transparent management of concurrent read/write requests across distant processes, a significant feature not provided by the surveyed platforms. We also present FPMAS, an open source C++ implementation of a distributed Multi-Agent System simulation platform based on the Distributed Graph structure.
ISBN:
(print) 9781728161273
Distributed energy resources (DERs) are attractive because of their flexibility and demand response capabilities; however, there are numerous challenges concerning their integration into the electric grid, including lack of visibility and control, as well as misalignments between the interests of privately owned DERs and the collective interests of the grid. In order to coordinate the behavior of these DERs, we treat the grid as a multiagent system and propose a service-oriented broker architecture (SOBA). SOBA enables system operators to influence the behavior of privately owned DERs through service requests and autonomous peer discovery. We illustrate SOBA's features and motivate service requests through a scenario with a network of small-scale solar photovoltaics, inverters, and batteries.