ISBN (digital): 9781665497473
ISBN (print): 9781665497480
As the industry drives toward more capable ML, workloads are evolving rapidly and the need for performance is nearly unlimited. As measured by MLPerf™, performance has vastly outstripped the pace of Moore's Law through software/hardware co-design. This talk discusses the challenges of benchmarking ML training, explores the design space, and identifies opportunities for future ML systems.
ISBN (digital): 9798331522100
ISBN (print): 9798331522117
Among the many software technologies around the world, there is no parallel to cloud computing (CC). Businesses use cloud computing for hosting. In cloud computing, tasks are distributed across a group of resources which, when executed, help reduce the time needed to finish them. Optimization refers to the process of minimizing the time needed to accomplish tasks; these groups of tasks are present in a domain related to computers. In CC, the goal of this optimization is to reduce the total time needed to execute the group of tasks while distributing and controlling virtual machines or resources. This paper aims to optimize task scheduling in CC by utilizing genetic algorithms. Search methods whose basis is natural selection and genetics are called genetic algorithms. They combine the idea of "survival of the fittest" in string-based representations with a mix of structured and random information exchange, forming a search approach that includes a level of human-like creativity. After applying genetic algorithms to the task scheduling problem, the minimum fitness per generation stabilizes after two generations, as seen in the graph.
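To make the approach concrete, below is a minimal sketch of a genetic algorithm for the task-to-VM assignment problem described above. The encoding (one VM index per task), the makespan fitness, and all parameter values are illustrative assumptions, not the paper's exact setup.

```python
import random

# Illustrative problem instance: task runtimes and VM count are assumed.
TASK_TIMES = [14, 7, 22, 9, 31, 5, 18, 12]   # execution time of each task
NUM_VMS = 3
POP_SIZE, GENERATIONS, MUT_RATE = 30, 50, 0.1

def makespan(chrom):
    """Fitness: completion time of the busiest VM (lower is better)."""
    loads = [0] * NUM_VMS
    for task, vm in enumerate(chrom):
        loads[vm] += TASK_TIMES[task]
    return max(loads)

def crossover(a, b):
    """Single-point crossover on the task->VM assignment string."""
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(chrom):
    """Reassign a random task to a random VM with probability MUT_RATE."""
    if random.random() < MUT_RATE:
        chrom[random.randrange(len(chrom))] = random.randrange(NUM_VMS)
    return chrom

def evolve():
    pop = [[random.randrange(NUM_VMS) for _ in TASK_TIMES]
           for _ in range(POP_SIZE)]
    for gen in range(GENERATIONS):
        pop.sort(key=makespan)                  # survival of the fittest
        if gen % 10 == 0:
            print(f"gen {gen}: best makespan = {makespan(pop[0])}")
        parents = pop[: POP_SIZE // 2]          # truncation selection
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(POP_SIZE - len(parents))]
        pop = parents + children
    return min(pop, key=makespan)

if __name__ == "__main__":
    best = evolve()
    print("best assignment:", best, "makespan:", makespan(best))
```

Tracking the best makespan per generation reproduces the kind of fitness-versus-generation curve the abstract refers to.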
With the flourishing development of the Internet of Things (IoT) era, Edge Computing (EC) technology has garnered significant attention alongside advancements in communication and IoT technologies. Effectively harnessing the computational resources of the edge environment has become a paramount concern. Accurate workload prediction is considered fundamental to optimizing the utilization of limited edge resources. However, most existing edge cloud load prediction approaches overlook the correlations among edge sites. Moreover, for cloud-native applications, user requests are typically handled by multiple containers; by modeling the behavior of workload groups, the focus can shift from individual containers to the collective, resulting in higher predictive accuracy. In this paper, the behavior of workload groups is analyzed with reference to both static and dynamic container information. Leveraging input data constructed from the behavior of workload groups, an improved Sample Convolution and Interaction Network (SCINet) is employed for early multi-step prediction of edge container loads. Experimental validation on an edge cloud load dataset demonstrates that the proposed method effectively improves the accuracy of edge cloud load prediction.
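As a rough illustration of the group-level input construction, the sketch below aggregates per-container load series into one series per workload group and produces a multi-step forecast. The grouping key (the service a container belongs to) and the seasonal-naive predictor are assumed stand-ins; the paper groups containers using static and dynamic information and predicts with SCINet.

```python
import numpy as np

# Toy container load records: container_id -> (service, cpu_load_series).
# The service label stands in for the paper's static/dynamic grouping features.
records = {
    "c1": ("search", np.random.rand(96)),
    "c2": ("search", np.random.rand(96)),
    "c3": ("ads",    np.random.rand(96)),
}

def group_series(records):
    """Aggregate per-container loads into one series per workload group."""
    groups = {}
    for _, (svc, series) in records.items():
        groups.setdefault(svc, []).append(series)
    return {svc: np.sum(s, axis=0) for svc, s in groups.items()}

def multi_step_forecast(series, horizon=4, season=24):
    """Seasonal-naive multi-step forecast: repeat the last daily pattern.
    A placeholder for the improved SCINet model used in the paper."""
    return np.array([series[-season + (h % season)] for h in range(horizon)])

for svc, series in group_series(records).items():
    print(svc, multi_step_forecast(series))
```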
In high-throughput intelligent computing scenarios, multi-device parallelism strategies based on data parallelism or pipeline parallelism have been extensively utilized to accelerate inference of large deep neural network models. Data parallelism offers a nearly linear improvement in inference speed, but the memory capacity of a single device constrains the model size. Pipeline parallelism, on the other hand, can support larger models, but the total communication of activations among devices is high, which limits the improvement in inference speed. To address the demand for efficient model inference in high-throughput heterogeneous scenarios, we propose a hybrid parallelism strategy that combines data parallelism and pipeline parallelism. The strategy groups the devices of a heterogeneous cluster and then employs inter-group data parallelism along with intra-group pipeline parallelism. Moreover, we propose an algorithm that finds the optimal hybrid parallel inference strategy with maximum throughput; its control variables include the number of groups, group-device assignments, and model partition ratios. Our experimental evaluation demonstrates that, compared to PipeEdge, a pipeline parallel inference framework for heterogeneous clusters, our strategy achieves a 1.7×–3.4× speedup in an 8-device heterogeneous cluster without loss of accuracy.
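A brute-force version of such a search can be sketched as follows. The cost model is a deliberately simple assumption (model partitioned proportionally to device speed, a fixed activation-transfer cost per stage boundary, and a per-group memory-fit check), and the device list and constants are invented for illustration; the paper's algorithm and throughput model are more elaborate.

```python
from itertools import combinations

# Illustrative heterogeneous cluster: (speed, memory_GB) per device.
DEVICES = [(10, 8), (10, 8), (4, 4), (4, 4), (2, 4), (2, 4)]
MODEL_WORK, MODEL_MEM, COMM = 100.0, 12.0, 0.5  # assumed units

def group_throughput(group):
    """Pipeline throughput of one group: the model is partitioned in
    proportion to device speed, so every stage takes the same compute time;
    each stage boundary adds a fixed activation-transfer cost COMM."""
    if sum(mem for _, mem in group) < MODEL_MEM:
        return 0.0                      # model does not fit in this group
    speed = sum(s for s, _ in group)
    stage_time = MODEL_WORK / speed + (COMM if len(group) > 1 else 0.0)
    return 1.0 / stage_time

def best(devices):
    """Enumerate all ways to split devices into pipelined groups and return
    the plan with maximum total (data-parallel) throughput."""
    if not devices:
        return 0.0, []
    first, rest = devices[0], devices[1:]
    best_tp, best_plan = 0.0, []
    for k in range(len(rest) + 1):
        for idx in combinations(range(len(rest)), k):
            group = [first] + [rest[i] for i in idx]
            others = [rest[i] for i in range(len(rest)) if i not in idx]
            tp, plan = best(others)
            tp += group_throughput(group)
            if tp > best_tp:
                best_tp, best_plan = tp, [group] + plan
    return best_tp, best_plan

tp, plan = best(DEVICES)
print(f"max throughput {tp:.3f} with groups {plan}")
```

Anchoring the first remaining device in each group avoids enumerating the same partition twice; exhaustive search is feasible only for small clusters, which is why a dedicated search algorithm is needed in practice.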
Due to the heterogeneity of workloads and the randomness and complexity of hybrid scheduling environments, minimizing data center running cost while ensuring workload SLAs has emerged as a significant research problem. To address it, we propose an Optimal Deep Reinforcement Learning Model for Running Cost Optimization in Hybrid Environments (ODRL). First, we propose a Running Cost Model (RCM) that analyzes data center running costs from two perspectives, based on the different running patterns of workloads and node affinity constraints, to minimize computing node utilization and resource running costs. We then propose a Priority-Aware Scheduling Algorithm based on Deep Reinforcement Learning (PASD) that uses AHP-TOPSIS to quantify workload characteristics, obtains real-time feedback from the environment, and updates historical experience through prioritized experience replay. Finally, extensive experimental results validate the effectiveness of the proposed model: compared to Kube-Scheduler, ODRL reduces data center running cost by a significant 15%, while increasing CPU utilization by 7.8% and memory utilization by 16.7%.
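For reference, here is a minimal proportional prioritized-experience-replay buffer of the kind PASD relies on; the hyperparameters and toy transitions are illustrative assumptions rather than the paper's configuration.

```python
import random
from collections import deque

class PrioritizedReplay:
    """Minimal proportional prioritized experience replay: transitions with
    larger TD error get higher priority and are sampled more often."""

    def __init__(self, capacity=10_000, alpha=0.6):
        self.buf = deque(maxlen=capacity)      # holds (priority, transition)
        self.alpha = alpha                     # how strongly priority skews sampling

    def add(self, transition, td_error):
        # Small epsilon keeps zero-error transitions sampleable.
        self.buf.append(((abs(td_error) + 1e-6) ** self.alpha, transition))

    def sample(self, batch_size):
        priorities = [p for p, _ in self.buf]
        batch = random.choices(self.buf, weights=priorities, k=batch_size)
        return [t for _, t in batch]

# Usage: store (state, action, reward, next_state) after each scheduling step.
replay = PrioritizedReplay()
for step in range(100):
    replay.add(("state", "node-3", 1.0, "next_state"), td_error=random.random())
print(len(replay.sample(8)), "transitions sampled for training")
```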
ISBN (print): 9789811693755
The proceedings contain 38 papers. The special focus of this conference is on sustainable energy and power engineering. The topics include: Selection of the Appropriate Intelligence Level of Regulators of Thermal Power Plants Technological Processes; Study of the Properties of Fuel Gas in Gas Turbine Plants Depending on Its Composition; Multichannel Majority System for Detection and Prevention of Emergency Modes of Gas Pumping Unit Filters; Verification of Computer Flow Simulation in Confuser and Diffuser Channels; Pressurized Heat Recovery Steam Generator Design for CCGT with Gas Turbine GT-25PA and Steam Turbine T-100; Numerical Modeling of the Influence of the Atmospheric Meteorological State on the Efficiency of the Functioning of Solar Thermal and Power Plants; Numerical Study of the Fluid Flow in a Passive Tangential Vortex Tube; Heat Transfer During Fuel Oil Flow in Storage Tanks and Heaters of the Reserve Facility of TPP; On the Importance of Applying RCM Technology to Passive Components of Russian NPPs; The Technique for Estimation of Thermal Power Equipment Remaining Life Based on Service Life Index; Fault Isolation in Digital Instruments and Devices Used in Power-Engineering Systems; Scientific and Technical Basis for Using External Heat Supply to a Turbine Unit in the Classic Brayton Cycle; Continuous Steelmaking Unit of Bubbling Type; Efficiency of Peat Combustion in a Low-Capacity Boiler: Analysis of Peat Reserves in the Arkhangelsk Region and Efficiency of Its Energy Use; Joint Use of Distributed Ledger Technology and Centralized Software for Storing Power Equipment Reliability Data.
ISBN (print): 9781728174457
Traditionally, Computational Fluid Dynamics (CFD) software uses the Message Passing Interface (MPI) to handle parallelism over distributed memory systems. For a new developer, such as a student or a new employee, the barrier to entry can be high, and more training is required for each particular software package, which slows down research on the actual science. The Chapel programming language offers an interesting alternative for the research and development of CFD applications. In this paper, the development of two CFD applications is presented: the first an experiment in rewriting a 2D structured flow solver, the second a research 3D unstructured multi-physics simulation software package written from scratch. Details are given on both applications, with emphasis on the Chapel features that served the code design well, in particular to improve flexibility and extend to distributed memory. Some performance pitfalls are discussed along with solutions to avoid them. The performance of the unstructured software is then studied and compared to a traditional open-source CFD package programmed in C++ with MPI for communication (SU2). The results show that our Chapel implementation achieves performance similar to other CFD software written in C and C++, confirming that Chapel is a viable language for high-performance CFD applications.
Forecasting crowd movement accurately across an urban area is crucial for efficient traffic control and public security. Current methods transform the city's roadmap into a grid-based map, enabling Convolutional Neural Networks (CNNs) or Graph Convolutional Networks (GCNs) to capture spatio-temporal relationships efficiently. However, this approach overlooks the connections between irregularly shaped real-world areas, which can be categorized into various functional zones. In this article, we introduce a novel approach for predicting urban crowd flow, named STHGN, which utilizes hypergraph convolutional networks. By constructing 3-level hypergraphs from irregular areas and adopting Hyper-GCN, we capture mobility among irregular regions. We construct the hypergraphs at hour, day, and week granularity, and use gating mechanisms to fuse the resulting embeddings. We evaluate the efficacy of our model against 11 other approaches, including the most sophisticated spatio-temporal graph networks (STGs). Extensive experiments show that STHGN outperforms these methods, reducing the mean absolute error (MAE) of crowd flow prediction by approximately 6-9%.
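For intuition, the snippet below runs one standard hypergraph convolution (HGNN-style) over a toy incidence matrix in which each hyperedge groups regions sharing a functional zone. The regions, groupings, and random weights are invented for illustration and do not reproduce STHGN's 3-level hour/day/week construction or its gated fusion.

```python
import numpy as np

# Incidence matrix H: 5 regions (rows) x 3 hyperedges (columns).
# H[i, j] = 1 means region i belongs to hyperedge (functional zone) j.
H = np.array([[1, 0, 1],
              [1, 0, 0],
              [0, 1, 1],
              [0, 1, 0],
              [1, 1, 0]], dtype=float)
X = np.random.rand(5, 4)              # per-region feature vectors
Theta = np.random.rand(4, 4)          # learnable weights (random here)

def hypergraph_conv(H, X, Theta):
    """One hypergraph convolution: aggregate node features within each
    hyperedge, then redistribute to member nodes (standard HGNN form)."""
    Dv = np.diag(H.sum(axis=1) ** -0.5)   # node degree normalization
    De = np.diag(1.0 / H.sum(axis=0))     # hyperedge degree normalization
    out = Dv @ H @ De @ H.T @ Dv @ X @ Theta
    return np.maximum(out, 0.0)           # ReLU

print(hypergraph_conv(H, X, Theta).shape)  # (5, 4): new region embeddings
```

Because a hyperedge can connect any number of regions, this aggregation captures group-level mobility that pairwise graph edges cannot express.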
The paper describes the setup and operation of a control education lab which is used in different courses at both the undergraduate and graduate levels in electrical engineering and building systems programs. Its main characteristic is its equipment with several industrial-scale Distributed Control Systems (DCS) and the use of commercial software for their design and operation. Although procurement and maintenance costs are relatively high, the use of the lab across different study programs and courses results in high lab utilization and a reasonable cost per individual lab. Some of the student projects are described in detail, among them the design of a DCS-integrated Model Predictive Control (MPC) solution. The lab setup reduces the system-specific training time needed by students and leads to a more efficient lab organization. Students appreciate the use of industry-grade installations, since these help them make a smooth transition from academia to industry. Copyright (C) 2022 The Authors.
ISBN (print): 9781665468244
In Internet of Things (IoT) deployments employing centralized machine learning, security is a major concern due to the heterogeneity of end devices. Decentralized machine learning (DML) with blockchain is a potential solution. However, blockchain with the proof-of-work (PoW) consensus mechanism wastes computing resources and adds latency to DML. Computing resources can be utilized more efficiently with proof-of-useful-work (uPoW), which secures transactions by solving real-world problems. We propose a novel uPoW method that exploits PoW mining to accelerate DML through a task scheduling framework for multi-access edge computing (MEC) systems. To provide good quality of service, we minimize latency by solving a multi-way number partitioning problem in extended form. A novel uPoW-based mechanism is proposed to schedule DML tasks among MEC servers effectively. Simulation results show that our proposed blockchain strategies accelerate DML significantly compared with benchmarks.
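To see the scheduling core in isolation: multi-way number partitioning spreads task durations across servers so that the largest load is minimized. The greedy longest-processing-time baseline below is only a reference point, with made-up task durations; the paper embeds an extended form of the problem into uPoW mining rather than solving it greedily.

```python
import heapq

def lpt_partition(task_times, num_servers):
    """Greedy longest-processing-time (LPT) heuristic for multi-way number
    partitioning: assign each task, longest first, to the currently
    least-loaded server, approximately minimizing the makespan."""
    loads = [(0.0, i, []) for i in range(num_servers)]  # (load, id, tasks)
    heapq.heapify(loads)
    for t in sorted(task_times, reverse=True):
        load, i, tasks = heapq.heappop(loads)           # least-loaded server
        heapq.heappush(loads, (load + t, i, tasks + [t]))
    return sorted(loads, key=lambda x: x[1])

# Example: DML task durations (ms) spread over 3 MEC servers.
for load, server, tasks in lpt_partition([9, 7, 7, 5, 4, 3, 2], 3):
    print(f"server {server}: load={load} tasks={tasks}")
```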