ISBN: (print) 9798400706981
Confidential computing on GPUs such as the NVIDIA H100 mitigates the security risks of outsourced Large Language Models (LLMs) by implementing strong isolation and data encryption. Nonetheless, this encryption incurs a significant performance overhead, with throughput dropping by up to 52.8% and 88.2% when serving OPT-30B and OPT-66B, respectively. To address this challenge, we introduce PipeLLM, a user-transparent runtime system. PipeLLM removes the overhead by overlapping encryption with GPU computation through pipelining (an idea inspired by CPU instruction pipelining), thereby effectively concealing the latency increase caused by encryption. The primary technical challenge is that, unlike CPUs, the encryption module lacks prior knowledge of the specific data needing encryption until it is requested by the GPUs. To this end, we propose speculative pipelined encryption, which predicts the data requiring encryption by analyzing the serving patterns of LLMs. Further, we have developed an efficient, low-cost pipeline relinquishing approach for instances of incorrect predictions. Our experiments show that, compared with vanilla systems without confidential computing (e.g., vLLM, PEFT, and FlexGen), PipeLLM incurs modest overhead (<19.6% in throughput) across various LLM sizes, from 13B to 175B. PipeLLM's source code is available at https://***/SJTU-IPADS/PipeLLM.
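The idea of speculative pipelined encryption can be illustrated with a toy sketch. This is not PipeLLM's implementation: the XOR "cipher", the sequential predictor `predict_next`, and the `serve` loop are all hypothetical stand-ins chosen only to show the control flow. A background worker encrypts the predicted block ahead of time, a correct prediction uses that result directly, and a misprediction discards it and encrypts on demand.

```python
from __future__ import annotations
from concurrent.futures import ThreadPoolExecutor


def toy_encrypt(block: bytes, key: int = 0x5A) -> bytes:
    # Assumption: XOR stands in for real authenticated encryption. It is NOT secure.
    return bytes(b ^ key for b in block)


def predict_next(history: list[int], n_blocks: int) -> int:
    # Hypothetical predictor: assume blocks are requested in sequential order.
    return (history[-1] + 1) % n_blocks if history else 0


def serve(blocks: list[bytes], requests: list[int]) -> tuple[list[bytes], int, int]:
    """Return (ciphertexts, hits, misses) for a sequence of block requests."""
    out: list[bytes] = []
    hits = misses = 0
    history: list[int] = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Start encrypting the first guess before any request arrives.
        guess = predict_next(history, len(blocks))
        pending = pool.submit(toy_encrypt, blocks[guess])
        for req in requests:
            if req == guess:
                # Correct prediction: the ciphertext was prepared in the background.
                out.append(pending.result())
                hits += 1
            else:
                # Misprediction: drop the speculative result, encrypt on demand.
                pending.cancel()  # best effort; the result is ignored either way
                out.append(toy_encrypt(blocks[req]))
                misses += 1
            history.append(req)
            # Speculate on the next request while the caller does other work.
            guess = predict_next(history, len(blocks))
            pending = pool.submit(toy_encrypt, blocks[guess])
    return out, hits, misses
```

In a real deployment, the work between two requests would be GPU computation, which is where the speculative encryption would be hidden. The sketch leaves that out so the hit and miss paths stay easy to follow.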
In the construction process of new power systems, the concept of power cluster control has emerged with the integration of large-scale distributed new energy, and regional autonomy is an effective means of distributed...
The proliferation of distributed energy resources has heightened the interactions between transmission and distribution (T&D) systems, necessitating novel analyses for the reliable operation and planning of interc...
ISBN: (digital) 9783030965044
ISBN: (print) 9783030965044;9783030965037
With the "Emission Peak" and "Carbon Neutrality" strategy proposed, the transformation and reform of the energy system are imminent. Building a new power system composed mainly of renewable power is one of the most important measures for achieving the dual carbon goals. The stability and reliability of the new power system have become the focus of attention. Therefore, it is necessary to minimize the impact of distributed generation integration on the power system and to find the optimal configuration model. By studying the working mechanisms of distributed generation such as wind power, photovoltaic power, and energy storage, this paper summarizes the working strategies and operation modes of distributed power during grid-connected and islanded operation, and further designs a distributed generation integration model based on Colored Petri Nets for optimization. By deducing the configuration optimization of the grid connection model in terms of the grid access point, network loss, and economy, the analysis logic finally outputs the optimal grid access point, the optimal grid connection capacity, and the optimal configuration. The deduction shows that the model provides a new optimization method for the grid connection of distributed generation, which makes up for the relative instability of traditional grid connection in power systems.
ISBN: (print) 9798350347456
With the advancement of power systems, grid operations monitoring has become more complex. To ensure the sustainable operation of the grid, it is important to quickly and dynamically extract, from a large amount of real-time data, the small amount of data closely related to grid security, and to identify system security risks accurately and in a timely manner. In this regard, this paper proposes an artificially intelligent sustainable management system for the grid, which has achieved remarkable results. The main contribution of this paper is the development of a risk identification model for microgrids using artificial intelligence and the DEMATEL method. Firstly, the paper identifies the risk factors involved in the microgrid from four aspects (the generation side, the distribution side, the demand side, and human factors) through a comprehensive literature review, expert survey, and brainstorming. Secondly, an artificial intelligence and DEMATEL-based microgrid risk factor identification method is employed to clarify the importance and perturbation relationship of each risk factor of the microgrid. Finally, the paper classifies all factors into two categories, cause factors and effect factors, and ranks the importance of each factor. To further demonstrate the effectiveness of the proposed model, a wind power prediction algorithm based on data mining technology and an improved SVM algorithm, and a PV power prediction algorithm based on a deep neural network, are established. After comparing and analyzing the performance of the constructed algorithms with other algorithms, the DBN prediction model is proposed, which has an absolute error probability of 62.9% within 1%, surpassing the other algorithms and meeting the engineering needs. Moreover, the paper proposes risk control measures on the power generation side, which can significantly reduce the risks involved in power generation. In summary, the paper proposes an artificially intelligent sustainable management system
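The cause/effect classification described above follows the general shape of the classical DEMATEL procedure, which can be sketched as follows. This is plain textbook DEMATEL under stated assumptions, not the paper's AI-assisted variant. The 3x3 direct-influence matrix is invented for illustration. The normalization by the largest row or column sum is one common convention. The total-relation matrix T = N(I - N)^{-1} is obtained here by fixed-point iteration rather than explicit inversion.

```python
def dematel(direct, tol=1e-12, max_iter=10_000):
    """Return (T, prominence, relation) for a square direct-influence matrix."""
    n = len(direct)
    row_sums = [sum(row) for row in direct]
    col_sums = [sum(direct[i][j] for i in range(n)) for j in range(n)]
    # Normalize so every row and column sum is at most 1 (one common convention).
    scale = max(max(row_sums), max(col_sums))
    N = [[v / scale for v in row] for row in direct]

    # Total-relation matrix T = N + N^2 + ... via the fixed point T = N + N T.
    T = [row[:] for row in N]
    for _ in range(max_iter):
        nxt = [[N[i][j] + sum(N[i][k] * T[k][j] for k in range(n))
                for j in range(n)] for i in range(n)]
        delta = max(abs(nxt[i][j] - T[i][j]) for i in range(n) for j in range(n))
        T = nxt
        if delta < tol:
            break
    else:
        raise ArithmeticError("series did not converge")

    D = [sum(T[i]) for i in range(n)]                        # influence given
    R = [sum(T[i][j] for i in range(n)) for j in range(n)]   # influence received
    prominence = [d + r for d, r in zip(D, R)]               # overall importance
    relation = [d - r for d, r in zip(D, R)]                 # > 0 cause, < 0 effect
    return T, prominence, relation


# Hypothetical expert scores: entry [i][j] is how strongly factor i affects factor j.
A = [[0, 3, 2],
     [1, 0, 3],
     [0, 1, 0]]
T, prominence, relation = dematel(A)
roles = ["cause" if r > 0 else "effect" for r in relation]
```

Factors with a positive `relation` value give more influence than they receive and fall into the cause group. Sorting by `prominence` gives an importance ranking.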
Within the trend of intelligence, task cooperation has been emerging as one main feature for Multiple Robot Systems (MRS) and applications, typically Multi-UAV (MUAV) system. For such swarm cooperative systems, how t...
ISBN: (digital) 9798331521349
ISBN: (print) 9798331521356
High-performance computing (HPC) has become an indispensable tool in various scientific and engineering domains, demanding efficient infrastructure to support computationally intensive tasks. Cloud computing platforms, such as Amazon Web Services (AWS), have emerged as a viable solution for provisioning HPC resources. In this work, we explore the performance of HPC workloads running on AWS, utilizing Elastic Fabric Adapter (EFA) interconnect technology and Intel Xeon Scalable Processors. The investigation focuses on Message Passing Interface (MPI) applications, evaluating the system's efficiency, scalability, and the benefits of EFA integration. We present experimental results and insights into optimizing HPC workloads in the AWS environment, contributing to the ongoing discussion on achieving high-performance computational capabilities in the cloud.
ISBN: (print) 9798350399806
The optimization of job offloading procedures in modern vehicular networks is a problem of utmost importance. In this regard, this paper proposes MANTRA, a distributed framework based on multi-player multi-armed bandit (MPMAB) algorithms for latency- and energy-aware job offloading in vehicular networks. The main goal of MANTRA is to support job offloading procedures in green vehicular networks so as to achieve a target tradeoff between energy consumption and job processing latency. In particular, MANTRA is intended to run on so-called MEC-in-a-box (M-Box) devices, portable battery-powered Road Side Units (RSUs) specifically designed to work without mobile connectivity and without a fixed power grid. To demonstrate MANTRA's effectiveness, we model the vehicular network using queueing theory for M/M/m/K systems. We run an extensive evaluation campaign and compare MANTRA with several baselines, including a centralized, oracle-based approach. In this way, we demonstrate that MANTRA outperforms the baselines and quickly converges, in a fully distributed way, to the performance of the centralized approach in terms of job processing latency and network outage probability.
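For readers unfamiliar with the M/M/m/K model mentioned above, the sketch below computes its standard steady-state distribution: Poisson arrivals at rate `lam`, `m` exponential servers each with rate `mu`, and at most `K` jobs in the system. The function names are illustrative. How the paper maps these quantities to its own latency and outage metrics is not stated in the abstract, so the blocking probability here is only the textbook quantity.

```python
from math import factorial


def mmmk_distribution(lam: float, mu: float, m: int, K: int) -> list:
    """Steady-state probabilities p_0..p_K of an M/M/m/K queue (K >= m >= 1)."""
    if not (K >= m >= 1) or lam <= 0 or mu <= 0:
        raise ValueError("need K >= m >= 1 and positive rates")
    a = lam / mu  # offered load in Erlangs
    weights = []
    for n in range(K + 1):
        if n <= m:
            # Fewer jobs than servers: n servers are busy.
            weights.append(a ** n / factorial(n))
        else:
            # All m servers busy, n - m jobs waiting.
            weights.append(a ** n / (factorial(m) * m ** (n - m)))
    total = sum(weights)
    return [w / total for w in weights]


def blocking_probability(lam: float, mu: float, m: int, K: int) -> float:
    """Probability that an arriving job finds the system full and is dropped."""
    return mmmk_distribution(lam, mu, m, K)[K]


def mean_jobs_in_system(lam: float, mu: float, m: int, K: int) -> float:
    """Expected number of jobs in the system (in service plus waiting)."""
    return sum(n * p for n, p in enumerate(mmmk_distribution(lam, mu, m, K)))
```

With `K == m` there is no waiting room and `blocking_probability` reduces to the Erlang B formula, which gives a quick sanity check.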
Regenerating codes are new network codes proposed to reduce the data required for fault repair, which can improve the recovery efficiency of faulty nodes in data storage systems. However, unlike Reed-Solomon code, whi...
With the growing interconnection of massive numbers of terminals, the high maintenance of equipment data, and the increasing complexity of network structure in distribution systems, the traditional information processing mode based on a single cloud master station has disadvantages such as high data transmission delay, heavy congestion, information located far from the equipment, and slow queued processing. Edge computing can perform fast local analysis and decision-making on data near the terminal device side, which effectively makes up for the deficiencies of cloud computing. As the cornerstone of edge computing, the edge control processing node is the first prerequisite for fast-response control of a cyber physical distribution system (CPDS). In this paper, a new method of edge computing node (ECN) deployment is proposed for the distributed edge computing requirements of CPDS, considering both cyber stability and power stability. Community theory based on distribution network power flow and line impedance is used to divide the edge computing autonomous regions, a calculation model of the cyber stability and power stability of the CPDS network is established, and the optimal deployment of CPDS edge computing nodes is then solved by a hybrid entropy calculation over the cyber and power domains. Finally, an improved standard example is used for simulation analysis. The simulation results show that the proposed ECN deployment scheme can effectively reduce network congestion, improve network response speed, and realize optimal control of the distribution network. (C) 2022 The Authors. Published by Elsevier Ltd.