ISBN (digital): 9798350383508
ISBN (print): 9798350383515
Feature-only partitioning of large graph data in distributed Graph Neural Network (GNN) training offers advantages over the commonly adopted graph-structure partitioning, such as minimal graph preprocessing cost and elimination of cross-worker subgraph sampling burdens. Nonetheless, the performance bottleneck of GNN training with feature-only partitions still largely lies in the substantial communication overhead of cross-worker feature fetching. To reduce this overhead and expedite distributed training, we first investigate and answer two key questions on the convergence behavior of the GNN model in feature-partition-based distributed GNN training: 1) As no worker holds a complete copy of each feature, can gradient exchange among workers compensate for the information loss due to incomplete local features? 2) If the answer to the first question is negative, is feature fetching in every training iteration of the GNN model necessary to ensure model convergence? Based on our theoretical findings on these questions, we derive an optimal communication plan that decides the frequency of feature fetching during training, taking into account bandwidth levels among workers and striking a balance between model loss and training time. Extensive evaluation yields results consistent with our theoretical analysis and demonstrates the effectiveness of our proposed design.
For a control problem with multiple conflicting objectives, there exists a set of Pareto-optimal policies called the Pareto set instead of a single optimal policy. When a multi-objective control problem is continuous ...
Edge computing moves cloud services closer to consumer Internet of Things (IoT) devices, reducing latency and bandwidth usage. This setup enables faster responses but also introduces new security challenges, particula...
ISBN (digital): 9798350308365
ISBN (print): 9798350308372
Path planning is the core of autonomous robot navigation: it helps the robot find a collision-free path to the destination based on environment information. Most current path planning methods consider only path length, but the optimal path may deviate from the shortest one when other environmental factors are considered, such as uneven terrain or regions with varying traversal costs. Similarly, in scenarios that prioritize energy efficiency, a sole focus on path length may lead to suboptimal solutions. In this paper, we propose MOEA/D-EAWA, an improved Multi-Objective Evolutionary Algorithm based on Decomposition (MOEA/D) with an adaptive weight vector, an external archive, and a constrained update strategy. The algorithm considers not only path length but also four additional objectives: smoothness, traveling time, terrain (elevation), and speed limit (expected delay). In addition, MOEA/D-EAWA is better suited to such many-objective path planning problems, which have an irregular, discrete, and sparse Pareto front. Simulation results on 90 map instances demonstrate that the proposed method outperforms existing approaches.
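To make the terms "decomposition" and "external archive" concrete, the following is a minimal sketch of two standard building blocks of MOEA/D-style algorithms, assuming all objectives are minimized. It is not the MOEA/D-EAWA implementation: the function names (`dominates`, `update_archive`, `tchebycheff`) and the two-objective sample data are hypothetical, and the abstract does not state which scalarization the authors use.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def update_archive(archive: List[Objectives], candidate: Objectives) -> List[Objectives]:
    """Return a new archive that holds only mutually non-dominated points."""
    # Reject the candidate if an archived point dominates or duplicates it.
    if any(dominates(kept, candidate) or kept == candidate for kept in archive):
        return archive
    # Otherwise drop every archived point the candidate dominates, then add it.
    return [kept for kept in archive if not dominates(candidate, kept)] + [candidate]


def tchebycheff(objs: Sequence[float], weights: Sequence[float],
                ideal: Sequence[float]) -> float:
    """Weighted Tchebycheff scalarization, one common choice in MOEA/D."""
    return max(w * abs(f - z) for f, w, z in zip(objs, weights, ideal))


# Hypothetical (path length, traveling time) pairs for four candidate paths.
archive: List[Objectives] = []
for point in [(10.0, 5.0), (8.0, 6.0), (9.0, 4.0), (12.0, 7.0)]:
    archive = update_archive(archive, point)
```

Each weight vector defines one scalar subproblem through `tchebycheff`, while the archive keeps the non-dominated solutions found so far independently of the weights. The same functions work unchanged for the five objectives named in the abstract.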
Federated learning (FL) is a nascent distributed learning paradigm to train a shared global model without violating users' privacy. FL has been shown to be vulnerable to various Byzantine attacks, where malicious ...
Software vulnerabilities are a major cyber threat and it is important to detect them. One important approach to detecting vulnerabilities is to use deep learning while treating a program function as a whole, known as ...
3D Anomaly Detection (AD) is a promising means of controlling the quality of manufactured products. However, existing methods typically require carefully training a task-specific model for each category independently,...
ISBN (print): 9798400704864
Evolutionary reinforcement learning algorithms (ERLs), which combine evolutionary algorithms (EAs) with reinforcement learning (RL), have demonstrated significant success in enhancing RL performance. However, most ERLs rely heavily on Gaussian mutation operators to generate new individuals. When the standard deviation is too large, this approach produces poor offspring; when it is too small, it produces highly similar offspring. Both outcomes can be detrimental to the learning process of the RL agent, because these individuals generate too many poor or similar experiences. To alleviate these issues, this paper proposes an Adaptive Evolutionary Reinforcement Learning (AERL) method that adaptively adjusts both the standard deviation and the evaluation process. By tracking the performance of new individuals, AERL keeps the mutation strength within a suitable range without additional gradient computations. Moreover, AERL terminates unnecessary evaluations early and discards experiences arising from poor individuals, which improves learning efficiency. Empirical assessments on a variety of continuous control problems demonstrate the effectiveness of the AERL method.
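The abstract does not give AERL's update rule, so the sketch below substitutes the classic one-fifth success rule from evolution strategies to illustrate how a mutation standard deviation can be adapted from offspring performance alone, with no gradients. The names `adapt_sigma` and `mutate`, the target rate, the scaling factor, and the clamping bounds are all assumptions made for illustration.

```python
import random
from typing import List


def adapt_sigma(sigma: float, success_rate: float, target: float = 0.2,
                factor: float = 1.5, lo: float = 1e-3, hi: float = 1.0) -> float:
    """Widen the search when offspring often improve, narrow it when they rarely do."""
    if success_rate > target:
        sigma *= factor   # many offspring beat their parent: explore more boldly
    elif success_rate < target:
        sigma /= factor   # few offspring improve: take smaller steps
    # Clamp so the mutation strength stays inside a usable range.
    return min(hi, max(lo, sigma))


def mutate(params: List[float], sigma: float, rng: random.Random) -> List[float]:
    """Gaussian mutation: add zero-mean noise with standard deviation sigma."""
    return [p + rng.gauss(0.0, sigma) for p in params]


rng = random.Random(0)
child = mutate([0.0, 0.0, 0.0], sigma=0.1, rng=rng)
next_sigma = adapt_sigma(0.1, success_rate=0.5)  # half of the offspring improved
```

Here `success_rate` stands for the fraction of recent offspring that outperformed their parents, which is one simple way to "track the performance of new individuals".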
The attention-based neural network attracts great interest due to its excellent accuracy enhancement. However, the attention mechanism requires huge computational efforts to process unnecessary calculations, significa...
Heterogeneous memory systems have become increasingly popular in recent years. Because heterogeneous storage media often differ significantly in bandwidth, latency, capacity, and energy consumption, it remains challenging to make the best use of them in cost-efficient and energy-efficient heterogeneous memory systems. In this paper, we propose a simulation framework for multi-tiered heterogeneous memory architectures based on the GEM5 and DRAMsim3 simulators. We design a heterogeneous memory controller that architects Non-Volatile Memory (NVM) as main memory, and architects both Dynamic Random Access Memory (DRAM) and High-Bandwidth Memory (HBM) as a hybrid cache of NVM. Specifically, although HBM, DRAM, and NVM are managed in a single (flat) address space, we use an address remapping table to maintain the mappings between NVM pages and HBM/DRAM pages, and logically manage HBM/DRAM/NVM as a three-tiered hybrid memory system. We also design a hardware-supported hot page monitor based on the Majority Element Algorithm (MEA) to identify the hottest pages in the DRAM, and a dynamic threshold adjustment scheme for hot page migration to balance the memory bandwidth between DRAM and HBM. Our multi-tiered heterogeneous memory architecture can take advantage of the large capacity of NVM, the low latency of DRAM, and the high bandwidth of HBM concurrently. Experimental results show that our tiered memory architecture improves application performance by an average of $2.5\times$ compared with an NVM-only architecture, and by up to 57.4% compared with a DRAM-only architecture. Moreover, the performance gap between our HBM/DRAM/NVM architecture and an HBM-only architecture is less than 10%.
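As a rough software illustration of how a majority-element style monitor can find hot pages with a fixed number of counters, the sketch below uses the textbook Misra-Gries frequent-items scheme, a generalization of the Boyer-Moore majority vote. Whether the paper's hardware monitor follows exactly this variant is an assumption; the function name `mea_hot_pages`, the parameter `k`, and the access trace are hypothetical.

```python
from typing import Dict, Hashable, Iterable


def mea_hot_pages(accesses: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """Track at most k candidate hot pages using k counters (Misra-Gries)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    counters: Dict[Hashable, int] = {}
    for page in accesses:
        if page in counters:
            counters[page] += 1          # page is already tracked
        elif len(counters) < k:
            counters[page] = 1           # a free counter slot is available
        else:
            # No free slot: decrement every counter and evict those that hit zero.
            for tracked in list(counters):
                counters[tracked] -= 1
                if counters[tracked] == 0:
                    del counters[tracked]
    return counters


# Hypothetical trace of page numbers; page 7 accounts for 6 of 10 accesses.
trace = [7, 7, 3, 7, 9, 7, 3, 7, 5, 7]
candidates = mea_hot_pages(trace, k=2)
```

With `k` counters, any page that receives more than `n / (k + 1)` of `n` accesses is guaranteed to remain among the candidates, which is why a small fixed table can suffice. The stored counts are lower bounds on the true access counts, so a migration threshold applied to them errs on the conservative side.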