ISBN (Print): 9798400701184
Recent years have witnessed the emergence of a new class of cooperative edge systems in which a large number of edge nodes can collaborate through local peer-to-peer connectivity. In this paper, we propose CoEdge, a novel cooperative edge system that can support concurrent data/compute-intensive deep learning (DL) models for distributed real-time applications such as city-scale traffic monitoring and autonomous driving. First, CoEdge includes a hierarchical DL task scheduling framework that dispatches DL tasks to edge nodes based on their computational profiles, communication overhead, and real-time requirements. Second, CoEdge can dramatically increase the execution efficiency of DL models by batching sensor data and aggregating the inferences of the same model. Finally, we propose a new edge containerization approach that enables an edge node to execute concurrent DL tasks by partitioning the CPU and GPU workloads into different containers. We extensively evaluate CoEdge on a self-deployed smart lamppost testbed on a university campus. Our results show that CoEdge achieves up to an 82.32% reduction in deadline miss rate compared to baselines.
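As a rough illustration of the batching idea described above, the Python sketch below shows a deadline-ordered queue that groups inference requests targeting the same DL model, so that one batched forward pass can serve several requests. The names (BatchingScheduler, InferenceTask, max_batch) and the greedy grouping rule are hypothetical and not taken from CoEdge; this is a minimal sketch of the general pattern, not the paper's scheduler.

```python
import heapq
import time
from dataclasses import dataclass, field

@dataclass(order=True)
class InferenceTask:
    deadline: float                        # absolute deadline in seconds (only compared field)
    model_name: str = field(compare=False)
    payload: object = field(compare=False)

class BatchingScheduler:
    """Deadline-ordered queue that groups tasks targeting the same model."""

    def __init__(self, max_batch: int = 8):
        self.max_batch = max_batch
        self._queue: list[InferenceTask] = []

    def submit(self, task: InferenceTask) -> None:
        heapq.heappush(self._queue, task)

    def next_batch(self) -> tuple[str, list[InferenceTask]]:
        """Pop the most urgent task, then greedily pull further tasks for the
        same model (up to max_batch) so one batched inference serves them all."""
        head = heapq.heappop(self._queue)
        batch, remaining = [head], []
        while self._queue and len(batch) < self.max_batch:
            t = heapq.heappop(self._queue)
            (batch if t.model_name == head.model_name else remaining).append(t)
        for t in remaining:
            heapq.heappush(self._queue, t)
        return head.model_name, batch

# Usage: requests for the same model are served together, in deadline order.
sched = BatchingScheduler(max_batch=4)
now = time.time()
sched.submit(InferenceTask(now + 0.10, "traffic_detector", "frame_a"))
sched.submit(InferenceTask(now + 0.05, "traffic_detector", "frame_b"))
sched.submit(InferenceTask(now + 0.20, "plate_reader", "frame_c"))
model, batch = sched.next_batch()
print(model, [t.payload for t in batch])
```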
Distributed training of deep neural networks (DNNs) suffers from efficiency declines in dynamic heterogeneous environments due to the resource wastage caused by the straggler problem in data parallelism (DP) and by pipeline bubbles in model parallelism (MP). Additionally, limited resource availability requires a trade-off between training performance and long-term cost, particularly in online settings. To address these challenges, this article presents a novel online approach that maximizes long-term training efficiency in heterogeneous environments through uneven data assignment and communication-aware model partitioning. A group-based hierarchical architecture combining DP and MP is developed to balance disparate computation and communication capabilities and to offer a flexible parallel mechanism. To jointly optimize the performance and long-term cost of the online DL training process, we formulate the problem as a stochastic optimization with time-averaged constraints. Using Lyapunov stochastic network optimization theory, we decompose it into a series of instantaneous sub-problems and devise an effective online solution based on tentative searching and linear solving. We have implemented a prototype system and evaluated our solution in realistic experiments, reducing batch training time by up to 68.59% over state-of-the-art methods.
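To make the drift-plus-penalty pattern behind a Lyapunov-style decomposition concrete, here is a small Python sketch under simplifying assumptions: data is assigned unevenly in proportion to per-worker throughput, and a single time-averaged cost constraint is tracked by a virtual queue. The function names, the candidate configuration list, and the trade-off parameter V are hypothetical and are not taken from the article.

```python
def assign_batches(global_batch: int, throughputs: list[float]) -> list[int]:
    """Split a global batch unevenly, proportional to per-worker throughput
    (samples/s), so faster workers get more data and stragglers get less."""
    total = sum(throughputs)
    shares = [int(global_batch * t / total) for t in throughputs]
    shares[0] += global_batch - sum(shares)      # hand the rounding remainder to worker 0
    return shares

def drift_plus_penalty_step(q: float, options: list[tuple[float, float]],
                            budget: float, V: float = 10.0) -> tuple[int, float]:
    """Pick the configuration minimizing V*latency + q*cost for this slot,
    then update the virtual queue q that tracks the time-averaged cost budget."""
    best = min(range(len(options)), key=lambda i: V * options[i][0] + q * options[i][1])
    latency, cost = options[best]
    q_next = max(q + cost - budget, 0.0)
    return best, q_next

# Usage
print(assign_batches(256, [3.0, 2.0, 1.0]))      # -> [129, 85, 42]
choice, q = drift_plus_penalty_step(q=0.0,
                                    options=[(1.2, 5.0), (0.8, 9.0)],
                                    budget=6.0)
print(choice, q)                                  # picks the faster but costlier option; q grows
```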
Building a distributed deep learning (DDL) system on HPC clusters that guarantees convergence speed and scalability for DNN training is challenging. An HPC cluster consisting of multiple high-density multi-GPU servers connected by an InfiniBand network (HDGib) compresses the computing and communication time of distributed DNN training, but it also brings new challenges: convergence time is far from linearly scalable (with respect to the number of workers) for parallel DNN training. We therefore analyze the optimization process and identify three key issues that cause scalability degradation. First, the high-frequency parameter updates resulting from the compressed computing and communication times exacerbate the stale-gradient problem, which slows down convergence. Second, previous methods for constraining the gradient noise (stochastic error) of SGD are outdated, as HDGib clusters with InfiniBand connections can support stricter constraints that further bound the stochastic error. Third, using the same learning rate for all workers is inefficient because each worker is at a different training stage. We therefore propose a momentum-driven adaptive synchronization model that addresses these issues and accelerates the training procedure on HDGib clusters. Our adaptive k-synchronization algorithm uses the momentum term to absorb stale gradients and adaptively bounds the stochastic error to provide an approximately optimal descent direction for distributed SGD. Our model also includes an individual dynamic learning-rate search method for each worker to further improve training performance; compared with previous linear and exponential decay methods, it provides a more precise descent distance for distributed SGD at different training stages. Extensive experimental results indicate that the proposed model effectively improves the training performance of CNNs, which retains high accuracy with a speed-up o
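The following Python sketch illustrates, under simplifying assumptions, how a momentum buffer can absorb stale gradients in a k-synchronous update: gradients from the k fastest workers are down-weighted by their staleness and folded into the momentum term before the parameter step. The class name MomentumKSync and the 1/(1+staleness) discount are illustrative choices, not the paper's adaptive k-synchronization algorithm or its learning-rate search.

```python
import numpy as np

class MomentumKSync:
    """Hypothetical sketch of k-synchronous SGD with a momentum term that
    absorbs stale gradients (not the paper's exact algorithm)."""

    def __init__(self, dim: int, lr: float = 0.1, beta: float = 0.9, k: int = 2):
        self.w = np.zeros(dim)        # model parameters
        self.v = np.zeros(dim)        # momentum buffer
        self.lr, self.beta, self.k = lr, beta, k

    def step(self, grads: list[tuple[int, np.ndarray]], current_version: int) -> None:
        """grads: (model_version_used, gradient) pairs from the k fastest workers.
        Stale gradients are down-weighted before entering the momentum buffer."""
        agg = np.zeros_like(self.w)
        for version, g in grads[: self.k]:
            staleness = max(current_version - version, 0)
            agg += g / (1.0 + staleness)          # simple staleness discount
        agg /= min(self.k, len(grads))
        self.v = self.beta * self.v + agg          # momentum absorbs stale/noisy updates
        self.w -= self.lr * self.v

# Usage: two workers report gradients, one computed on an older model version.
opt = MomentumKSync(dim=3, k=2)
g_fresh = (5, np.array([0.2, -0.1, 0.05]))
g_stale = (3, np.array([0.4,  0.0, 0.10]))       # 2 versions stale -> weight 1/3
opt.step([g_fresh, g_stale], current_version=5)
print(opt.w)
```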