ISBN (print): 9783031477201; 9783031477218
Large deep learning models have shown great potential for delivering exceptional results in various applications. However, the training process can be incredibly challenging due to these models' vast parameter counts, often reaching hundreds of billions of parameters. Common distributed training methods, such as data parallelism, tensor parallelism, and pipeline parallelism, demand significant data communication throughout the process, leading to prolonged wait times for some machines in geographically dispersed distributed systems. To address this issue, we propose a novel solution called Hulk, which utilizes a modified graph neural network to optimize distributed computing systems. Hulk not only optimizes data communication efficiency between machines in different countries, or in different regions of the same city, but also determines an optimal parallel deployment of the model: for example, it can place certain layers on a machine in a specific region, or send specific parameters of the model to a machine in a particular location. In our experiments, Hulk improved the time efficiency of training large deep learning models on distributed systems by more than 20%. Our open-source collection of unlabeled data: https://***/DLYuanGod/Hulk.
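The abstract does not spell out how the modified graph neural network produces a placement, so the following is only a minimal sketch of the general idea in Python/PyTorch: machines form a graph whose edge weights reflect link quality, a small GNN embeds them, and each model layer is greedily assigned to its highest-scoring machine. Every name here (PlacementGNN, GCNLayer, the feature choices) is a hypothetical illustration under these assumptions, not Hulk's actual architecture.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph-convolution step: mix each machine's features with its
    neighbors' over the (row-normalized) connectivity graph."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj_norm):
        return torch.relu(self.lin(adj_norm @ x))

class PlacementGNN(nn.Module):
    """Embed machines with two GCN layers, then score every
    (model layer, machine) pair for placement."""
    def __init__(self, machine_dim, layer_dim, hidden=64):
        super().__init__()
        self.g1 = GCNLayer(machine_dim, hidden)
        self.g2 = GCNLayer(hidden, hidden)
        self.layer_proj = nn.Linear(layer_dim, hidden)

    def forward(self, machine_feats, adj_norm, layer_feats):
        m = self.g2(self.g1(machine_feats, adj_norm), adj_norm)  # (M, hidden)
        l = self.layer_proj(layer_feats)                          # (L, hidden)
        return l @ m.T                                            # (L, M) scores

# Toy cluster: 4 machines, edge weights standing in for inter-region bandwidth.
M, L = 4, 6
adj = torch.tensor([[1.0, 0.9, 0.1, 0.1],
                    [0.9, 1.0, 0.1, 0.1],
                    [0.1, 0.1, 1.0, 0.8],
                    [0.1, 0.1, 0.8, 1.0]])
adj_norm = adj / adj.sum(dim=1, keepdim=True)   # row-normalize
machine_feats = torch.randn(M, 8)               # e.g. region, memory, FLOPs
layer_feats = torch.randn(L, 5)                 # e.g. param count, activation size

scores = PlacementGNN(machine_dim=8, layer_dim=5)(machine_feats, adj_norm, layer_feats)
placement = scores.argmax(dim=1)  # greedy: one machine index per layer
print(placement)                  # varies with the random features
```

In a real system the scoring head would be trained against measured step times or communication costs; the sketch stops at the forward pass because the abstract gives no training details.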
Context: Component-based middleware, such as the Lightweight CORBA Component Model, is increasingly used to implement enterprise distributed real-time and embedded (DRE) systems. In addition to supporting the quality-of-service (QoS) requirements of individual DRE systems, component technologies must also support bounded latencies when effecting deployment changes to DRE systems in response to changing environmental conditions and operational requirements. Objective: The goals of this paper are to (1) study sources of inefficiency and non-deterministic performance in deployment capabilities for DRE systems and (2) devise solutions to overcome these performance problems. Method: The paper makes two contributions to the study of the deployment and configuration of distributed component-based applications. First, we analyze how conventional implementations of the OMG's Deployment and Configuration (D&C) specification for component-based systems can significantly degrade deployment latencies. Second, we describe architectural changes and performance optimizations implemented within the Locality-Enhanced Deployment and Configuration Engine (LE-DAnCE) implementation of the D&C specification to obtain efficient and deterministic deployment latencies. Results: We analyze the performance of LE-DAnCE in the context of component deployments on 10 nodes for a representative DRE system consisting of 1000 components, and in a cluster environment with up to 100 nodes. Our results show that LE-DAnCE's optimizations provide a bounded deployment latency of less than 2 s for the 1000-component scenario, with only 4 percent jitter. Conclusion: The improvements contained in the LE-DAnCE infrastructure provide an efficient and scalable standards-based deployment system for component-based enterprise DRE systems. In particular, deployment-time parallelism can improve deployment latency significantly, both during pre-deployment analysis of the deployment plan and during the process of installation.
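LE-DAnCE itself is CORBA/C++ middleware whose API is not shown in the abstract; the sketch below only illustrates, in Python, why the deployment-time parallelism highlighted in the conclusion bounds latency: pushing each node's slice of the deployment plan concurrently costs roughly the slowest node's time rather than the sum over all nodes. install_on_node and the plan layout are hypothetical stand-ins, not LE-DAnCE interfaces.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def install_on_node(node, components):
    """Stand-in for sending a node its slice of the deployment plan
    (install, connect, configure, activate) via a local agent."""
    time.sleep(0.1 * len(components))  # pretend each component costs ~100 ms
    return node, len(components)

# 10 nodes x 10 components, mirroring the scale of the paper's scenarios.
plan = {f"node-{i}": [f"comp-{i}-{j}" for j in range(10)] for i in range(10)}

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(plan)) as pool:
    results = list(pool.map(lambda kv: install_on_node(*kv), plan.items()))
elapsed = time.perf_counter() - start

# Serial deployment would cost ~10 nodes x 10 comps x 100 ms = ~10 s;
# the parallel version finishes in roughly the slowest node's ~1 s.
print(f"deployed {sum(n for _, n in results)} components in {elapsed:.2f}s")
```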