Since its creation, the ImageNet-1k dataset has played a significant role as a benchmark for ascertaining the accuracy of different deep neural network (DNN) models on the image-classification problem. Moreover, in recent years it has also served as the principal benchmark for assessing different approaches to DNN training. Finishing a 90-epoch ImageNet-1k training run with ResNet-50 on an NVIDIA M40 GPU takes 14 days. This training requires 10^18 single-precision operations in total. On the other hand, the world's current fastest supercomputer can perform 3 × 10^17 single-precision operations per second (according to the November 2018 Top500 results). If we could make full use of the computing capability of the fastest supercomputer, we should be able to finish the training in several seconds. Over the last two years, researchers have focused on closing this significant performance gap by scaling DNN training to larger numbers of processors. Most successful approaches to scaling ImageNet training have used synchronous mini-batch stochastic gradient descent (SGD). However, to scale synchronous SGD one must also increase the batch size used in each iteration. Thus, for many researchers, the focus on scaling DNN training has translated into a focus on developing training algorithms that enable increasing the batch size in data-parallel synchronous SGD without losing accuracy over a fixed number of epochs. In this paper, we investigate the capability of supercomputers to speed up DNN training. Our approach is to use a large batch size, powered by the Layer-wise Adaptive Rate Scaling (LARS) algorithm, for efficient usage of massive computing resources. Our approach is generic, as we empirically evaluate its effectiveness on five neural networks: AlexNet, AlexNet-BN, GNMT, ResNet-50, and ResNet-50-v2, trained with large datasets while preserving state-of-the-art test accuracy. Compared to the baseline of a previous study from Goyal et al. [1], our approach shows higher
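The core of the LARS algorithm mentioned above is a layer-wise "trust ratio" that rescales the global learning rate by the ratio of the weight norm to the gradient norm for each layer, which is what makes very large batch sizes trainable. A minimal NumPy sketch of one such update is shown below; the function name and hyperparameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def lars_update(w, grad, global_lr=0.1, trust_coef=0.001, weight_decay=5e-4):
    """One LARS step for a single layer's weight tensor.

    The local learning rate is scaled by the trust ratio
    trust_coef * ||w|| / ||grad + weight_decay * w||, so layers with
    small gradients relative to their weights still make progress.
    """
    g = grad + weight_decay * w          # gradient with weight decay folded in
    w_norm = np.linalg.norm(w)
    g_norm = np.linalg.norm(g)
    # Guard against zero norms (e.g. freshly zero-initialized biases).
    if w_norm > 0 and g_norm > 0:
        trust_ratio = trust_coef * w_norm / g_norm
    else:
        trust_ratio = 1.0
    return w - global_lr * trust_ratio * g
```

In practice this per-layer scaling replaces a single hand-tuned learning rate for the whole network, which is why it tolerates the large batches that data-parallel synchronous SGD requires.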
ISBN:
(Print) 9783540730101
In recent years, thousands or even hundreds of thousands of players interact with each other in a Massively Multi-player Online Game (MMOG). MMOG servers therefore face a scalability problem. To overcome this problem, we propose a new method for distributed MMOG servers, denoted the 2Layer-Cell method. The 2Layer-Cell method consists of an Upper-Layer and a Down-Layer. The Upper-Layer holds important aggregated information about game objects such as virtual space, users, and monsters, while the Down-Layer holds the real data of the game objects. This paper makes the following contributions. First, it identifies the problems of high storage cost and slow processing time in previous methods. Second, it proposes parallel-processing strategies that aim to reduce processing time. Third, it proposes an efficient partitioning algorithm for distributed servers. Our experimental results show that our method scales better than existing methods.
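The abstract gives no pseudocode, but the two-layer idea can be sketched as follows: the Down-Layer stores full game-object records grouped by spatial cell, the Upper-Layer stores only aggregated per-cell summaries, and a balancer assigns cells to servers using the summaries alone. The cell names, object schema, and the greedy balancing heuristic are illustrative assumptions, not the authors' actual partitioning algorithm.

```python
from collections import defaultdict

# Down-Layer: full game-object records, grouped by spatial cell (illustrative data).
down_layer = {
    "cell_0": [{"type": "player", "id": 1}, {"type": "monster", "id": 7}],
    "cell_1": [{"type": "player", "id": 2}],
}

def build_upper_layer(down):
    """Upper-Layer: aggregated object counts per cell, so a balancer can
    reason about load without reading the full object data."""
    upper = {}
    for cell, objects in down.items():
        counts = defaultdict(int)
        for obj in objects:
            counts[obj["type"]] += 1
        upper[cell] = dict(counts)
    return upper

def assign_cells(upper, num_servers):
    """Greedy partitioning sketch: hand each cell (heaviest first) to the
    currently least-loaded server, weighting load by total object count."""
    loads = [0] * num_servers
    assignment = {}
    for cell, counts in sorted(upper.items(), key=lambda kv: -sum(kv[1].values())):
        s = loads.index(min(loads))
        assignment[cell] = s
        loads[s] += sum(counts.values())
    return assignment
```

Keeping only the small aggregated layer on the coordination path is what targets the high-storage-cost and slow-processing problems the paper attributes to earlier single-layer designs.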