Details
ISBN: (Print) 9781665442787
Emerging natural language processing (NLP) models have become larger and more complex in order to provide more sophisticated NLP services. Accordingly, there is a strong demand for scalable and flexible computing infrastructure to support these large-scale, complex, and diverse NLP models. However, existing proposals cannot provide enough scalability and flexibility: they neither identify nor optimize the wide spectrum of performance-critical operations appearing in recent NLP models, focusing instead only on specific operations. In this paper, we propose NLP-Fast, a novel system solution to accelerate a wide spectrum of large-scale NLP models. NLP-Fast consists of two main parts: (1) NLP-Perf, an in-depth performance analysis tool that identifies critical operations in emerging NLP models, and (2) NLP-Opt, three end-to-end optimization techniques that accelerate the identified performance-critical operations on various hardware platforms (e.g., CPU, GPU, FPGA). In this way, NLP-Fast can accelerate various types of NLP models on different hardware platforms by identifying their critical operations through NLP-Perf and applying NLP-Opt's holistic optimizations. We evaluate NLP-Fast on CPU, GPU, and FPGA, where it increases overall throughput by up to 2.92x, 1.59x, and 4.47x over each platform's baseline. We release NLP-Fast to the community so that users can easily run NLP-Fast's analysis and apply its optimizations to their own NLP applications.
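The abstract does not describe NLP-Perf's internals, but the kind of operation-level performance breakdown it refers to can be illustrated with a minimal sketch. In the snippet below, all names (`profile_op`, `toy_attention`) are hypothetical and not from the paper; it times named sub-operations of a toy attention kernel and ranks them by accumulated cost, which is the style of breakdown such an analysis tool would produce:

```python
import time
from collections import defaultdict
from contextlib import contextmanager

import numpy as np

# Accumulated wall-clock time per named operation.
_op_times = defaultdict(float)

@contextmanager
def profile_op(name):
    """Time one named operation and accumulate its cost."""
    start = time.perf_counter()
    try:
        yield
    finally:
        _op_times[name] += time.perf_counter() - start

def toy_attention(q, k, v):
    # Tag each sub-operation so the breakdown mirrors the model's structure.
    with profile_op("score_matmul"):
        scores = q @ k.T / np.sqrt(q.shape[-1])
    with profile_op("softmax"):
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
    with profile_op("context_matmul"):
        return weights @ v

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((128, 64)) for _ in range(3))
for _ in range(100):
    toy_attention(q, k, v)

# Rank operations by cost; the top entries are the optimization targets.
for name, t in sorted(_op_times.items(), key=lambda kv: -kv[1]):
    print(f"{name:>16s}: {t * 1e3:7.2f} ms")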
Details
ISBN: (Print) 9781450366694
Memory-augmented neural networks are attracting growing attention from researchers because they can make inferences using the previous history stored in memory. Among these memory-augmented neural networks, memory networks in particular are known for their strong reasoning power and their capability, compared to other networks, to learn from a large number of inputs. As the size of input datasets rapidly grows, the need for large-scale memory networks continues to grow as well. Such large-scale memory networks provide excellent reasoning power; however, the current computer infrastructure cannot achieve scalable performance due to its limited system architecture. In this paper, we propose MnnFast, a novel system architecture for large-scale memory networks to achieve fast and scalable reasoning performance. We identify the performance problems of the current architecture by conducting an extensive performance bottleneck analysis. Our in-depth analysis indicates that the current architecture suffers from three major performance problems: high memory bandwidth consumption, heavy computation, and cache contention. To overcome these problems, we propose three novel optimizations. First, to reduce memory bandwidth consumption, we propose a new column-based algorithm with streaming, which minimizes the size of data spills and hides most of the off-chip memory access overhead. Second, to decrease the high computational overhead, we propose a zero-skipping optimization to bypass a large amount of output computation. Lastly, to eliminate cache contention, we propose a dedicated embedding cache that efficiently caches the embedding matrix. Our evaluations show that MnnFast is significantly effective on various types of hardware: CPU, GPU, and FPGA. MnnFast improves overall throughput by up to 5.38x, 4.34x, and 2.01x on CPU, GPU, and FPGA, respectively. Also, compared to CPU-based MnnFast, our FPGA-based MnnFast achieves 6.54x higher energy efficiency.
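As a rough illustration of one of the three optimizations, the zero-skipping idea can be sketched in a few lines of NumPy: after the softmax, most attention weights in a large memory network are near zero, so the corresponding rows of the output memory contribute almost nothing to the weighted sum and can be skipped. The function name `memnet_output` and the threshold value below are assumptions for illustration, not MnnFast's actual implementation:

```python
import numpy as np

def memnet_output(query, memory_keys, memory_values, threshold=1e-4):
    """One memory-network hop with zero-skipping on the weighted sum.

    query:         (d,)    internal state u
    memory_keys:   (n, d)  embedded input memories m_i
    memory_values: (n, d)  embedded output memories c_i
    """
    scores = memory_keys @ query          # attention logits
    scores -= scores.max()                # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum()              # softmax attention p_i

    # Zero-skipping: drop rows whose attention weight is (near) zero,
    # bypassing their share of the weighted-sum computation.
    keep = weights > threshold
    return weights[keep] @ memory_values[keep]

rng = np.random.default_rng(0)
u = rng.standard_normal(64)
m = rng.standard_normal((10_000, 64))
c = rng.standard_normal((10_000, 64))

approx = memnet_output(u, m, c)
exact = memnet_output(u, m, c, threshold=0.0)  # keeps every row
print("max abs error vs. exact:", float(np.abs(approx - exact).max()))
```

Setting `threshold=0.0` keeps every row and recovers the exact result, so the comparison at the bottom checks that skipping near-zero weights changes the output only negligibly.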