MARS: Multimacro Architecture SRAM CIM-Based Accelerator With Co-Designed Compressed Neural Networks

Authors: Sie, Syuan-Hao; Lee, Jye-Luen; Chen, Yi-Ren; Yeh, Zuo-Wei; Li, Zhaofang; Lu, Chih-Cheng; Hsieh, Chih-Cheng; Chang, Meng-Fan; Tang, Kea-Tiong

Author affiliations: Natl Tsing Hua Univ, Hsinchu 30013, Taiwan; Ind Technol Res Inst, Informat & Commun Labs, Chutung 31030, Taiwan

Publication: IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS (IEEE Trans Comput Aided Des Integr Circuits Syst)

Year, volume, issue: 2022, Vol. 41, No. 5

Pages: 1550-1562

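Subject classification: 0808 [Engineering - Electrical Engineering]; 08 [Engineering]; 0812 [Engineering - Computer Science and Technology (degrees may be conferred in Engineering or Science)]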

Funding: Ministry of Science and Technology, Taiwan [MOST 109-2218-E-007-019, MOST 109-2262-8-007-022]

Subjects: Random access memory; Hardware; Computer architecture; Quantization (signal); Training; Common Information Model (computing); Software; Compression algorithm; computing-in-memory (CIM); deep learning; quantization

Abstract: Convolutional neural networks (CNNs) play a key role in deep learning applications. However, the large storage overheads and the substantial computational cost of CNNs are problematic in hardware accelerators. Computing-in-memory (CIM) architecture has demonstrated great potential to effectively compute large-scale matrix-vector multiplication. However, the intensive multiply-and-accumulate (MAC) operations executed on CIM macros remain bottlenecks for further improvement of energy efficiency and throughput. To reduce computational costs, model compression is a widely studied method to shrink the model size. For implementation in a static random access memory (SRAM) CIM-based accelerator, the model compression algorithm must consider the hardware limitations of CIM macros. In this study, a software and hardware co-design approach is proposed to design MARS, an SRAM-based CIM (SRAM CIM)-based CNN accelerator that can utilize multiple SRAM CIM macros as processing units and support a sparse CNN, and an SRAM CIM-aware model compression algorithm that considers a CIM architecture to reduce the number of network parameters. With the proposed hardware-software co-designed method, MARS can reach over 700 and 400 FPS for CIFAR-10 and CIFAR-100, respectively. In addition, MARS achieves 52.3 and 88.2 TOPS/W in VGG16 and ResNet18, respectively.
