Search results - Inner Mongolia University Library

A Review of FPGA-Based Custom Computing Architecture for Convolutional Neural Network Inference

Chinese Journal of Electronics, 2021, vol. 30, no. 1, pp. 1-17

Authors: PENG Xiyuan; YU Jinxiang; YAO Bowen; LIU Liansheng; PENG Yu (School of Electronics and Information Engineering, Harbin Institute of Technology)

Convolutional neural network (CNN) has been widely adopted in many tasks. Its inference process is usually applied on edge devices where the computing resources and power consumption are [...] present, the performance of general processors cannot meet the requirement for CNN models with high computation complexity and large number of parameters. Field-programmable gate array (FPGA)-based custom computing architecture is a promising solution to further enhance the CNN inference [...] software/hardware co-design can effectively reduce the computing overhead, and improve the inference performance while ensuring accuracy. In this paper, the mainstream methods of CNN structure design, hardware-oriented model compression and FPGA-based custom architecture design are summarized, and the improvement of CNN inference performance is demonstrated through an example. Challenges and possible future research directions are summarized to foster research efforts in this domain.

Keywords: inference mechanisms; computing resources; hardware-oriented model compression; CNN structure design; computing overhead; convolutional neural nets; electronic engineering computing; field-programmable gate array-based custom computing architecture; computation complexity; FPGA-based custom computing architecture design; computational complexity; convolutional neural network inference; CNN inference performance; power consumption; field programmable gate arrays

High-Performance Mixed-Low-Precision CNN Inference Accelerator on FPGA

IEEE Micro, 2021, vol. 41, no. 4, pp. 31-38

Authors: Wang, Junbin; Fang, Shaoxia; Wang, Xi; Ma, Jiangsha; Wang, Taobo; Shan, Yi (Xilinx Inc., Beijing 100029, People's Republic of China)

Low-precision techniques can effectively reduce the computational complexity and bandwidth requirements of convolutional neural network (CNN) inference, but may lead to significant accuracy degradation. Mixed-low-precision techniques provide a superior approach for CNN inference, since they can take advantage of low precision while maintaining accuracy. In this article, we propose a high-performance, highly flexible W(8)A(8) (INT8 weight and INT8 activation) and W(T)A(2) (TERNARY weight and INT2 activation) mixed-precision CNN inference hardware architecture, DPUmxp, designed and implemented on a Xilinx Virtex UltraScale+ VU13P FPGA with peak performance up to 58.9 TOPS.

Keywords: Computational Complexity; Convolutional Neural Nets; Field Programmable Gate Arrays; Logic Design; Neural Nets; FPGA; High Performance Mixed Low Precision CNN Inference Accelerator; Bandwidth Requirements; Convolutional Neural Network Inference; Mixed Low Precision Techniques; INT 8 Weight; INT 8 Activation; TERNARY Weight; Mixed Precision CNN Inference Hardware Architecture; High Performance Computing; Convolution; Quantization (Signal); Ports (Computers); Buffer Storage; Hardware; Computer Architecture; Precision Engineering
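
The W(8)A(8) and W(T)A(2) configurations named in the abstract above can be illustrated with a small quantization sketch. This is not the DPUmxp method, which the record does not describe. The function names, the symmetric per-tensor INT8 scaling, the ternary threshold of 0.7 times the mean absolute weight, and the treatment of the 2-bit activation as unsigned (post-ReLU) are all assumptions chosen for illustration.

```python
def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization (assumed scheme).

    Returns integer codes in [-127, 127] and the scale that maps them back.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def quantize_ternary(values, threshold_ratio=0.7):
    """Ternary quantization to {-1, 0, +1} (assumed threshold heuristic).

    Values whose magnitude is at most threshold_ratio * mean(|v|) become 0.
    The scale is the mean magnitude of the values that were kept.
    """
    mean_abs = sum(abs(v) for v in values) / len(values)
    threshold = threshold_ratio * mean_abs
    codes = [0 if abs(v) <= threshold else (1 if v > 0 else -1) for v in values]
    kept = [abs(v) for v, c in zip(values, codes) if c != 0]
    scale = sum(kept) / len(kept) if kept else 1.0
    return codes, scale


def quantize_uint2(values):
    """Unsigned 2-bit quantization to {0, 1, 2, 3}.

    Assumes non-negative (post-ReLU) activations; negatives clamp to 0.
    """
    max_val = max(values)
    scale = max_val / 3 if max_val > 0 else 1.0
    codes = [max(0, min(3, round(v / scale))) for v in values]
    return codes, scale


if __name__ == "__main__":
    weights = [0.9, -0.05, 0.4, -0.8]
    activations = [0.0, 0.3, 1.2, 0.75]
    print("W8:", quantize_int8(weights))
    print("WT:", quantize_ternary(weights))
    print("A2:", quantize_uint2(activations))
```

The sketch shows the trade-off the abstract refers to: ternary weights and 2-bit activations need far fewer bits per value than INT8, at the cost of a much coarser representation.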

Very-large-scale integration implementation of a convolutional neural network accelerator for abnormal heartbeat detection

Electronics Letters, 2020, vol. 56, no. 7, pp. 330-+

Authors: Chen, Y.-H.; Juan, Y. (Chang Gung Univ, Dept Elect Engn, Taoyuan, Taiwan; Chang Gung Mem Hosp LinKou, Inst Radiol Res, Dept Radiat Oncol, Taoyuan, Taiwan)

In this study, a very-large-scale integration implementation of a convolutional neural network (CNN) inference for abnormal heartbeat detection was proposed. Four-lead electrocardiogram signals were used to detect abnormal heartbeat conditions, such as premature ventricular complex. 1D CNNs and fully connected layers were utilised in the proposed chip to achieve high-speed, small-area, and high-accuracy arrhythmia detection. The proposed chip was implemented using a 90-nm complementary metal-oxide-semiconductor process and operated at 125 MHz with a 0.67 mm² core area. The power consumption was 4.18 mW at high-speed operation frequency (125 MHz) and 3.79 µW at 10 kHz for low-power applications. The detection accuracy was 95.14% based on the MIT-BIH arrhythmia database. Consequently, the properties of high speed, low power, small area, and high accuracy were established in the proposed accelerator chip.

Keywords: medical signal processing; convolutional neural network inference; fully connected layers; high-speed operation frequency; medical signal detection; detection accuracy; electrocardiography; four-lead electrocardiogram signals; high-accuracy arrhythmia detection; neural nets; abnormal heartbeat detection; complementary metal-oxide-semiconductor process; premature ventricular complex; convolutional neural network accelerator; very-large-scale integration implementation; accelerator chip; frequency 125.0 MHz
