Author affiliation: Univ Edinburgh, Inst Integrated Micro & Nano Syst, Ctr Elect Frontiers, Sch Engn, Edinburgh EH9 3BF, Scotland
Publication: IEEE Transactions on Circuits and Systems I: Regular Papers (IEEE Trans. Circuits Syst. Regul. Pap.)
Year, volume, issue: 2025, Vol. 72, No. 5
Pages: 2263-2273
Core indexing:
Funding: Engineering and Physical Sciences Research Council (EPSRC) Programme Grant "Functional Oxide Reconfigurable Technologies" (FORTE) [EP/R024642/2]; RAEng Chair in Emerging Technologies [CiET1819/2/93]
Subject: Artificial intelligence; convolutional neural networks; systolic arrays; field programmable gate arrays; memory accesses; energy efficiency
Abstract: Besides targeting high performance, modern hardware architectures for Convolutional Neural Networks (CNNs) aim to limit energy dissipation. Reducing the cost of data movement between the computing cores and the memory is one way to lower energy consumption. Systolic arrays are well suited to this objective: they use multiple processing elements that communicate with each other to maximize data utilization, based on dataflows such as weight stationary and row stationary. Motivated by this, we have proposed TrIM, an innovative dataflow based on a triangular movement of inputs, capable of reducing the number of memory accesses by one order of magnitude compared to state-of-the-art systolic arrays. In this paper, we present a TrIM-based hardware architecture for CNNs. As a showcase, the accelerator is implemented on a Field Programmable Gate Array (FPGA) to execute the VGG-16 and AlexNet CNNs. The architecture achieves a peak throughput of 453.6 Giga Operations per Second, outperforming a state-of-the-art row stationary systolic array by up to ~3x in terms of memory accesses, and being up to ~11.9x more energy-efficient than other FPGA accelerators.
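Illustrative note: the abstract's point that reusing data on chip reduces memory accesses can be made concrete with a rough count. The sketch below is not the TrIM dataflow and does not reproduce any figure from the paper. It only compares two generic bounds for a single-channel, stride-1, unpadded convolution: fetching the full K x K window from memory for every output pixel, versus fetching each input pixel once and reusing it. The function names and the 224 x 224 input with a 3 x 3 kernel are assumptions chosen for illustration.

```python
def naive_input_reads(height: int, width: int, kernel: int) -> int:
    """Input reads when every output pixel fetches its whole K x K window from memory.

    Assumes a single channel, stride 1, and no padding (hypothetical setup).
    """
    out_h = height - kernel + 1
    out_w = width - kernel + 1
    return out_h * out_w * kernel * kernel


def ideal_input_reads(height: int, width: int) -> int:
    """Input reads when each input pixel is fetched once and then reused on chip."""
    return height * width


if __name__ == "__main__":
    # Illustrative sizes only; not taken from the paper's experiments.
    h = w = 224
    k = 3
    naive = naive_input_reads(h, w, k)
    ideal = ideal_input_reads(h, w)
    print(f"naive reads: {naive}, ideal reads: {ideal}, ratio: {naive / ideal:.2f}")
```

Real dataflows such as weight stationary, row stationary, or TrIM fall somewhere between these two bounds, depending on how much on-chip reuse the array achieves.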