Author affiliation: Pohang Univ Sci & Technol POSTECH Dept Mat Sci & Engn Pohang 37673 South Korea
Publication: ADVANCED INTELLIGENT SYSTEMS (Adv. Intell. Syst.)
Year/Volume/Issue: 2025, Vol. 7, No. 5
Core indexing:
Funding: Ministry of Trade, Industry and Energy [1415187475, 20024760]; K-CHIPS (Korea Collaborative & High-tech Initiative for Prospective Semiconductor Research); Ministry of Trade, Industry & Energy (MOTIE, Korea)
Subjects: analog in-memory computing; deep learning accelerator; device specification; neural network; Tiki-Taka algorithm
Abstract: Recently, specialized training algorithms for analog cross-point array-based neural network accelerators have been introduced to counteract device non-idealities such as update asymmetry and cycle-to-cycle variation, achieving software-level performance in neural network training. However, a quantitative analysis of how these algorithms relax device specifications has yet to be conducted. This study provides such an analysis by elucidating the device prerequisites for training with Tiki-Taka algorithm versions 1 (TTv1) and 2 (TTv2), which leverage the dynamics between multiple arrays to compensate for device non-idealities. A multiparameter simulation is conducted to assess the impact of device non-idealities, including asymmetry, retention, number of pulses, and cycle-to-cycle variation, on neural network training. Using pattern-recognition accuracy as the performance metric, the required device specifications for each algorithm are revealed. The results demonstrate that the standard stochastic gradient descent algorithm requires stringent device specifications. Conversely, TTv2 permits more lenient device specifications than TTv1 across all examined non-idealities. The analysis provides guidelines for the development, optimization, and utilization of devices for high-performance neural network training using Tiki-Taka algorithms. This study investigates the device specifications required for neural network training using analog resistive cross-point arrays with these training algorithms. By demonstrating the algorithms' robustness against non-ideal update characteristics, it quantitatively shows how hardware-aware training can relax device specifications. It could pave the way for successful implementation of analog deep learning accelerators with actual *** (c) 2024 WILEY-VCH GmbH