As an emerging computing architecture, computing-in-memory (CIM) exhibits significant potential for energy efficiency and computing power in artificial intelligence applications. However, the intrinsic non-idealities of CIM devices, which manifest as random interference on the weights of a neural network, can significantly degrade inference accuracy. In this paper, we propose a novel training algorithm designed to mitigate the impact of weight noise. The algorithm minimizes the cross-entropy loss while concurrently refining the intermediate-layer feature representations to emulate those of an ideal, noise-free network. This dual-objective approach not only preserves the accuracy of the neural network but also enhances its robustness against noise-induced degradation. Empirical validation across several benchmark datasets confirms that our algorithm sets a new accuracy benchmark for CIM-enabled neural network applications. Compared with the most commonly used forward-noise training methods, our approach yields approximately a 2% accuracy boost on ResNet32 with the CIFAR-10 dataset at a weight noise scale of 0.2, and a minimum performance gain of 1% on ResNet18 with the ImageNet dataset under the same noise quantization conditions.
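The dual-objective loss described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the toy network sizes, the multiplicative-Gaussian weight-noise model, and the weighting factor `alpha` are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny 2-layer network; W1, W2 stand in for a CIM-mapped model.
W1 = rng.normal(0.0, 0.5, (8, 4))
W2 = rng.normal(0.0, 0.5, (3, 8))

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def forward(x, w1, w2):
    h = relu(w1 @ x)              # intermediate feature representation
    return h, softmax(w2 @ h)

def dual_objective_loss(x, label, noise_scale=0.2, alpha=0.5):
    # Clean (noise-free) pass: provides the target features to emulate.
    h_clean, _ = forward(x, W1, W2)
    # Noisy pass: multiplicative Gaussian noise models CIM weight disturbance.
    W1n = W1 * (1 + rng.normal(0.0, noise_scale, W1.shape))
    W2n = W2 * (1 + rng.normal(0.0, noise_scale, W2.shape))
    h_noisy, p = forward(x, W1n, W2n)
    ce = -np.log(p[label] + 1e-12)             # cross-entropy term
    feat = np.mean((h_noisy - h_clean) ** 2)   # feature-matching term
    return ce + alpha * feat

x = rng.normal(0.0, 1.0, 4)
loss = dual_objective_loss(x, label=1)
print(round(float(loss), 4))
```

In a real training loop, both terms would be backpropagated through the noisy pass so that the noisy network's features are pulled toward the clean ones.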
High-performance vertical-channel flash (HVF) memory cells were fabricated on the single-crystalline Si (c-Si) sidewalls of cylindrical deep wells in a c-Si substrate. To investigate the effects of the diameter of the cylindrical deep wells, namely the channel holes, on HVF cells, channel holes with diameters ranging from 65 nm to 260 nm were made. Memory gate stacks of SiO2/Al2O3/HfO2/Al2O3/TiN/W were formed by ozone oxidation followed by ALD, with deposition thicknesses of 1/5/7/8/2/150 nm, respectively. For devices with diameters equal to or greater than 150 nm, the electrical properties, such as Vt, SS, DIBL, and program/erase characteristics, are close. As expected, DIBL and SS improve as the diameter increases, owing to better gate control at larger diameters. However, large changes occurred for the devices with diameters of 90 nm and 65 nm. A simple cylinder-bulk model for vertical flash memory devices was presented to obtain an approximate analytical solution for the depletion width and to explain our experimental data. For the devices with a diameter of 150 nm, a high on/off current ratio of 10^7 and a relatively large memory window of 4.5 V were achieved. However, programming/erasing efficiency degraded as the hole diameter decreased.
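A cylinder-bulk depletion model of this kind can be illustrated numerically. The sketch below is an assumption-laden reconstruction, not the paper's model: it integrates the 1-D cylindrical Poisson equation for a uniformly doped p-type pillar of radius R depleted from the sidewall inward to radius a, then bisects for the depletion width; the doping level and surface potential are hypothetical.

```python
import numpy as np

Q = 1.602e-19                 # elementary charge (C)
EPS_SI = 11.7 * 8.854e-12     # permittivity of silicon (F/m)

def surface_potential(R, a, Na):
    # Band bending across the depleted shell a <= r <= R of a p-type pillar,
    # from integrating (1/r) d/dr(r dpsi/dr) = q*Na/eps with E(a) = 0:
    #   psi(R) - psi(a) = q*Na/(2*eps) * [(R^2 - a^2)/2 - a^2 * ln(R/a)]
    return Q * Na / (2 * EPS_SI) * ((R**2 - a**2) / 2 - a**2 * np.log(R / a))

def depletion_width(R, Na, psi_s):
    # If even full depletion (a -> 0) cannot support psi_s, the pillar is
    # fully depleted and the depletion width equals the radius.
    if surface_potential(R, 1e-12, Na) < psi_s:
        return R
    lo, hi = 1e-12, R         # potential decreases monotonically with a
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if surface_potential(R, mid, Na) > psi_s:
            lo = mid
        else:
            hi = mid
    return R - 0.5 * (lo + hi)

NA = 5e23    # acceptor doping, 5e17 cm^-3 (hypothetical)
PSI = 0.8    # sidewall band bending in volts (hypothetical)

w_small = depletion_width(32.5e-9, NA, PSI)   # 65 nm diameter channel hole
w_large = depletion_width(130e-9, NA, PSI)    # 260 nm diameter channel hole
print(w_small * 1e9, w_large * 1e9)           # depletion widths in nm
```

With these assumed numbers, the 65 nm hole comes out fully depleted while the 260 nm hole does not, which is consistent with the qualitative diameter dependence the abstract reports.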
Convolutional neural networks (CNNs) play a key role in deep learning applications. However, the high computational complexity and high energy consumption of CNNs hamper their deployment in hardware accelerators. Computing-in-memory (CIM) is the technique of running calculations entirely in memory (in our design, we use SRAM). The CIM architecture has demonstrated great potential for efficiently computing large-scale matrix-vector multiplications. A CIM-based architecture for event detection is designed to trigger the next stage of precision inference. To implement an SRAM-based CIM accelerator, a software-hardware co-design approach must consider the CIM macro's hardware limitations when mapping the weights onto AI edge devices. In this paper, we design a hierarchical AI architecture to optimize end-to-end system power in AIoT applications. In the experiments, the CIM-aware algorithm with 4-bit activations and 8-bit weights is evaluated on the hand-gesture and CIFAR-10 datasets, achieving 99.70% and 70.58% accuracy, respectively. A profiling tool is also developed to measure the efficiency of the proposed architecture. Running at an operating frequency of 100 MHz on the hand-gesture and CIFAR-10 datasets, with nine convolutional layers and one FC layer as its network, the proposed system achieves a frame rate of 662 FPS, 37.6% processing-unit utilization, and a power consumption of 0.853 mW.
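The 4-bit-activation/8-bit-weight scheme can be illustrated with a uniform-quantization sketch. This is an assumption, since the paper's actual quantizer is not specified here: activations become unsigned 4-bit codes, weights become signed 8-bit codes, the MAC runs in the integer domain as a CIM macro would, and the result is rescaled afterwards.

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.uniform(0.0, 1.0, 64)       # post-ReLU activations (float reference)
w = rng.normal(0.0, 0.1, 64)        # weights (float reference)

# 4-bit unsigned activation codes in [0, 15], with an assumed range of [0, 1]:
sa = 1.0 / 15
qa = np.clip(np.round(a / sa), 0, 15).astype(np.int32)

# 8-bit signed weight codes in [-127, 127], scaled to the max |w|:
sw = np.abs(w).max() / 127
qw = np.clip(np.round(w / sw), -127, 127).astype(np.int32)

acc = int(np.dot(qa, qw))           # integer MAC, as the CIM macro computes it
approx = acc * sa * sw              # dequantize with the product of the scales
exact = float(np.dot(a, w))
print(approx, exact)
```

The integer accumulator only needs to be rescaled once per output, which is what makes low-bit weight/activation pairs attractive for an SRAM CIM macro.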
ISBN (Print): 9781450366618
Convolutional neural networks (CNNs) are power-hungry and resource-consuming applications, which makes them hard to deploy on end devices. We propose a method to perform convolution operations in NOR flash memory. Experimental results show that our method achieves strong performance and high energy efficiency.
ISBN (Print): 9781450379441
Utilizing emerging nonvolatile memories to accelerate deep neural networks (DNNs) has been considered one of the promising approaches to solving the data-transfer bottleneck during multiplication and accumulation (MAC). Among these, spintronic memories show a tempting prospect due to their low access power, fast access speed, high density, and relatively mature process. As shown in Fig. 1, the approaches can be divided into three technical routes according to the principle used to achieve DNN computing. The first is an "analog" method [1, 2], shown in Fig. 1(a). By transforming the digital input signals into multi-level voltage signals and applying them to different columns of the memory array, the MAC results can be obtained from the columns using a current integrator and an analog-to-digital converter (ADC). In addition, the WL drivers can control the pulse width of different rows to achieve the effect of multi-bit weights. This method can theoretically achieve high energy efficiency and computing speed. However, the variation of the magnetic tunnel junction (MTJ) may affect computing accuracy, and the power consumption and area overhead of the ADC are also challenging. The other two methods are "digital": they realize MAC computing through row-by-row read/write operations. Fig. 1(b) shows the second, reading-based method [3]. The weights of the neural network are stored in the memory cells. By feeding the input signal to a modified sense amplifier (SA), the XOR function, which is the core of binary NNs, can be computed against the content stored in the memory cell. Nevertheless, the modification to the SA usually adds extra transistors to the read path, which increases the bit error rate. Fig. 1(c) shows the diagram of the last method, which is based on "stateful logic" [4]. The input data is sent to the modified write driver while the WL receives weight signals from the outside I/O. Based on a unique logic paradigm, it can real
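The "analog" route of Fig. 1(a) can be sketched as a small simulation. All device values here are hypothetical, as is the 5% Gaussian MTJ variation: digital inputs become multi-level voltages, per-column currents give the MAC result, and device variation perturbs it.

```python
import numpy as np

rng = np.random.default_rng(2)

# Binary weights stored as MTJ conductances (hypothetical values, siemens):
G_P, G_AP = 20e-6, 10e-6                     # parallel / antiparallel states
W = rng.integers(0, 2, (16, 4))              # 16 rows (inputs) x 4 columns
G = np.where(W == 1, G_P, G_AP)

# Device-to-device MTJ variation, assumed 5% Gaussian on the conductance:
G_var = G * (1 + rng.normal(0.0, 0.05, G.shape))

x = rng.integers(0, 4, 16)                   # 2-bit digital inputs
V = x * 0.1                                  # DAC: multi-level voltages (V)

I_ideal = V @ G                              # per-column MAC via Kirchhoff's law
I_real = V @ G_var                           # the same MAC under MTJ variation
err = np.abs(I_real - I_ideal) / (I_ideal + 1e-18)
print(I_real, err)
```

The relative column-current error stays small here because independent per-device errors partially cancel in the summation, which is why analog MAC tolerates moderate variation but still needs variation-aware design at scale.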
Computation-in-memory using memristive devices is a promising approach to overcoming the performance limitations of conventional computing architectures introduced by the von Neumann bottleneck, also known as the memory wall and the power wall. It has been shown that accelerators based on memristive devices can deliver higher energy efficiency and data throughput than conventional architectures. Among the vast multitude of memristive devices, bipolar resistive switches based on the valence change mechanism (VCM) are particularly interesting due to their low-power operation, non-volatility, high integration density, and CMOS compatibility. While a wide range of possible applications is considered, many of them, such as artificial neural networks, rely heavily on vector-matrix multiplication (VMM) as a mathematical operation. These VMMs are made up of large numbers of multiplication and accumulation (MAC) operations. The MAC operation can be realized with memristive devices in an analog fashion using Ohm's law and Kirchhoff's law. However, VCM devices exhibit a range of non-idealities that affect VMM performance, which in turn impacts the overall accuracy of the application. These non-idealities can be classified as time-independent (programming variability) and time-dependent (read disturb and read noise). Additionally, peripheral circuits such as analog-to-digital converters can introduce errors during digitization. In this work, we experimentally and theoretically investigate the impact of device- and circuit-level effects on the VMM in VCM crossbars. Our analysis shows that the variability of the low-resistive state plays a key role and that reading in the RESET direction should be favored over reading in the SET direction.
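The non-ideality classes listed above can be mocked up in a small crossbar simulation (illustrative only; all device values are assumed): programming variability is drawn once when the array is written, read noise is drawn fresh on every read, and an n-bit ADC quantizes the column currents.

```python
import numpy as np

rng = np.random.default_rng(3)

# Target HRS/LRS conductances for a 32x8 VCM crossbar (hypothetical, siemens):
g_hrs, g_lrs = 1e-6, 50e-6
W = rng.integers(0, 2, (32, 8))
g_target = np.where(W == 1, g_lrs, g_hrs)

# Time-independent non-ideality: programming variability, drawn once.
g_prog = g_target * (1 + rng.normal(0.0, 0.08, g_target.shape))

v = rng.uniform(0.0, 0.2, 32)     # read voltages applied to the rows

def read_columns(n_bits=6):
    # Time-dependent non-ideality: read noise, drawn fresh on every read.
    g_read = g_prog * (1 + rng.normal(0.0, 0.02, g_prog.shape))
    i = v @ g_read                # analog MAC: Ohm's law + Kirchhoff's law
    # Peripheral non-ideality: an n-bit ADC quantizes the column currents.
    lsb = i.max() / (2 ** n_bits - 1)
    return np.round(i / lsb) * lsb

i_ideal = v @ g_target
i_meas = read_columns()
rel_err = np.abs(i_meas - i_ideal) / i_ideal
print(rel_err)
```

Because the LRS conductance is ~50x the HRS conductance, the LRS variability (8% here) dominates the column-current error, consistent with the abstract's finding that low-resistive-state variability plays the key role.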