ISBN (print): 9781728123509
Deep neural networks (DNNs) are widely used in real-world applications. However, the large amounts of kernel and intermediate data incur a memory wall problem in resource-limited edge devices. Recent advances in binary deep neural networks (BNNs) and computing-in-memory (CIM) have effectively alleviated this bottleneck, especially when the two are combined. However, previous CIM-based BNN accelerators are highly vulnerable to process/supply voltage/temperature (PVT) variation, resulting in severe accuracy degradation that makes them impractical for real-world edge devices. To address this vulnerability, we propose a PVT-robust accelerator architecture for BNNs with a computable 4T embedded DRAM (eDRAM) cell array. First, we implement the XNOR operation of the BNN in a time-multiplexed manner by utilizing the fundamental read operation of the conventional eDRAM cell. Next, a PVT-robust bit-count based on charge sharing is proposed with a computable 4T eDRAM cell array. As a result, the proposed architecture achieves 6.9x less variation in PVT-variant environments, which guarantees stable accuracy, and a 2.03-49.4x improvement in energy efficiency over previous CIM-based accelerators.
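The XNOR and bit-count (popcount) operations that the abstract maps onto the eDRAM array realize the standard binary dot product of BNNs. The following minimal sketch shows that arithmetic in software; the function name and encoding (bit 1 for +1, bit 0 for -1) are illustrative assumptions, not details from the paper.

```python
def xnor_popcount_dot(a_bits, w_bits):
    """Binary dot product over +1/-1 values encoded as bits 1/0.

    XNOR(a, w) is 1 exactly when the +1/-1 product is +1, so the
    signed dot product equals 2 * popcount(XNOR) - n.
    """
    assert len(a_bits) == len(w_bits)
    n = len(a_bits)
    # popcount of the XNOR result: count positions where the bits agree
    popcount = sum(1 for a, w in zip(a_bits, w_bits) if a == w)
    return 2 * popcount - n

# Example: activations [+1, -1, +1] and weights [+1, +1, +1]
# encoded as bits; the signed dot product is (+1) + (-1) + (+1) = 1.
result = xnor_popcount_dot([1, 0, 1], [1, 1, 1])
```

In a CIM design such as the one described, the XNOR is produced by the cell read operation and the popcount by analog charge sharing; the software model above only fixes the arithmetic contract those circuits must satisfy.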
ISBN (print): 9781538666487
The performance gap between processors and main memory is continuously widening, a trend known as the memory wall bottleneck. Emerging nonvolatile devices are capable of in-memory processing and thus have the potential to partially alleviate this bottleneck. Researchers have adopted nonvolatile devices to build various accelerators targeted at different problems and applications. In this work, we adopt one of these emerging nonvolatile devices, the ferroelectric field-effect transistor (FeFET), to build a multi-functional in-memory processing unit named FeMAT. From a structural point of view, FeMAT is an FeFET-based memory array composed of 3T-based cells. From a functional point of view, FeMAT is not only a nonvolatile memory but can also perform logic operations in memory (the processing-in-memory (PIM) mode), binary convolutions (the binary convolutional neural network (BCNN) acceleration mode), and content searching (the ternary content-addressable memory (TCAM) mode). These functions are seamlessly fused into the FeFET-based memory array and can be configured online without changing the circuit structure. Superior energy efficiency is demonstrated by our experiments and by comparisons with a resistive random-access memory (ReRAM) based equivalent, as well as with a TCAM and a BCNN accelerator based on complementary metal-oxide-semiconductor (CMOS) devices.
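Of the three modes the abstract lists, the TCAM mode has the most distinctive semantics: each stored entry is a pattern of 0, 1, or don't-care bits, and a search returns every address whose pattern matches the key. The functional model below is a hedged sketch of that behavior; the function name, don't-care encoding (`None`), and example entries are illustrative assumptions, not circuit details from the paper.

```python
def tcam_search(entries, key):
    """Return the addresses of all stored patterns matching `key`,
    where a stored None bit matches either key bit (don't care)."""
    matches = []
    for addr, pattern in enumerate(entries):
        if all(p is None or p == k for p, k in zip(pattern, key)):
            matches.append(addr)
    return matches

# Three 4-bit entries; entry 0 has a don't-care in its third position.
entries = [
    [1, 0, None, 1],   # matches both 1001 and 1011
    [0, 0, 1, 1],
    [1, 0, 1, 1],
]
hits = tcam_search(entries, [1, 0, 1, 1])
```

In hardware, all entries are compared against the key in parallel inside the array; the sequential loop here only models the match condition, not the search latency that makes an in-memory TCAM attractive.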