Details
ISBN: (Print) 9781538604465
Deep neural networks (DNNs) usually demand a large number of operations for real-time inference. In particular, fully-connected layers contain a large number of weights and therefore require many off-chip memory accesses during inference. We propose a weight compression method for deep neural networks that allows values of +1 or -1 only at predetermined positions of the weights, so that decoding can easily be performed with a table. For example, the structured sparse (8,2) coding allows at most two non-zero values among eight weights. This method not only enables multiplication-free DNN implementations but also compresses the weight storage by up to 32x compared with floating-point networks. Weight distribution normalization and gradual pruning techniques are applied to mitigate the performance degradation. The experiments are conducted with fully-connected deep neural networks and convolutional neural networks.
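As a rough illustration of the (8,2) constraint: within every group of eight weights, at most two entries may be non-zero, and each surviving entry is restricted to +1 or -1. That gives 1 + 8·2 + 28·4 = 129 codewords, so one eight-weight group fits in an 8-bit table index, i.e. about one bit per weight, which is where the up-to-32x saving over 32-bit floating point comes from, and decoding reduces to a small lookup table. The NumPy sketch below only shows a projection of a real-valued weight vector onto this code; the magnitude-based selection, the function name, and the example values are illustrative assumptions, not the paper's training procedure (which additionally uses weight distribution normalization and gradual pruning).

```python
import numpy as np

def project_to_structured_sparse(w, group_size=8, max_nonzero=2):
    """Project a 1-D weight vector onto the (group_size, max_nonzero) code:
    per group, keep at most `max_nonzero` entries and force them to +1/-1."""
    w = np.asarray(w, dtype=np.float64)
    pad = (-len(w)) % group_size                      # pad so the length divides evenly
    groups = np.pad(w, (0, pad)).reshape(-1, group_size)

    coded = np.zeros_like(groups)
    for i, g in enumerate(groups):
        keep = np.argsort(np.abs(g))[-max_nonzero:]   # largest-magnitude positions
        keep = keep[np.abs(g[keep]) > 0]              # an exactly zero weight stays zero
        coded[i, keep] = np.sign(g[keep])             # surviving entries become +1 or -1
    return coded.reshape(-1)[:len(w)]

# Example: eight weights collapse to at most two +/-1 entries.
print(project_to_structured_sparse([0.3, -0.1, 0.05, 0.7, -0.6, 0.0, 0.2, -0.02]))
# -> [ 0.  0.  0.  1. -1.  0.  0.  0.]
```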
Details
ISBN: (Print) 9781509041176
Fixed-point optimization of deep neural networks plays an important role in hardware-based design and low-power implementations. Many deep neural networks show fairly good performance even with 2- or 3-bit precision when the quantized weights are fine-tuned by retraining. We propose an improved fixed-point optimization algorithm that estimates the quantization step size dynamically during the retraining. In addition, a gradual quantization scheme is also tested, which sequentially applies fixed-point optimization from high to low precision. The experiments are conducted for feed-forward deep neural networks (FFDNNs), convolutional neural networks (CNNs), and recurrent neural networks (RNNs).
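For intuition, here is a minimal sketch, assuming a symmetric uniform quantizer, of the two ideas named above: a quantization step size re-estimated from the current weights during retraining, and a gradual schedule that applies the optimization from high to low precision. The brute-force L2-error search used to pick the step size, and all function and variable names, are illustrative assumptions rather than the paper's exact estimator.

```python
import numpy as np

def quantize(w, step, bits):
    """Symmetric uniform quantizer: round w/step, clip to +/-(2**(bits-1) - 1) levels."""
    max_level = 2 ** (bits - 1) - 1
    return np.clip(np.round(w / step), -max_level, max_level) * step

def estimate_step(w, bits, num_candidates=256):
    """Pick the step size that minimizes the L2 quantization error of the current w."""
    max_abs = float(np.max(np.abs(w)))
    candidates = np.linspace(max_abs / num_candidates, max_abs, num_candidates)
    errors = [np.sum((w - quantize(w, s, bits)) ** 2) for s in candidates]
    return candidates[int(np.argmin(errors))]

# Gradual quantization: move sequentially from high to low precision.
w_fp = np.random.randn(1024) * 0.05            # full-precision "shadow" weights
for bits in (8, 4, 3, 2):
    for _ in range(3):                          # stand-in for retraining iterations
        step = estimate_step(w_fp, bits)        # step size tracks the evolving weights
        w_q = quantize(w_fp, step, bits)        # quantized weights for the forward pass
        # gradients from a forward pass with w_q would update w_fp
        # (straight-through estimator); the actual training loop is omitted here
    print(f"{bits}-bit step = {step:.4f}")
```

Re-estimating the step at every retraining iteration lets the quantization grid track the weight distribution as it shifts during fine-tuning, whereas a static baseline would freeze the step computed once from the pretrained weights.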
Details
ISBN: (Print) 9798350388350; 9798350388343
In today's rapidly advancing technological landscape, the applications of deep learning permeate various facets of our lives. However, traditional implementations of convolutional neural networks (CNNs) on platforms such as CPUs and GPUs often require substantial network bandwidth and incur high power consumption. Deploying CNNs on Field-Programmable Gate Arrays (FPGAs) with efficient logic control from CPUs offers a promising solution for low-power and compact hardware designs. This paper proposes a novel approach to optimize YOLOv3-tiny on FPGA, aiming to reduce hardware resource consumption and power usage while enhancing the computational efficiency of the convolutional neural network. Through hardware optimization strategies, our solution demonstrates improved performance, making it well-suited for real-time deep learning inference tasks in resource-constrained environments.