ISBN: 9781665453363 (digital); 9781665453363 (print)
In recent years, the demand for data storage space has increased dramatically due to the exponential growth of data volume. Data compression is of great significance since it saves storage space and reduces data transfer demand. Compression algorithms based on statistical models achieve a much higher compression ratio than dictionary-based methods, but the high computational cost of statistical modeling limits their wider application. In this paper, we introduce pLPAQ, an FPGA-based design of LPAQ, a powerful compression algorithm based on statistical models. A novel hardware accelerator is proposed to speed up LPAQ by fully exploiting the parallelism of the FPGA. Experimental results show that the proposed design achieves a throughput of 12 MB/s on a Xilinx Virtex UltraScale+ XCVU9P card, 25x faster than execution on an AMD Ryzen R7 4800U at 2.8 GHz and 80x faster than a naive FPGA implementation on average.
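For context, LPAQ belongs to the context-mixing family: several statistical models each output a probability for the next bit, and an adaptive logistic mixer combines them before arithmetic coding. The Python sketch below illustrates only that mixing step; it is an assumption-based illustration of the general technique, not the pLPAQ hardware design, and the model probabilities fed to it are stand-ins.

```python
# Minimal sketch of the logistic-mixing core used by LPAQ-style
# context-mixing compressors (illustrative only, not pLPAQ itself).
import math

def stretch(p):
    # logit: map a probability to the logistic domain
    return math.log(p / (1.0 - p))

def squash(x):
    # inverse logit: map a mixed score back to a probability
    return 1.0 / (1.0 + math.exp(-x))

class LogisticMixer:
    """Mixes per-model bit probabilities in the logistic domain and
    adapts the mixing weights by online gradient descent."""
    def __init__(self, n_models, lr=0.02):
        self.w = [0.0] * n_models
        self.lr = lr
        self.inputs = [0.0] * n_models

    def predict(self, probs):
        self.inputs = [stretch(p) for p in probs]
        return squash(sum(w * x for w, x in zip(self.w, self.inputs)))

    def update(self, p_mixed, bit):
        err = bit - p_mixed            # prediction error in (-1, 1)
        for i, x in enumerate(self.inputs):
            self.w[i] += self.lr * err * x

# toy usage: mix two fixed stand-in models on a biased bit stream
mixer = LogisticMixer(2)
for b in [1, 1, 0, 1, 1, 1, 0, 1]:
    p = mixer.predict([0.6, 0.7])
    mixer.update(p, b)
```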
To meet the real-time processing demands of large-scale e-commerce transaction data, a distributed data processing system was designed and implemented. The system utilizes an improved LZ4 compression algorithm and micr...
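The abstract is truncated, but for readers unfamiliar with LZ4, the snippet below shows baseline LZ4 frame compression via the third-party Python `lz4` package; the paper's improved variant and its distributed deployment are not reproduced here.

```python
# Baseline LZ4 frame compression (generic usage, not the paper's variant).
import lz4.frame

record = b'{"order_id": 12345, "items": ["a", "b"]}' * 100  # stand-in payload
packed = lz4.frame.compress(record)
assert lz4.frame.decompress(packed) == record
print(f"raw={len(record)} B, compressed={len(packed)} B")
```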
Coherent accumulation is an important technique for improving the target detection capability of pulsed radar altimeters when range walk and velocity ambiguity do not occur. This paper focuses on the pulse compression te...
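For context, pulse compression correlates the received signal with a replica of the transmitted pulse, trading a long coded pulse for a narrow, high-SNR peak. The NumPy sketch below shows the generic matched-filter step for a linear FM chirp; all parameters are illustrative and not taken from the paper's altimeter design.

```python
# Generic pulse compression of a linear FM chirp by matched filtering.
import numpy as np

fs = 10e6                 # sample rate, 10 MHz
T = 20e-6                 # pulse width, 20 us
B = 2e6                   # chirp bandwidth, 2 MHz
t = np.arange(0, T, 1 / fs)
chirp = np.exp(1j * np.pi * (B / T) * t**2)   # LFM pulse

# simulate a weak delayed echo buried in noise
rx = np.zeros(4096, dtype=complex)
delay = 1000
rx[delay:delay + len(chirp)] = 0.1 * chirp
rx += 0.05 * (np.random.randn(len(rx)) + 1j * np.random.randn(len(rx)))

# matched filter: convolve with the conjugated, time-reversed pulse
mf = np.conj(chirp[::-1])
compressed = np.convolve(rx, mf, mode="same")
print("peak at sample", int(np.argmax(np.abs(compressed))))
# coherent accumulation would sum such compressed returns (as complex
# values) across pulses before detection.
```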
ISBN (print): 9798350309492; 9798350309485
Pruning deep neural networks (DNN) is a well-known technique that allows for a substantial reduction in inference cost. However, it may severely degrade the accuracy achieved by the model unless the latter is properly fine-tuned, which may, in turn, result in increased computational cost and latency. Thus, when deploying a DNN in resource-constrained edge environments, it is critical to find the best trade-off between accuracy (hence, model complexity), latency, and energy consumption. In this work, we explore the different options for deploying a machine learning pipeline, encompassing pruning, fine-tuning, and inference, across a mobile device requesting inference tasks and an edge server, while considering privacy constraints on the data used for fine-tuning. Our experimental analysis provides insights into an efficient allocation of the pipeline tasks across the network edge and the mobile device in terms of energy and network costs, as the target inference latency and accuracy vary. In particular, our results highlight that the higher the edge server load and the number of inference requests, the more convenient it becomes to deploy the entire pipeline on the mobile device using a pruned model, with a cost reduction of up to a factor of two compared to deploying the whole pipeline at the edge.
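As a concrete reference point for the prune-then-fine-tune step discussed above, the sketch below uses PyTorch's built-in magnitude pruning followed by brief fine-tuning on stand-in data; it is a generic recipe, and the device/edge placement question studied in the paper is orthogonal to this code.

```python
# Generic magnitude pruning + fine-tuning with PyTorch utilities.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))

# prune 50% of each Linear layer's weights by L1 magnitude
for m in model.modules():
    if isinstance(m, nn.Linear):
        prune.l1_unstructured(m, name="weight", amount=0.5)

# brief fine-tuning on (hypothetical) local data to recover accuracy
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
x, y = torch.randn(32, 64), torch.randint(0, 10, (32,))
for _ in range(10):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    opt.step()

# make the pruning masks permanent before deployment
for m in model.modules():
    if isinstance(m, nn.Linear):
        prune.remove(m, "weight")
```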
ISBN (print): 9798350324471
Neural recordings frequently get contaminated by ECG or pulsation artifacts. These large-amplitude components can mask the neural patterns of interest and make visual inspection difficult. The current study describes a sparse signal representation strategy that aims to denoise pulsation artifacts in local field potentials (LFPs) recorded intraoperatively. To estimate the morphology of the artifact, we first detect the QRS peaks from the simultaneously recorded ECG trace as anchor points. After the LFP data has been epoched with respect to each beat, a pool of raw data segments of a specific length is generated. Using the K-singular value decomposition (K-SVD) algorithm, we construct a data-specific dictionary to represent each contaminated LFP epoch in a sparse fashion. Since the LFP is aligned to each QRS complex and the background neural activity is uncorrelated with the anchor points, we assume that the constructed dictionary will mainly represent the pulsation artifact. In this scheme, we perform orthogonal matching pursuit to represent each LFP epoch as a linear combination of the dictionary atoms. The denoised LFP data is then obtained by calculating the residual between the raw LFP and its approximation. We discuss and demonstrate the improvements in the denoised data and compare the results with principal component analysis (PCA). We note a considerable change in the signal that aids visual inspection of various oscillating patterns in the alpha and beta bands. We also observe a noticeable reduction of signal strength in the lower frequency band (<13 Hz), which was masked by the pulsation artifact, and a strong increase in the signal-to-noise ratio (SNR) of the denoised data.
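A simplified sketch of this scheme follows: learn a dictionary on QRS-aligned LFP epochs, approximate each epoch by orthogonal matching pursuit, and keep the residual as the cleaned signal. scikit-learn's DictionaryLearning is used here as a stand-in for K-SVD, and the synthetic epochs (artifact plus noise) and QRS alignment are assumptions for illustration.

```python
# Dictionary-based artifact removal on synthetic QRS-aligned epochs.
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)
n_epochs, epoch_len = 200, 128
artifact = np.sin(np.linspace(0, 2 * np.pi, epoch_len))   # stereotyped pulsation shape
epochs = (1.5 * artifact[None, :] * rng.uniform(0.8, 1.2, (n_epochs, 1))
          + 0.3 * rng.standard_normal((n_epochs, epoch_len)))  # artifact + background

dico = DictionaryLearning(n_components=8, transform_algorithm="omp",
                          transform_n_nonzero_coefs=3, max_iter=20,
                          random_state=0)
codes = dico.fit_transform(epochs)        # sparse code per epoch (OMP)
approx = codes @ dico.components_         # artifact approximation
denoised = epochs - approx                # residual = cleaned LFP epochs
print("residual RMS:", np.sqrt((denoised**2).mean()))
```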
Image compression is an important issue in the image information processing domain. The size of the image data is a crucial factor in image processing: it can affect the architecture of the design or the memory size requir...
ISBN: 9781728186719 (digital); 9781728186719 (print)
Convolutional Neural Networks (CNN) are popular models widely used in image classification, target recognition, and other fields. FPGA-based accelerators for CNNs have become a standard approach in recent years to reducing CNN inference time and energy consumption. However, limited on-chip storage and computing resources make deep model compression necessary. Unlike most compression algorithms, which pay no attention to the underlying hardware acceleration strategy, and unlike hardware-only accelerators, this paper introduces a novel model compression scheme with software and hardware collaboration to accelerate inference. First, we propose a clustering-based pre-processing algorithm named Simon k-means to quantize trained weights and speed up inference. Next, we propose a new encoding method for the quantized weights, significantly reducing the model's storage size. Finally, we present the architecture design of an accelerator that uses the quantized weights to accelerate convolution. We have evaluated many popular CNNs for image classification tasks on various data sets. Experiments show that the number of multiply-accumulate operations in the convolutional layers can be reduced by 66.6% with a slight loss of accuracy.
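To make clustering-based quantization concrete, the sketch below applies plain k-means weight sharing, replacing each weight by its nearest centroid so that only a small codebook plus low-bit indices need to be stored; the paper's Simon k-means pre-processing and its custom encoding are not reproduced here.

```python
# Plain k-means weight sharing (generic stand-in for clustering-based
# quantization; not the paper's Simon k-means).
import numpy as np
from sklearn.cluster import KMeans

weights = np.random.randn(1024).astype(np.float32)   # stand-in layer weights
k = 16                                               # 16 centroids -> 4-bit indices
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(weights.reshape(-1, 1))

codebook = km.cluster_centers_.ravel()               # k shared weight values
indices = km.labels_.astype(np.uint8)                # one small index per weight
quantized = codebook[indices]                        # reconstructed weights
print("max abs error:", np.abs(weights - quantized).max())
```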
The latest multi-view video coding standard, MV-HEVC, efficiently represents multi-view video sequences, reducing the required network bitrate, and incorporates error resilience tools to maintain video quality even un...
To remove the large amount of redundant sensor time-series data in industrial systems and to improve the storage efficiency of the storage media and the query efficiency of the data, this paper proposes an AQ-SDT algor...
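Assuming SDT here refers to swinging door trending, as is common for sensor time-series compression, the sketch below implements the classic SDT core: a sample is kept only when linear interpolation of the kept samples would leave a tolerance corridor. The adaptive aspects suggested by the AQ-SDT name are not modeled.

```python
# Classic swinging-door-trending (SDT) core, illustrative only.
import numpy as np

def sdt_compress(ts, vs, tol):
    """Return indices of samples to keep so that linear interpolation of
    the kept samples stays within +/- tol of the original series."""
    kept = [0]
    i0 = 0                                   # pivot: last kept sample
    s_max, s_min = -float("inf"), float("inf")
    for i in range(1, len(ts)):
        dt = ts[i] - ts[i0]
        s_max = max(s_max, (vs[i] - vs[i0] - tol) / dt)
        s_min = min(s_min, (vs[i] - vs[i0] + tol) / dt)
        if s_max > s_min:                    # doors crossed: keep previous point
            kept.append(i - 1)
            i0 = i - 1
            dt = ts[i] - ts[i0]
            s_max = (vs[i] - vs[i0] - tol) / dt
            s_min = (vs[i] - vs[i0] + tol) / dt
    if kept[-1] != len(ts) - 1:
        kept.append(len(ts) - 1)
    return kept

t = np.arange(500.0)
v = np.sin(t / 30) + 0.01 * np.random.randn(500)
idx = sdt_compress(t, v, tol=0.05)
print(f"kept {len(idx)} of {len(v)} samples")
```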
This paper provides different perspectives on the performance of the Huffman and Lempel-Ziv algorithms in terms of lossless compression. This performance is tested using 16-bit audio data, which in related research is s...
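For concreteness, a minimal Huffman coder over 16-bit sample values is sketched below (a generic textbook implementation; the paper's audio corpus and measurement methodology are not reproduced).

```python
# Minimal Huffman coding of 16-bit sample values (textbook version).
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code (symbol -> bitstring) from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:                        # degenerate single-symbol input
        return {next(iter(freq)): "0"}
    heap = [[f, i, {s: ""}] for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)                           # tiebreaker so dicts never compare
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, [f1 + f2, tie, merged])
        tie += 1
    return heap[0][2]

samples = [0, 0, 0, 1, 1, -32768, 32767, 0]   # stand-in 16-bit audio samples
code = huffman_code(samples)
bits = sum(len(code[s]) for s in samples)
print(f"{bits} bits vs {16 * len(samples)} bits uncompressed")
```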