Smart devices for image and video sensing must operate within the constraints of limited bandwidth and low computing capability. In this context, block-based compressive sensing (BCS) has emerged as the most viable method for balancing image/video quality against transmission bandwidth and computing overheads. However, in comparison with conventional image and video acquisition systems, BCS does not by itself reduce the bitrate, owing to its straightforward formulation as a system of linear equations, and so it still incurs high transmission and storage overhead. To address this shortcoming, in this brief we propose a novel near-lossless predictive coding (NLPC) approach to compress BCS measurements. The NLPC method encodes the prediction error between the target and current measurements, resulting in a smaller data size. We designed and implemented a complete BCS system integrated with NLPC and scalar quantization (BCS-NLPC-SQ) and studied the image quality at different compression ratios with varying block sizes. The BCS-NLPC-SQ method improves PSNR by roughly +3.06 dB and SSIM by +0.11 on average with respect to existing works. The synthesis results show that BCS-NLPC-SQ requires 83.01%, 69.03%, 53.26%, and 14.45% less area, power, ADP, and PDP, respectively, than JPEG compression, and achieves an additional compression of up to 56.25% in the best case. The proposed BCS-NLPC-SQ method outperforms the existing methods in terms of PSNR, SSIM, and bpp.
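The idea of coding prediction errors of BCS measurements with a scalar quantizer can be illustrated with a minimal sketch. It is not the authors' NLPC design: the predictor (the previous block's reconstructed measurement vector), the Gaussian sensing matrix, the synthetic blocks, and all function names are assumptions made for illustration. The sketch only shows why a closed-loop predictive coder with a uniform quantizer keeps the reconstruction error of every measurement within half a quantization step.

```python
import random


def measure_block(block, phi):
    """Compressive measurement y = Phi x for one flattened image block."""
    return [sum(p * x for p, x in zip(row, block)) for row in phi]


def quantize(value, step):
    """Uniform scalar quantizer: index of the nearest reconstruction level."""
    return int(round(value / step))


def encode(measurements, step):
    """Code each measurement vector as quantized residuals against the
    previous *reconstructed* vector (assumed predictor, closed loop)."""
    prediction = [0.0] * len(measurements[0])
    indices = []
    for y in measurements:
        q = [quantize(yi - pi, step) for yi, pi in zip(y, prediction)]
        indices.append(q)
        # Track exactly what the decoder will reconstruct.
        prediction = [pi + qi * step for pi, qi in zip(prediction, q)]
    return indices


def decode(indices, step):
    """Rebuild the measurement vectors from the residual indices."""
    prediction = [0.0] * len(indices[0])
    reconstructed = []
    for q in indices:
        prediction = [pi + qi * step for pi, qi in zip(prediction, q)]
        reconstructed.append(prediction)
    return reconstructed


# Hypothetical setup: 16-pixel blocks, 4 measurements per block, step 2.0.
rng = random.Random(0)
n, m, step = 16, 4, 2.0
phi = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
blocks = [[100.0 + 2.0 * b + 0.5 * i for i in range(n)] for b in range(6)]

measurements = [measure_block(block, phi) for block in blocks]
indices = encode(measurements, step)
reconstructed = decode(indices, step)

max_error = max(
    abs(y - r)
    for ys, rs in zip(measurements, reconstructed)
    for y, r in zip(ys, rs)
)
print("largest measurement error:", max_error)
```

The residual indices would then go to an entropy coder; that stage, and the recovery of the image from the decoded measurements, are left out of the sketch.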
Several low-bitrate speech coders are currently being defined and standardized. These speech coders are all based on an analysis-by-synthesis procedure. Different methods have been proposed in the recent literature to construct the excitation signal. The goal of this article is to present these algorithms in a unified formalism and to estimate their cost in terms of complexity. First, the basic principle of these speech coders is described. It is shown that a generalized description of the excitation covers several classical coding techniques and leads to a general iterative standard algorithm. The complexity of this algorithm is evaluated in terms of the number of multiply-accumulate operations per second, and several possible simplifications are analyzed. The article concludes with a presentation of the coders that are currently submitted for standardization.
It was recently shown that the symmetric multiple-description (MD) quadratic rate-distortion function for memoryless Gaussian sources and two descriptions can be achieved by dithered delta-sigma quantization combined with memoryless entropy coding. In this paper, we generalize this result to stationary (colored) Gaussian sources by combining noise shaping and source prediction. We first propose a new representation for the test channel that realizes the MD rate-distortion function of a Gaussian source, both in the white and in the colored source case. We then show that this test channel can be materialized by embedding two source prediction loops, one for each description, within a common noise shaping loop. While the noise shaping loop controls the tradeoff between the side and the central distortions, the role of prediction (like in differential pulse code modulation) is to extract the source innovations from the reconstruction at each of the side decoders, and thus reduce the coding rate. Finally, we show that this scheme achieves the MD rate-distortion function at all resolutions and all side-to-central distortion ratios, in the limit of high dimensional quantization.
ISBN: 9798350391145 (digital)
ISBN: 9798350391152 (print)
This paper addresses the limitations of generative face video compression (GFVC) under conditions of substantial head movement and complex facial deformations. Previous GFVC frameworks focused on perceptual compression and reconstructed videos with perceptual quality as the only goal. As a result, they often show a large disparity relative to conventional codecs when evaluated for pixel fidelity. We propose a robust framework for learned predictive coding that aims for both perceptual quality and improved pixel fidelity under low-bitrate conditions. Our method uses a dual residual learning strategy. Specifically, it learns the frame residual between the animated frame and the ground truth (spatial residual coding) and further exploits redundancies between neighboring frame residuals (temporal residual coding). We formulate a low-bitrate conditional residual coding mechanism for both spatial and temporal residual coding. In addition, we propose a zero-cost residual alignment mechanism to refine the prediction accuracy of frame residuals. Through end-to-end optimization, the proposed framework achieves a balance between perceptual quality, pixel fidelity, and compression efficiency. We conduct experimental evaluations on the test sequences and conditions proposed under the JVET-AH0114 standard and show significant performance gains relative to the HEVC and VVC standards in terms of perceptual metrics. Compared to other GFVC frameworks, the proposed framework achieves state-of-the-art performance on both perceptual and pixel fidelity metrics. It is also competitive with HDAC, HEVC, and VVC in terms of pixel fidelity at low bitrates. Source code is publicly available at https://***/Goluck-Konuko/animation-based-codecs
This paper proposes a novel coding approach that applies open-loop coding principles in predictive coding systems. The proposed approach is instantiated with an intra-frame video codec employing the transform and spatial prediction modes from H.264. Additionally, a novel rate-distortion model for open-loop predictive coding is proposed and experimentally validated. Optimally allocating rate based on the proposed model provides significant gains in comparison to a straightforward rate allocation that does not account for drift. Furthermore, the proposed open-loop predictive codec provides gains of up to 2.3 dB in comparison to an equivalent closed-loop intra-frame video codec employing the transform, prediction modes, and rate allocation from H.264. This indicates that, with appropriate drift compensation, open-loop predictive coding offers the possibility of further improving the compression performance of predictive coding systems.
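The drift mentioned above can be made concrete with a minimal sketch on a one-dimensional signal. It does not reproduce the paper's codec, rate-distortion model, or drift compensation; the previous-sample predictor, the uniform quantizer, and the ramp test signal are assumptions chosen only to show how open-loop prediction (from original samples) lets quantization errors accumulate at the decoder, while closed-loop prediction (from reconstructed samples) keeps the error bounded.

```python
def quantize(value, step):
    """Uniform scalar quantizer returning the reconstructed value."""
    return step * round(value / step)


def closed_loop(signal, step):
    """Encoder predicts from the previous *reconstructed* sample,
    so encoder and decoder states stay identical."""
    reconstructed, previous = [], 0.0
    for sample in signal:
        previous = previous + quantize(sample - previous, step)
        reconstructed.append(previous)
    return reconstructed


def open_loop(signal, step):
    """Encoder predicts from the previous *original* sample; the decoder
    only has reconstructions, so quantization errors add up (drift)."""
    reconstructed = []
    previous_original, previous_decoded = 0.0, 0.0
    for sample in signal:
        residual = quantize(sample - previous_original, step)
        previous_decoded = previous_decoded + residual
        reconstructed.append(previous_decoded)
        previous_original = sample
    return reconstructed


# Hypothetical test signal: a slow ramp whose increments (0.4) are smaller
# than half the quantization step (1.0), a worst case for open-loop drift.
step = 1.0
signal = [0.4 * (i + 1) for i in range(50)]

closed_error = max(abs(s - r) for s, r in zip(signal, closed_loop(signal, step)))
open_error = max(abs(s - r) for s, r in zip(signal, open_loop(signal, step)))
print("closed-loop max error:", closed_error)
print("open-loop max error:", open_error)
```

In this toy setting the open-loop coder performs far worse, which is consistent with the abstract's point that the reported gains depend on a rate allocation that accounts for drift.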
A new class of predictive coding algorithms, based on the quad-tree image representation, is described. Data compression schemes based on these algorithms have been found to produce acceptable images (peak-rms SNR > 30dB) at rates as low as 0.25 bit/pixel.
ISBN: 9781509003648 (print)
In this paper we present a method for multi-lead ECG signal compression using predictive coding combined with Set Partitioning In Hierarchical Trees (SPIHT). We use linear prediction between beats to exploit the high correlation among them. The method can thus reduce the redundancy between adjacent samples and between adjacent beats. Predictive coding is applied after the beat reordering step. Its purpose is to minimize the amplitude variance of the 2D ECG array so that the compression error can be minimized. Experiments on selected records from the MIT-BIH arrhythmia database show that the proposed method is more efficient for ECG signal compression than the original SPIHT and has lower distortion at the same compression ratios than other wavelet transform techniques.
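A minimal sketch of why inter-beat prediction lowers the amplitude variance of a 2D ECG array follows. It is not the paper's predictor: the abstract does not give the prediction order or coefficients, so the sketch assumes the simplest case (each beat is predicted by the previous beat), and the Gaussian-shaped synthetic beats are hypothetical stand-ins for aligned, length-normalized beats.

```python
import math


def variance(values):
    """Population variance of a flat list of samples."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def beat_residuals(beats):
    """Keep the first beat as is; replace every later beat by its
    difference from the previous beat (assumed first-order predictor)."""
    residuals = [list(beats[0])]
    for previous, current in zip(beats, beats[1:]):
        residuals.append([c - p for c, p in zip(current, previous)])
    return residuals


def restore_beats(residuals):
    """Invert beat_residuals by accumulating the differences."""
    beats = [list(residuals[0])]
    for row in residuals[1:]:
        beats.append([p + r for p, r in zip(beats[-1], row)])
    return beats


# Hypothetical 2D ECG array: one row per beat, each a scaled copy of a
# Gaussian-shaped pulse so that consecutive beats are highly correlated.
template = [math.exp(-((n - 32) / 6.0) ** 2) for n in range(64)]
beats = [[(1.0 + 0.02 * i) * t for t in template] for i in range(5)]

residuals = beat_residuals(beats)
original_var = variance([v for row in beats[1:] for v in row])
residual_var = variance([v for row in residuals[1:] for v in row])
print("variance before prediction:", original_var)
print("variance after prediction:", residual_var)
```

The residual array, rather than the raw beats, would then be passed to the wavelet/SPIHT stage.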
In this letter, we provide a theoretical analysis of optimal predictive transform coding based on the Gaussian Markov random field (GMRF) model. It is shown that the eigen-analysis of the precision matrix of the GMRF model is optimal in decorrelating the signal. The resulting graph transform degenerates to the well-known 2-D discrete cosine transform (DCT) for a particular 2-D first order GMRF, although it is not a unique optimal solution. Furthermore, we present an optimal scheme to perform predictive transform coding based on conditional probabilities of a GMRF model. Such an analysis can be applied to both motion prediction and intra-frame predictive coding, and may lead to improvements in coding efficiency in the future.
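The link between the precision matrix and the DCT can be checked numerically in a one-dimensional analogue. The sketch below is an assumption-laden illustration, not the letter's derivation: it uses a 1-D first-order model whose precision matrix is the path-graph Laplacian plus an optional diagonal term `delta` (a hypothetical parameter that makes the matrix invertible without changing its eigenvectors), and verifies that the DCT-II basis vectors are its eigenvectors with eigenvalues 2 - 2cos(pi k / N) + delta.

```python
import math


def path_precision(n, delta=0.0):
    """Precision matrix of an assumed 1-D first-order GMRF: each sample is
    coupled to its immediate neighbours with weight -1."""
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        neighbours = (1 if i > 0 else 0) + (1 if i < n - 1 else 0)
        q[i][i] = neighbours + delta
        if i > 0:
            q[i][i - 1] = -1.0
        if i < n - 1:
            q[i][i + 1] = -1.0
    return q


def dct2_vector(k, n):
    """Unnormalized k-th DCT-II basis vector of length n."""
    return [math.cos(math.pi * k * (2 * i + 1) / (2 * n)) for i in range(n)]


def eigen_residual(n, k, delta=0.0):
    """Largest entry of |Q v - lambda v| for the k-th DCT-II vector."""
    q = path_precision(n, delta)
    v = dct2_vector(k, n)
    lam = 2.0 - 2.0 * math.cos(math.pi * k / n) + delta
    qv = [sum(q[i][j] * v[j] for j in range(n)) for i in range(n)]
    return max(abs(a - lam * b) for a, b in zip(qv, v))


worst = max(eigen_residual(8, k, delta=0.1) for k in range(8))
print("largest eigen-equation residual:", worst)
```

The 2-D case stated in the abstract follows the same pattern on a grid, where the separable 2-D DCT plays the role of the eigenbasis; the sketch does not cover the non-uniqueness of the optimal solution.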
ISBN: 9781665401166 (print)
Tasks such as fault diagnosis in the process industry face challenging problems, because the massive data of the process industry are not only nonlinear and strongly correlated but also unlabeled. Multivariate statistical analysis and machine learning methods are usually used to solve these problems. In view of these characteristics of industrial data, we propose unsupervised representation learning on massive data through contrastive predictive coding (CPC), and then apply the learned representation to the fault diagnosis task. Compared with other representation learning methods, CPC makes the representation more specific to the data by maximizing the mutual information between the representation and the original data. It can therefore better extract important information from data without labels. Experiments on the Tennessee Eastman process show that this strategy can increase the separability of the data and improve the performance of the fault diagnosis task.
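CPC is usually trained with the InfoNCE contrastive loss, whose minimization maximizes a lower bound on the mutual information between a context representation and the future (positive) sample. The following is a minimal sketch of that loss for a single context vector. The plain dot-product score is a simplifying assumption (CPC as commonly described scores pairs with a learned bilinear form), and the vectors are hypothetical; nothing here reflects the encoder or training setup used in the paper.

```python
import math


def info_nce(context, candidates, positive_index):
    """InfoNCE loss for one context vector: negative log-softmax of the
    positive candidate's score among all candidates."""
    scores = [sum(c * z for c, z in zip(context, cand)) for cand in candidates]
    top = max(scores)  # subtract the maximum for numerical stability
    log_denominator = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_denominator - scores[positive_index]


# Hypothetical example: one positive aligned with the context, two negatives.
context = [1.0, 0.0]
candidates = [[5.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
loss = info_nce(context, candidates, positive_index=0)

# With uninformative (identical) candidates the loss equals log(N).
chance_loss = info_nce(context, [[1.0, 1.0]] * 3, positive_index=0)
print("loss with a well-separated positive:", loss)
print("loss at chance level:", chance_loss)
```

A loss well below log(N), where N is the number of candidates, indicates that the representation distinguishes the true sample from the negatives.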