An adaptive transform coding algorithm using a quadtree-based variable blocksize DCT (discrete cosine transform) is introduced to achieve a better tradeoff between bit rate and image quality. The choice of appropriate blocksize is determined by a mean-based decision rule that can discriminate various image contents for better visual quality. Some simulation results are given. It is found that the same or better image quality can be obtained with lower average bit rate.
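The abstract does not spell out the mean-based decision rule, so the following is only a minimal sketch of how a quadtree blocksize decision might look. The split criterion (any quadrant mean departing from the parent mean by more than `threshold`), the function names, and the parameters are all assumptions made for illustration, not the authors' method.

```python
def block_mean(img, r, c, size):
    """Mean of the size x size block whose top-left corner is (r, c)."""
    total = sum(img[i][j] for i in range(r, r + size) for j in range(c, c + size))
    return total / (size * size)


def quadtree_blocks(img, r, c, size, min_size, threshold):
    """Return the leaf blocks (row, col, size) of a quadtree partition.

    Hypothetical rule: a block is split when the mean of any quadrant
    differs from the mean of the whole block by more than `threshold`.
    """
    if size <= min_size:
        return [(r, c, size)]
    half = size // 2
    corners = [(r, c), (r, c + half), (r + half, c), (r + half, c + half)]
    parent = block_mean(img, r, c, size)
    if all(abs(block_mean(img, qr, qc, half) - parent) <= threshold
           for qr, qc in corners):
        return [(r, c, size)]  # smooth enough: keep one large block
    leaves = []
    for qr, qc in corners:
        leaves.extend(quadtree_blocks(img, qr, qc, half, min_size, threshold))
    return leaves


# 8x8 test image: bright top-left quadrant, dark elsewhere.
image = [[100 if (i < 4 and j < 4) else 0 for j in range(8)] for i in range(8)]
leaves = quadtree_blocks(image, 0, 0, 8, min_size=2, threshold=10)
```

Each leaf would then be coded with a DCT of the matching size, so smooth regions get large blocks and detailed regions get small ones.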
An open problem in source coding theory has been whether the Karhunen-Loeve transform (KLT) is optimal for a system that orthogonally transforms a vector source, scalar quantizes the components of the transformed vector using optimal bit allocation, and then inverse transforms the vector. Huang and Schultheiss (1963) proved that for a Gaussian source the KLT is mean-squared-error optimal in the limit of high quantizer resolution. It is often assumed and stated in the literature that the KLT is also optimal in general for non-Gaussian sources. We disprove such assertions by demonstrating that the KLT is not optimal for certain nearly bimodal Gaussian and uniform sources. In addition, we show the unusual result that for vector sources with independent identically distributed Laplacian components, the distortion resulting from scalar quantizing the components can be reduced by including an orthogonal transform that adds intercomponent dependency.
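For the Gaussian case the abstract refers to, the usual high-resolution argument is that, under optimal bit allocation, distortion scales with the geometric mean of the component variances. The sketch below illustrates that for an assumed two-dimensional source with unit variances and correlation `rho`; the covariance model and function names are illustrative choices, and the calculation says nothing about the non-Gaussian counterexamples, where the shape of each transformed component's density also matters.

```python
import math


def geometric_mean(variances):
    """Geometric mean of a list of positive variances."""
    return math.exp(sum(math.log(v) for v in variances) / len(variances))


def klt_variances_2d(rho):
    """Eigenvalues of the covariance matrix [[1, rho], [rho, 1]]."""
    return [1.0 + rho, 1.0 - rho]


def coding_gain(rho):
    """Ratio of geometric-mean variances: no transform versus the KLT.

    Valid as a distortion ratio only under the Gaussian, high-resolution
    assumptions stated above.
    """
    return geometric_mean([1.0, 1.0]) / geometric_mean(klt_variances_2d(rho))


gain = coding_gain(0.6)  # equals 1 / sqrt(1 - 0.6**2)
```

With `rho = 0` the components are already uncorrelated and the gain is 1, which matches the intuition that the KLT helps only by removing correlation in this setting.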
ISBN: 9798350389838 (digital)
ISBN: 9798350389845 (print)
The Enhanced Compression Model (ECM) serves as the software foundation for future video coding exploration, extending beyond the capabilities of the current Versatile Video Coding (VVC) standard. This paper conducts statistical analyses on ECM-encoded videos, focusing particularly on 1D and 2D transformation types, as well as intra and inter prediction modes across videos from different classes with distinct resolutions. These analyses are performed at the decoder level, where the coding decisions have already been made by the encoder. Results reveal that the selection of transformation type and size, as well as of prediction mode (intra or inter), depends on video characteristics such as motion and texture. This study represents a significant advancement in the development of intelligent algorithms based on video characteristics to expedite decision-making in the ECM encoding process.
Analysis-by-synthesis (AbS) has been used in a range of medium and low bitrate speech coders to find the parameters that would best reconstruct a given waveform. AbS has been used despite its high computational complexity because it gives accurate results. In this paper, a low-complexity analysis-by-synthesis scheme for a sinusoidal transform coder (STC) is proposed. Accuracy is not compromised, while the computational complexity is reduced from 2.63 MFLOPS (best case) and 22.13 MFLOPS (worst case) to a constant value of 0.46 MFLOPS.
ISBN: 9781509034963 (print)
The Moving Picture Experts Group-4 Part 10 Advanced Video Coding (H.264) standard uses rate-distortion optimization (RDO) to achieve a more satisfactory trade-off between bitrate and image quality. Adaptive transform coding (ATC) can further provide superior RDO results (RDO costs) to those of fixed transform block-size coding. This paper presents a predictive method that skips 4 × 4 block transform coding to reduce the computational cost of determining the minimal RDO cost of motion estimation and ATC. The experimental results show that the proposed method can reduce the computation by approximately 4.39% to 48.51%, with a display distortion of -0.003561 dB and a 1.08% bitrate increment.
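RDO is commonly formulated as minimizing a Lagrangian cost J = D + λR over the candidate coding choices. The sketch below shows that selection between transform sizes; the candidate labels, the distortion and rate figures, and the function names are made-up illustrations, and the paper's predictive skip rule itself is not given in the abstract, so it is not reproduced here.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def best_transform(candidates, lam):
    """Pick the candidate with the lowest RD cost.

    `candidates` maps a label such as "4x4" to (distortion, rate_bits).
    """
    return min(candidates, key=lambda name: rd_cost(*candidates[name], lam))


# Illustrative numbers only, not measurements from the paper.
candidates = {"4x4": (100.0, 50), "8x8": (120.0, 30)}
low_lambda_choice = best_transform(candidates, lam=0.5)
high_lambda_choice = best_transform(candidates, lam=2.0)
```

A skip method of the kind described would avoid evaluating the 4 × 4 entry at all whenever a predictor indicates it is unlikely to win, which is where the computational saving would come from.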
In this paper, we propose a new coding scheme for image compression using classified two-channel conjugate vector quantization (TC-CVQ) of the wavelet coefficients. This scheme exploits residual correlation among different layers of the discrete wavelet transform (DWT) domain and improves the encoding efficiency. In addition, two-channel conjugate VQ requires less computational complexity and less storage (memory). Simulation results show that the reconstructed images preserve fine and pleasant qualities based on both subjective and mean square error criteria at a bit rate of 0.3 bit/pel (bpp).
The paper presents a novel method for the low bit-rate compression of a feature vector stream with particular application to distributed speech recognition. The scheme operates by grouping feature vectors into non-overlapping blocks and applying a transformation to give a more compact matrix representation. Both Karhunen-Loeve and discrete cosine transforms are considered. Following transformation, higher-order columns of the matrix can be removed without loss in recognition performance. The number of bits allocated to the remaining elements in the matrix is determined automatically using a measure of their relative information content. Analysis of the amplitude distribution of the elements indicates that non-linear quantisation is more appropriate than linear quantisation. Comparative results, based on both spectral distortion and speech recognition accuracy, confirm this. Speech recognition tests using the ETSI Aurora database demonstrate that compression to bit rates of 2400 bps, 1200 bps and 800 bps has very little effect on recognition accuracy. For example, at a bit rate of 1200 bps, recognition accuracy is 98.0% compared to 98.6% with no compression.
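The block-transform-and-truncate step can be sketched as follows, using the DCT variant. This assumes each block is a list of frames, that the transform runs along time for each feature, and that "higher-order columns" means the higher temporal coefficients; that orientation, the function names, and the `keep` parameter are assumptions for illustration, and quantisation and bit allocation are left out.

```python
import math


def _scale(k, n):
    """Orthonormal DCT-II scaling factor for coefficient k of length n."""
    return math.sqrt((1.0 if k == 0 else 2.0) / n)


def dct(x):
    """Orthonormal DCT-II of a sequence."""
    n = len(x)
    return [_scale(k, n) * sum(x[t] * math.cos(math.pi * (2 * t + 1) * k / (2 * n))
                               for t in range(n))
            for k in range(n)]


def idct(coeffs, n):
    """Inverse of dct; coefficients missing beyond len(coeffs) count as zero."""
    return [sum(_scale(k, n) * c * math.cos(math.pi * (2 * t + 1) * k / (2 * n))
                for k, c in enumerate(coeffs))
            for t in range(n)]


def compress_block(block, keep):
    """Transform each feature trajectory and keep its first `keep` coefficients."""
    n_features = len(block[0])
    return [dct([frame[d] for frame in block])[:keep] for d in range(n_features)]


def reconstruct_block(coeffs, n_frames):
    """Rebuild an n_frames x n_features block from truncated coefficients."""
    tracks = [idct(c, n_frames) for c in coeffs]
    return [[track[t] for track in tracks] for t in range(n_frames)]


# Four frames of two slowly varying features (illustrative values).
block = [[1.0, 2.0], [1.1, 2.0], [1.2, 2.1], [1.3, 2.1]]
approx = reconstruct_block(compress_block(block, keep=2), n_frames=4)
```

Because speech features change slowly across neighbouring frames, most of the energy lands in the low-order coefficients, which is why dropping the rest can cost little.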
This paper proposes a novel extension of bus-invert coding to handle 4-level pulse amplitude modulated (PAM-4) signals. A generalized mathematical model for energy consumption and energy dissipation for PAM-4 signals is presented, and a family of coding schemes is developed that can reduce the average power consumption and dissipation by up to 54% compared to uncoded PAM-4 buses. This technique is attractive for high-speed data transmission systems that employ PAM-4 signals on general high-capacitance buses such as global wires, off-chip buses, I/O and backplanes.
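For orientation, the classic binary bus-invert scheme that the paper extends can be sketched as below. This is the two-level baseline only, not the PAM-4 extension or its energy model, neither of which is specified in the abstract; the function names and the all-zero initial bus state are assumptions.

```python
def bus_invert_encode(words, width):
    """Classic binary bus-invert coding.

    For each word, if sending it as-is would toggle more than half of the
    data lines relative to the previous bus state, send its complement and
    raise the invert line instead. Returns (sent_word, invert_bit) pairs.
    """
    mask = (1 << width) - 1
    prev = 0  # assumed initial bus state
    encoded = []
    for w in words:
        toggles = bin((w ^ prev) & mask).count("1")
        if toggles > width / 2:
            sent, inv = (~w) & mask, 1
        else:
            sent, inv = w & mask, 0
        encoded.append((sent, inv))
        prev = sent
    return encoded


def bus_invert_decode(encoded, width):
    """Undo bus-invert coding using the invert bit of each pair."""
    mask = (1 << width) - 1
    return [((~sent) & mask) if inv else sent for sent, inv in encoded]


words = [0b00000000, 0b11111111, 0b00001111]
encoded = bus_invert_encode(words, width=8)
decoded = bus_invert_decode(encoded, width=8)
```

In this example the jump from all zeros to all ones would toggle eight data lines uncoded, whereas the coded bus keeps the data lines unchanged and toggles only the invert line.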
Multi-stage tree-structured vector quantization (MSTVQ) is examined as an alternative to entropy constrained scalar quantization (ECSQ) in transform coding of high fidelity audio signals with a simultaneous masking model for distortion control. Discrete-cosine-transform coefficients are normalized by an interpolated spectral power envelope and groups of adjacent coefficients are vector coded with variable rate to achieve distortion-masking. With the current coder configuration, high fidelity quality for a sampling rate of 32 kHz is achievable with data rates below 64 kbps for some transform and masking model; preliminary results show that MSTVQ and ECSQ have a similar rate-distortion performance.
We present a low bit-rate video compression system that integrates region-based coding with a spatio-temporal wavelet transform. The proposed system is designed for monitoring and video-phone applications. It distinguishes between a moving foreground and a static background, but image segmentation might also be based on other sources. The regions are encoded in separate layers using a chroma-keying technique that allows a controlled lossy recovery of the boundaries. A 3D-wavelet transform is applied to a group of frames of the predictor residual signal. Statistical dependencies of the transform coefficients extracted from different image subbands are captured by conditional probability models. Without the layered coding, the system has a superior performance compared to the H.263 standard for very low bit-rate coding. The layered coding causes a small degradation in visual quality at the same bit-rate.