A tree code, asymptotically optimal for stationary Gaussian sources and squared error distortion [1], is applied suboptimally to encode the discrete cosine transform (DCT) of image subblocks. The variance spectrum of each block DCT is estimated and specified uniquely by a set of one-dimensional autoregressive parameters. The average pel rate for each block is allowed to vary so that every block meets the same specified average distortion. Since the variance spectrum and rate are different for every block, so is the code tree. First, simulation is used to determine a level of search intensity at which tree coding outperforms, in mean squared error, a comparable coding scheme in which quantization replaces tree coding. Then, a secondary coding step (postcoding) is introduced to encode, with insignificant additional rate, the large isolated errors which may result from the tree search. Comparative coding simulations with a 256 × 256 and a 512 × 512 image show that DCT tree coding with postcoding is clearly superior to DCT quantization and that a variable block rate assignment gains about 3 dB over a fixed block rate assignment. The mean squared error results achieved for the tree coding system are comparable to or surpass all previous results reported for the same images.
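The idea that the rate must differ per block when every block is held to the same average distortion can be illustrated with the textbook reverse water-filling result for independent Gaussian components. The sketch below is not the paper's tree code or its autoregressive spectrum estimate; the function name, the bisection search, and the example variances are all assumptions made for illustration.

```python
import math


def reverse_water_filling(variances, target_distortion, iters=100):
    """Return (water_level, average_rate_in_bits) for independent Gaussian
    components with the given variances and a target average squared error.

    Illustrative only: this is the classical rate-distortion allocation,
    not the tree-coding procedure described in the abstract.
    """
    n = len(variances)
    mean_var = sum(variances) / n
    if not 0.0 < target_distortion <= mean_var:
        raise ValueError("target distortion must lie in (0, mean variance]")

    # Bisect for the water level theta with mean(min(theta, var)) == D.
    lo, hi = 0.0, max(variances)
    for _ in range(iters):
        theta = 0.5 * (lo + hi)
        distortion = sum(min(theta, v) for v in variances) / n
        if distortion < target_distortion:
            lo = theta
        else:
            hi = theta
    theta = 0.5 * (lo + hi)

    # Components below the water level receive no bits at all.
    rate = sum(0.5 * math.log2(v / theta) for v in variances if v > theta) / n
    return theta, rate


# Two hypothetical blocks held to the same average distortion of 1.0.
_, busy_rate = reverse_water_filling([16.0, 1.0], 1.0)
_, flat_rate = reverse_water_filling([1.0, 1.0], 1.0)
print(busy_rate, flat_rate)
```

With the same distortion target, the block with the larger variance spread needs a higher average rate than the flat block, which is the behavior a variable block rate assignment exploits.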
A method is presented to automatically inspect the block boundaries of a reconstructed two-dimensional transform coded image, to locate blocks which are most likely to contain errors, to approximate the size and type of error in the block, and to eliminate this estimated error from the picture. This method uses redundancy in the source data to provide channel error correction. No additional channel error protection bits or changes to the transmitter are required. It can be used when channel errors are unexpected prior to reception.
This paper discusses problems of adaptive transform coding schemes at bit rates of 12 kbit/s and below. Objective and subjective performance reductions, such as low-pass filtering effects, which are one of the main sources of perceptual distortion, are investigated, and proposals are made for improving the performance of the coder at low and medium bit rates. Additionally, the required transmission of side information reduces the efficiency of the scheme. Various methods to lower the rate of this supplementary data signal are given, as well as modifications of the scheme which lead to a more easily implemented coder structure.
In this study an approach for improving the performance of waveform coders, based on coding a frequency scaled speech signal, is examined and subjectively evaluated for specific subband and transform coding systems. The recently developed simple and efficient time-domain harmonic scaling (TDHS) algorithms are used to frequency scale the speech signal. The underlying frequency-domain model of the pitch-adaptive TDHS algorithms provides insight and guidelines for their use in this application, as outlined in this work. The subjective evaluation is based on an A-B comparison test involving 12 listeners and shows a meaningful improvement in quality for the waveform coders used at low bit rates. In particular, subband coding (SBC) combined with TDHS (SBC/HS) at 9.6 kbits/s was found to provide a quality equivalent to that of SBC alone at 16 kbits/s, i.e., a bit-rate advantage of about 7 kbits/s was realized. For the speech-specific adaptive transform coder (ATC) used, the combined system (ATC/HS) achieves a bit-rate advantage of 4 kbits/s at 7.2 kbits/s. The SBC/HS system emerges as a particularly attractive method for speech encoding at the data rate of 9.6 kbits/s since its quality is comparable to that of ATC/HS (or SBC at 16 kbits/s). Yet its complexity is lower than that of ATC, and the system is amenable to real-time hardware implementation using current technology.
ISBN: 9781538615010 (print)
HEVC is a new coding standard that provides higher compression ratios than other existing coding standards. In this study, image compression performance using intra prediction is examined. In particular, the coding of the transformed prediction errors is addressed, and three different coding structures are proposed. The compression efficiency is obtained for different block sizes and image sizes, depending on the transform coding format. The original images are recovered by decoding the compressed images, and their similarity to the originals is assessed using PSNR values.
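For readers unfamiliar with the PSNR measure mentioned above, a minimal sketch of the standard definition follows. It assumes 8-bit samples (peak value 255) stored as flat sequences; the function name and the sample values are hypothetical and not taken from the study.

```python
import math


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sample
    sequences. `peak` is the maximum possible sample value (255 assumes
    8-bit data, which is an assumption of this sketch)."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak * peak / mse)


# Hypothetical pixel rows: a small error gives a higher PSNR than a large one.
row = [12, 200, 45, 90]
print(psnr(row, [12, 201, 45, 90]), psnr(row, [20, 180, 60, 70]))
```

A higher PSNR indicates that the decoded image is closer to the original in the mean squared error sense.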
ISBN: 9798350344868; 9798350344851 (print)
This paper proposes a new method for pre-echo reduction in transform-based audio coding by controlling the temporal envelope of the waveform. The proposed method comprises two operating modes: temporal envelope flattening and temporal envelope correction of a target signal. The proposed method estimates signal levels with a low temporal resolution from side information using machine learning and converts them into a signal to be applied to the target signal to flatten and correct the temporal envelope. It also adjusts the signals to maintain signal continuity between the non-transient and transient frames. The proposed method differs from conventional methods in that it directly modifies the waveform before encoding and after decoding, which makes it useful as a new coding tool for legacy codecs. A subjective performance evaluation confirms that the proposed method uses fewer bits to provide sound quality equivalent to that of the short-window transform.
Author: Lin, Jianyu (Shantou Univ, Dept Elect Engn, Guangdong Prov Key Lab Digital Signal & Image Pro, Shantou, Guangdong, Peoples R China)
ISBN: 9781538611074 (print)
A new 3D transform video coding algorithm is introduced, which does not use motion compensation. More independent and highly efficient algorithms are employed for each key step of the transform coding. For the transform step, the SCWP (Spectral Condensed Wavelet Packet) is adopted. For the quantization step, a trimming process is implemented, which keeps the bit allocation function of the significant propagation technique but is independent of the entropy coding step. For the entropy coding step, a novel entropy coding technique based upon binary run-length coding is proposed. This binary entropy coding can be applied to multiple symbol source coding, and it approaches an optimal efficiency bound which is within 1.5% of the source entropy when the source approaches iid. In principle, the complexity of the proposed transform video coding algorithm is comparable to that of a 2D still image transform coding algorithm. However, its compression performance is competitive with HEVC at high compression bitrates.
ISBN: 1424407281 (print)
In this paper we present a new model-based method to code the transform coefficients of audio signals. The histogram of transform coefficients is approximated by a generalized Gaussian model for efficient model-based bit allocation and the spectrum is coded by scalar quantization followed by arithmetic coding. An example coder operating at 16 kHz and using predictive modified discrete cosine transform (MDCT) coding is described. We compare the performance of the proposed coder with ITU-T G.722.1. Objective and subjective quality results are presented. The proposed coder is better than ITU-T G.722.1 at 24 kbit/s and equivalent at 32 kbit/s.
ISBN: 9781509021758 (print)
Transform coding is widely used in video and image codecs to largely remove spatial correlation. The magnitude of a transform coefficient is weakly correlated with a number of factors, including its frequency band, the neighboring coefficient magnitudes, luma/chroma planes, etc. To exploit such correlations for efficient entropy coding, one would build a probability model conditioned on the available contexts. However, the interaction of these factors creates a high-dimensional space, a direct use of which would easily lead to over-fitting. How to construct a compact context set which effectively captures the underlying correlations remains a major challenge in video and image compression. Prior research work primarily relies on bucketizing the previously coded coefficients into a small number of categories as the context model for the next coefficient. Some information loss is inevitable due to the classification process. To fully exploit the available context in a limited model space, a level map approach is proposed in this work. It decomposes the coding of coefficient magnitudes into consecutive runs of binary map coding, each corresponding to whether a coefficient is equal to or greater than the given level. Under the Markov assumption across the levels, nearly all the reference symbols available to each level map can be approximated as binary random variables. This allows the context model to account for all the surrounding coefficient information provided by the lower level maps, while retaining a reasonably compact size. Experimental evidence demonstrates that the proposed coding scheme provides considerable compression performance gains consistently across a large set of test settings.
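The decomposition of magnitudes into per-level binary maps can be sketched as follows. This is a simplified illustration under stated assumptions, not the codec's actual procedure: the function names, the flat coefficient list, and the choice of `max_level` are hypothetical, and the context modeling and entropy coding of each map are omitted entirely.

```python
def level_maps(magnitudes, max_level):
    """Split coefficient magnitudes into binary maps, one per level.

    Map k (for k = 1..max_level) holds 1 where the magnitude is equal to
    or greater than k, and 0 elsewhere. Magnitudes above max_level are
    not fully described by the maps; how the remainder is coded is left
    out of this sketch.
    """
    return [[1 if m >= k else 0 for m in magnitudes]
            for k in range(1, max_level + 1)]


def magnitudes_from_maps(maps):
    """Recover magnitudes (clipped at the number of maps) by summing
    the binary maps position by position."""
    return [sum(column) for column in zip(*maps)]


# Hypothetical scan of coefficient magnitudes from one block.
coeffs = [0, 3, 1, 2, 0, 1]
maps = level_maps(coeffs, max_level=3)
for level, bitmap in enumerate(maps, start=1):
    print(level, bitmap)
print(magnitudes_from_maps(maps))
```

Each map is purely binary, so when map k is coded the neighboring entries of the lower maps are available as binary context symbols, which is the property the abstract relies on to keep the context model compact.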
ISBN: 9781457713033 (print)
In this paper, we present the One-Dimensional Directional Unified Transform (1DDUT) for intra coding. Intra prediction in H.264/AVC has eight different directional prediction modes and a DC prediction mode. Since the statistical characteristics of prediction residuals are different in each prediction mode, coding efficiency can be improved by applying a different optimal transform for each prediction mode. However, in order to avoid an increase in complexity, unification of the transforms is desired. Therefore we classify the characteristics of the prediction residuals along the vertical or the horizontal direction, where each direction has two classes and their characteristics are similar to each other between directions. It is sufficient to use two kinds of 1-D transforms that are designed based on the unified characteristics. One is the well-known Discrete Cosine Transform (DCT), and the other is a predetermined 1-D transform based on the Karhunen-Loeve transform method. 1DDUT switches between the DCT and the predetermined 1-D transform according to the intra prediction mode. 1DDUT can achieve an 8.71% bitrate reduction on average with a negligible complexity increase compared to the DCT-like transform used in H.264/AVC.