This paper presents a new DCT-based still image coding scheme using block prediction. Unlike a traditional transform coder, where subimage blocks of the image are coded independently, the proposed scheme uses a block matching predictor which finds the best match (based on an error criterion) in a search window that consists of a previously coded image region. In this coding scheme, the block predictor is first used to find a prediction vector representing the relative location of the subimage in the search window whose pixel values are similar to those of the current subimage; the differential error residual between the current and predicted subimages is then compressed using DCT coding. The quantized DCT coefficients, along with the prediction vectors, are transmitted or stored for reconstruction by the decoder. This paper develops this coding technique and compares its performance with traditional DCT coding of images.
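The prediction-then-transform step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sum of absolute differences as the error criterion, the uniform quantizer step, raster-order block coding, and the names `dct_matrix`, `predict_block` and `code_block` are all assumptions made for illustration.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis; row k holds the k-th basis vector."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def predict_block(recon, block, top, left, search):
    """Find the best-matching block in the already coded region.

    Assumes blocks are coded in raster order, so a candidate is usable
    only if it lies fully above the current block row or fully to the
    left of the current block. Error criterion: sum of absolute differences.
    """
    size = block.shape[0]
    best_vec, best_pred, best_err = None, None, np.inf
    for dy in range(-search, 1):
        for dx in range(-search, search + 1):
            r, c = top + dy, left + dx
            if r < 0 or c < 0 or c + size > recon.shape[1]:
                continue  # candidate falls outside the image
            if not (r + size <= top or c + size <= left):
                continue  # candidate overlaps pixels not yet coded
            cand = recon[r:r + size, c:c + size]
            err = np.abs(block - cand).sum()
            if err < best_err:
                best_vec, best_pred, best_err = (dy, dx), cand, err
    return best_vec, best_pred

def code_block(recon, block, top, left, search=8, step=8.0):
    """Return (prediction vector, quantized DCT residual, reconstruction)."""
    size = block.shape[0]
    vec, pred = predict_block(recon, block, top, left, search)
    if pred is None:
        pred = np.zeros_like(block)  # no coded region yet: plain DCT coding
    d = dct_matrix(size)
    coeffs = d @ (block - pred) @ d.T          # 2-D DCT of the residual
    q = np.round(coeffs / step).astype(int)    # uniform quantization
    rec = pred + d.T @ (q * step) @ d          # what the decoder rebuilds
    return vec, q, rec
```

The decoder needs only the prediction vector and the quantized coefficients, because it can repeat the same lookup in its own reconstruction. Searching the reconstruction rather than the original image keeps encoder and decoder in step.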
Authors: Z. He, H. Li (DSP Division, Radio Engineering Department, South-East University, Nanjing, China)
A multilayer neural network used for nonlinear predictive image coding is described. Two coding schemes, nonadaptive and adaptive, are shown. Because it matches the local properties of the image, nonlinear predictive coding performs better than linear predictive coding. A series of computer experiments shows that the method not only generalizes but also reduces noise. Compared with differential pulse code modulation (DPCM), it greatly reduces the number of bits to be transmitted.
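For context, the DPCM baseline mentioned in this abstract can be sketched as a closed-loop coder with a pluggable predictor. This is a generic illustration, not the paper's network: the function names, the previous-sample predictor and the uniform quantizer step are assumptions. A nonlinear predictor (for example a trained network) would be passed in place of `previous_sample`.

```python
def previous_sample(history):
    """Simplest linear predictor: repeat the last reconstructed sample."""
    return history[-1] if history else 0.0

def dpcm_encode(signal, predictor, step):
    """Quantize prediction errors; predict from reconstructed samples only."""
    indices, recon = [], []
    for x in signal:
        p = predictor(recon)
        idx = int(round((x - p) / step))  # quantized prediction error
        indices.append(idx)
        recon.append(p + idx * step)      # same value the decoder will form
    return indices, recon

def dpcm_decode(indices, predictor, step):
    """Rebuild the signal from quantizer indices alone."""
    recon = []
    for idx in indices:
        recon.append(predictor(recon) + idx * step)
    return recon
```

Because the encoder predicts from its own reconstruction rather than from the original samples, quantization error does not accumulate: each reconstructed sample stays within half a quantizer step of the input.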
This paper considers the problem of predictive fusion coding for storage of multiple spatio-temporally correlated sources so as to enable efficient selective retrieval of data from subsets of sources as designated by future queries. Only statistical information about future queries is available during encoding. While temporal correlations can be exploited by coding over large blocks, the growth in encoding complexity renders this approach impractical, hence the interest in a low-complexity predictive coding approach. However, the design of optimal predictive fusion coding systems is considerably complicated by the presence of the prediction loop and by the potentially exponential growth of the query sets. We propose a complexity-constrained predictive fusion coder and derive an iterative algorithm for its design, which is based on the "Asymptotic Closed Loop" framework and hence circumvents the convergence and stability issues of traditional predictive quantizer design. The proposed predictive fusion coder optimizes the tradeoff between distortion and retrieval rate, given a fixed storage capacity, and provides significant gains over storage schemes that perform only joint compression or memoryless fusion coding of all sources.
In this paper, we evaluate the performance of the Discrete Fourier Transform (DFT) with small overlap in transform-predictive-coding-based coders, e.g. Transform Coder eXcitation (TCX) and Transform Predictive Coding (TPC). Three different time-frequency analysis techniques are compared: the DFT, the Modified Discrete Cosine Transform (MDCT) with 50% overlap, and the DFT with 3% overlap. We show that, in a transform-predictive-coding-based coder, a DFT with small overlap is more attractive than a simple DFT or a high-frequency-resolution MDCT.
Studies adaptive predictive coding of digitized images using novel two-dimensional multiplicative autoregressive (MAR) models. A general stability theorem for 2-D MAR image models with causal nonsymmetric half-plane (NSHP) supports is presented, and two different models are introduced. A major advantage of using 2-D MAR models, as opposed to general 2-D models, is the ability to guarantee predictor stability for NSHP support regions. The performance of 2-D MAR predictive coders is tested on five different images. The results of these studies demonstrate that, for coding of digitized images, 2-D MAR model-based predictive coders offer an effective alternative to currently available techniques.
A new image coder is presented in which an image is divided into blocks, each block is quadtree segmented, and each segment is coded using a form of predictive coding. We provide nearly-optimal segmentation and quantization rules for this framework, as well as an iterative codebook design algorithm. In simulations, our new system comes within 1 dB of JPEG 2000 and outperforms JPEG by about 1.5 dB at moderate rates and up to 4.5 dB at low rates. Additionally, our method captures edges better and has potentially lower complexity than transform-based methods.
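The quadtree segmentation step in this abstract can be sketched as a recursive split of a square block. The abstract does not state the paper's nearly-optimal segmentation rule, so the variance threshold used below is a stand-in assumption, as are the function name `quadtree` and its parameters.

```python
import numpy as np

def quadtree(block, top=0, left=0, threshold=25.0, min_size=2):
    """Split a square block into leaves given as (top, left, size).

    Assumed rule: keep splitting into four quadrants while the pixel
    variance exceeds `threshold` and the block is larger than `min_size`.
    """
    size = block.shape[0]
    if size <= min_size or block.var() <= threshold:
        return [(top, left, size)]
    half = size // 2
    leaves = []
    for dr in (0, half):
        for dc in (0, half):
            sub = block[dr:dr + half, dc:dc + half]
            leaves += quadtree(sub, top + dr, left + dc, threshold, min_size)
    return leaves
```

Each returned leaf would then be handed to the segment coder. Smooth regions end up as a few large leaves, while regions containing edges are divided into smaller ones.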
This paper presents a new method for efficient image coding, consisting of a cascade of the following processing stages: i) predictive ordering technique (POT); ii) feedback transform coding (FTC); iii) vertical subtraction of quantized coefficients (VSQC); iv) predictive coding refinements in the signal space, consisting of either overshoot suppression (OS) as a first variant or hybrid block truncation coding (HBTC) as a second one. The POT algorithm uses the vertical correlation between adjacent pels to change the relative order of elements along a scan line, putting them in decreasing order of amplitude (taking as reference the previously received scan line); this ordering concentrates the signal energy into "low generalized frequency" regions. The FTC method is an iterative procedure that increases, by a given step, the number of nonzero elements of the orthogonally transformed picture vector until the mean square error criterion is satisfied (the error being the difference between the original image vector and the last reconstructed iteration). The VSQC computes the discrete differences between quantized transform coefficients of the same order belonging to adjacent scan lines. The OS algorithm detects and eliminates the spatial reconstruction errors whose absolute values exceed a given threshold. The HBTC uses a one-bit nonparametric quantization of the error vector representing the difference between the original picture vector and the last reconstructed vector (satisfying the FTC criterion), so that the first two sample moments are preserved. The reconstructed pictures are presented with their coding fidelity performances (mean square quantization error, mean absolute error and signal-to-noise ratio), using a portrait and a LANDSAT image as test pictures. Good-quality images have been obtained at low bit rates (0.55-1.1 bits/pixel).
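The one-bit, moment-preserving quantization mentioned for HBTC can be illustrated with the classic block truncation coding quantizer. This sketch shows standard BTC applied to a generic vector, not the paper's hybrid variant; the function names and the choice of the sample mean as threshold are assumptions.

```python
import numpy as np

def btc_encode(x):
    """One bit per sample plus two levels chosen to keep mean and variance."""
    x = np.asarray(x, dtype=float)
    mean, std = x.mean(), x.std()   # population moments (ddof=0)
    bits = x >= mean                # one-bit decision per sample
    m, q = x.size, int(bits.sum())
    if q == 0 or q == m:            # flat vector: a single level suffices
        return bits, mean, mean
    low = mean - std * np.sqrt(q / (m - q))
    high = mean + std * np.sqrt((m - q) / q)
    return bits, low, high

def btc_decode(bits, low, high):
    """Map each bit back to its reconstruction level."""
    return np.where(bits, high, low)
```

With these two levels, the reconstructed vector has the same sample mean and the same sample variance as the input, which is the sense in which the first two sample moments are preserved.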
An algorithm for designing linear prediction-based two-channel multiple-description predictive-vector quantizers (MD-PVQs) for packet-loss channels is presented. This algorithm iteratively improves the encoder partition, the set of multiple description codebooks, and the linear predictor for a given channel loss probability, based on a training set of source data. The effectiveness of the designs obtained with the given algorithm is demonstrated using a waveform coding example involving a Markov source, as well as vector quantization of speech line spectral pairs.
Neural audio/speech coding has recently demonstrated its capability to deliver high quality at much lower bitrates than traditional methods. However, existing neural audio/speech codecs employ either acoustic features or learned blind features with a convolutional neural network for encoding, by which there are still temporal redundancies within encoded features. This article introduces latent-domain predictive coding into the VQ-VAE framework to fully remove such redundancies and proposes the TF-Codec for low-latency neural speech coding in an end-to-end manner. Specifically, the extracted features are encoded conditioned on a prediction from past quantized latent frames so that temporal correlations are further removed. Moreover, we introduce a learnable compression on the time-frequency input to adaptively adjust the attention paid to main frequencies and details at different bitrates. A differentiable vector quantization scheme based on distance-to-soft mapping and Gumbel-Softmax is proposed to better model the latent distributions with rate constraint. Subjective results on multilingual speech datasets show that, with low latency, the proposed TF-Codec at 1 kbps achieves significantly better quality than Opus at 9 kbps, and TF-Codec at 3 kbps outperforms both EVS at 9.6 kbps and Opus at 12 kbps. Numerous studies are conducted to demonstrate the effectiveness of these techniques.
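One possible reading of the "distance-to-soft mapping and Gumbel-Softmax" quantizer in this abstract is sketched below in plain NumPy. It is an interpretation, not the TF-Codec implementation: using negative squared distances as logits, the temperature `tau`, and the function name `gumbel_softmax_vq` are all assumptions.

```python
import numpy as np

def gumbel_softmax_vq(z, codebook, tau=1.0, rng=None):
    """Soft-assign latent vector z to codewords with Gumbel-Softmax.

    Assumed mapping: logits are negative squared distances to each
    codeword, so nearer codewords receive higher probability.
    """
    rng = np.random.default_rng() if rng is None else rng
    dist = ((codebook - z) ** 2).sum(axis=1)
    logits = -dist
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))       # Gumbel(0, 1) noise
    y = (logits + gumbel) / tau
    y = np.exp(y - y.max())            # numerically stable softmax
    weights = y / y.sum()
    return weights, weights @ codebook  # soft assignment, soft codeword
```

In a training framework the soft weights would let gradients reach both the encoder and the codebook. As `tau` decreases, the weights approach a one-hot selection of a single codeword.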
The current study investigated possible human-robot kinaesthetic interactions using a variational recurrent neural network (RNN) model, called PV-RNN, which is based on the free-energy principle. Our prior robotic studies using PV-RNN showed that the nature of interactions between the top-down expectation and bottom-up inference is strongly affected by a parameter, called the meta-prior, which regulates the complexity term in free energy. The current study examines how changing the meta-prior w in the interaction phase affects the counter force generated when an experimenter attempts to induce movement pattern transitions familiar to the robot through its prior training. The study also compares the counter force generated when trained transitions are induced by a human experimenter and when untrained transitions are induced. Finally, the study examines how different levels of disturbances, pattern level, and cognitive level can be resolved internally across different layers. Our experimental results indicated that 1) the human experimenter needs more/less force to induce trained transitions when w is set to larger/smaller values; 2) the human experimenter needs more force to act on the robot when he attempts to induce untrained as opposed to trained movement pattern transitions; and 3) when the robot was disturbed at the cognitive level, the disturbance was resolved more in the higher layer than when it was disturbed at the pattern level. Our analysis of the time development of essential variables and values in PV-RNN during bodily interaction clarified the mechanism by which gaps in action intentions between the human experimenter and the robot can be manifested as reaction forces between them.