Although future frame prediction in videos is a relatively young unsupervised learning task, it has shown promise in helping networks learn efficient internal representations in a visual hyperspace. The Predictive Coding Network (PredNet) uses future frame predictions as a learning signal and builds on a legacy of unconscious inference, free energy, and the predictive coding model of the visual cortex, yet it is still a relatively young network compared to RNNs, CNNs, and so on. Rao and Ballard’s proposed predictive coding (PC) model aims to reduce the redundancy within the internal representations learned by a network. Although Lotter et al.’s design of PredNet might not be an ideal replication of the PC model, it still shows promise for learning better, less redundant internal representations than other networks. In this paper, we augment PredNet to enhance its performance in future frame prediction. Additionally, we introduce a new measure, the gradient difference error (GDE), based on the gradient difference loss (GDL) function proposed by Mathieu et al. We do this to adapt the GDL function to the context of PredNet, since it uses an implicit loss function besides the explicit loss used during training. Our experimental results show that PredNet, when using a combination of the L1 loss function with GDE or GDL, converges faster to its best performance while trading off minimal prediction quality within a given training window. In doing so, we transform PredNet into Gradient Difference-PredNet (GD-PredNet), and we aim to encourage increased research in predictive coding and PredNet.
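For orientation, the gradient difference loss of Mathieu et al. compares the absolute neighbour-to-neighbour intensity differences of the target frame with those of the predicted frame, so that blurred edges are penalised. The following is a minimal pure-Python sketch of that general idea for a single grayscale frame. It is not the GDE measure of this paper, whose definition the abstract does not give, and the function names `gradient_difference_loss` and `l1_loss` are illustrative choices.

```python
def gradient_difference_loss(target, pred, alpha=1.0):
    """Sum over pixels of | |grad target| - |grad pred| | ** alpha.

    Gradients are differences to the upper and to the left neighbour.
    `target` and `pred` are equally sized lists of rows (grayscale).
    """
    rows, cols = len(target), len(target[0])
    loss = 0.0
    for i in range(rows):
        for j in range(cols):
            if i > 0:  # vertical neighbour difference
                g_t = abs(target[i][j] - target[i - 1][j])
                g_p = abs(pred[i][j] - pred[i - 1][j])
                loss += abs(g_t - g_p) ** alpha
            if j > 0:  # horizontal neighbour difference
                g_t = abs(target[i][j] - target[i][j - 1])
                g_p = abs(pred[i][j] - pred[i][j - 1])
                loss += abs(g_t - g_p) ** alpha
    return loss


def l1_loss(target, pred):
    """Plain sum of absolute pixel errors, for comparison."""
    return sum(abs(t - p) for rt, rp in zip(target, pred)
               for t, p in zip(rt, rp))


# A sharp vertical edge, a blurred prediction of it, and a prediction
# that is merely brighter by a constant offset.
sharp = [[0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0]]
blurred = [[0.0, 0.25, 0.75, 1.0],
           [0.0, 0.25, 0.75, 1.0]]
shifted = [[v + 0.5 for v in row] for row in sharp]

gdl_blurred = gradient_difference_loss(sharp, blurred)
gdl_shifted = gradient_difference_loss(sharp, shifted)
l1_shifted = l1_loss(sharp, shifted)
```

In this toy case the blurred prediction is penalised by the gradient term, while the constant brightness offset is invisible to it and is caught only by the L1 term, which is one way to see why the two losses are combined.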
Summary form only given. A compression algorithm for high-quality speech signals using predictive coding techniques is developed. Code-excited linear predictive coding (CELPC) is one of the key techniques for compressing speech signals to a bit rate of around 4.8 kbps. However, CELPC has a heavy computational requirement, and speech signals can usually be divided into two portions, namely the base-band and the high-band frequency ranges. A hybrid CELPC and voice-excited linear predictive coding (VELPC) scheme is therefore developed for speech coding to reduce the complexity of the original CELPC. In the algorithm, a speech signal is first divided in the frequency domain into two portions, the base-band and the high-band; the base-band portion is then coded with CELPC and the high-band portion with VELPC. Test experiments showed that the new coder can produce synthesized speech of good quality at a lower bit rate than the original CELPC. When using these coding methods for the base-band and high-band signals, we must decide how to divide the speech signal into two portions. In choosing the bandwidth of the base-band signal, there is a trade-off between coding quality and bit rate. In our experiment, the bandwidth of the base-band signal is chosen as one fourth of that of the original speech. Subjective evaluation experiments were conducted to test the performance of the hybrid CELPC and VELPC technique. For speech signals sampled at 8 kHz, a bit rate of 4.0 kbps can be achieved with frame intervals of 23 ms. The experimental results showed that the quality of the synthesized speech using the hybrid coding technique at a bit rate of 4.0 kbps was almost the same as that of CELPC at a bit rate of 4.8 kbps.
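The figures quoted above imply a per-frame bit budget and a base-band width that can be worked out directly. The short sketch below does that arithmetic; it assumes that "the bandwidth of the original speech" means the Nyquist bandwidth (half the sampling rate), which the abstract does not state, and the helper name `bits_per_frame` is illustrative.

```python
def bits_per_frame(bit_rate_bps, frame_ms):
    """Bits available for one frame at the given rate and frame interval."""
    return bit_rate_bps * frame_ms / 1000


# Hybrid CELPC/VELPC coder: 4.0 kbps with 23 ms frames.
hybrid_bits = bits_per_frame(4000, 23)

# Assumption: the original speech occupies the Nyquist bandwidth.
sampling_rate_hz = 8000
full_band_hz = sampling_rate_hz / 2
base_band_hz = full_band_hz / 4      # "one fourth" of the original bandwidth

# Relative bit-rate saving against CELPC at 4.8 kbps.
saving = (4800 - 4000) / 4800

print(f"{hybrid_bits:.0f} bits per frame")
print(f"{base_band_hz:.0f} Hz base-band")
print(f"{saving:.1%} lower bit rate")
```

Under these assumptions each 23 ms frame carries 92 bits, the CELPC-coded base-band is 1 kHz wide, and the hybrid coder uses about one sixth less bit rate than CELPC at 4.8 kbps.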
ISBN (print): 9781479957521
In this paper we propose an efficient architecture for onboard implementation of rate-controlled predictive lossy compression of hyperspectral and multispectral images. In particular, we consider the recent state-of-the-art rate control algorithm for onboard predictive compression [1], and propose an architecture addressing two fundamental aspects of its hardware implementation. Specifically, this architecture overcomes the serial nature of the algorithm, as well as the large memory requirements of the entropy coding stage, achieving a pipelined implementation suitable for high-throughput onboard implementation, at a negligible cost in terms of coding efficiency.
ISBN (digital): 9781728157849
ISBN (print): 9781728157856
With the emergence of Light Field (LF) technology, the number of dimensions representing light has once again increased. 4D light fields captured with additional temporal information per ray or as assemblies of rays include a 5th dimension, namely time, and thus produce 5D light fields. This is crucial when there are moving objects in the scene. In recent years, research has paved the way for several ideas on efficient 4D light field compression. However, compression and storage techniques for higher dimensions remain an open challenge. In this paper we introduce a low-complexity predictive coding of 5D light fields through automatic generation of a per-frame customized coding structure that exploits both spatial and temporal neighbors. Evaluations with the HEVC codec show a quality gain of more than 1.4 dB.
This paper presents a switched predictive coding method for lossless compression of video. In the proposed method, a set of switched predictors is found by a training process that uses only a small number of successive frames of a video, and the trained predictors are then used with a large number of frames of the video. To find the predictors, the pixels of the successive frames are first classified based on an estimate of the activity level in their neighbouring pixels, and then LS-based feedback-type predictors are estimated for all the pixels belonging to each of the classes. We propose a total of 21 classes, which are obtained by combining the seven slope bins of the gradient adjusted predictor (GAP) with three classified temporal contexts. After collecting the predictors for pixels belonging to each of the 21 classes, the best predictor, in terms of minimum zero-order entropy, is chosen to represent the various classes. Simulation results show that applying this set of predictors gives performance competitive with LOPT, one of the best methods in terms of achievable compression ratio. Our method and LOPT have the same order of coding complexity, while our decoder is computationally very simple in contrast to the high complexity of the LOPT-based decoder.
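The 21-class structure described above can be sketched as a small classifier. The sketch assumes the gradient estimates and the slope thresholds (80, 32, 8) of the standard GAP from CALIC, which yield seven bins; the abstract does not confirm that the same thresholds are used, and it does not say how the three temporal contexts are derived, so `temporal_ctx` is simply taken as a given integer in 0..2. All function names are illustrative.

```python
def gap_gradients(nb):
    """Horizontal and vertical activity estimates of the standard GAP.

    `nb` maps the causal neighbour names W, WW, N, NW, NE, NN, NNE
    to pixel values.
    """
    dh = (abs(nb["W"] - nb["WW"]) + abs(nb["N"] - nb["NW"])
          + abs(nb["N"] - nb["NE"]))
    dv = (abs(nb["W"] - nb["NW"]) + abs(nb["N"] - nb["NN"])
          + abs(nb["NE"] - nb["NNE"]))
    return dh, dv


def gap_slope_bin(dh, dv):
    """Map dv - dh to one of seven slope bins (0..6).

    Thresholds are the usual CALIC values (an assumption here).
    """
    d = dv - dh
    if d > 80:
        return 0   # sharp horizontal edge
    if d > 32:
        return 1   # horizontal edge
    if d > 8:
        return 2   # weak horizontal edge
    if d >= -8:
        return 3   # smooth region
    if d >= -32:
        return 4   # weak vertical edge
    if d >= -80:
        return 5   # vertical edge
    return 6       # sharp vertical edge


def class_index(slope_bin, temporal_ctx):
    """Combine 7 slope bins with 3 temporal contexts into 21 classes."""
    if not (0 <= slope_bin < 7 and 0 <= temporal_ctx < 3):
        raise ValueError("slope_bin must be 0..6 and temporal_ctx 0..2")
    return slope_bin * 3 + temporal_ctx


# Example neighbourhood: the row two lines above is much brighter.
neighbours = {"W": 10, "WW": 10, "N": 10, "NW": 10,
              "NE": 10, "NN": 200, "NNE": 200}
dh, dv = gap_gradients(neighbours)
example_class = class_index(gap_slope_bin(dh, dv), temporal_ctx=1)
```

Each class index would then select one trained predictor, so both encoder and decoder need only this cheap classification per pixel once training is done.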
We present a novel method for predictive coding with application to transmission of speech over packet-switched networks. Our method uses multiplexing to distribute a part of the information about a segment of each speech signal over several data packets while keeping the data packet rate and the payload for that part of the information unchanged. We investigate three multiplexing schemes: packet hopping, Hadamard multiplexing, and an extension of Hadamard multiplexing that exploits a nonlinear preprocessing and estimation method. We show by means of formal AB-preference tests that multiplexed predictive coding can lead to coders that are more robust to packet losses than scalar quantization and packet loss concealment according to the G.711 standard.
ISBN (print): 9781424438273
In this paper, the architecture of a digital pixel sensor (DPS) array with an online 1-bit predictive coding algorithm using a Hilbert scanning scheme is proposed. The architecture reduces the silicon area of the DPS by more than half by sampling and storing the differential values between each pixel and its prediction; these values have a compressed dynamic range and hence require limited precision (only a 1-bit signed value in the proposed architecture, compared to an 8-bit unsigned full-precision value). Hilbert scanning is used to read out the pixel values, which avoids discontinuities in the read-out path and is shown to improve the quality of the reconstructed image. The Hilbert scanning path is implemented entirely through hardwired connections, without increasing the circuit complexity of the sensor array. Reset pixels are inserted into the scanning path to overcome the error accumulation problem inherent in predictive coding. System-level simulation results show that a PSNR of around 25 dB can be reached with the proposed 1-bit Hilbert predictive coding algorithm. VLSI implementation results illustrate a pixel-level implementation featuring a pixel size reduction of 67% with a fill-factor of 40% compared with a standard PWM DPS architecture.
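The scheme described above can be illustrated in software as a delta-modulation-style coder that walks the image along a Hilbert curve. The following is a sketch only: the predictor (the previously reconstructed pixel on the path), the fixed step size, and the reset interval are all assumptions, since the abstract does not specify them, and the names `hilbert_d2xy`, `encode`, and `decode` are illustrative.

```python
def hilbert_d2xy(n, d):
    """Convert distance d along a Hilbert curve to (x, y) on an n x n grid.

    n must be a power of two. Standard iterative construction.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:              # rotate the current quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def _clamp(v):
    return max(0, min(255, v))


def encode(image, step=8, reset_interval=16):
    """Emit one sign bit per pixel, plus a full value at each reset pixel."""
    n = len(image)
    symbols, recon = [], 0
    for d in range(n * n):
        x, y = hilbert_d2xy(n, d)
        pixel = image[y][x]
        if d % reset_interval == 0:      # reset pixel: stop error build-up
            symbols.append(("reset", pixel))
            recon = pixel
        else:                            # 1-bit sign of the prediction error
            bit = 1 if pixel >= recon else 0
            symbols.append(("bit", bit))
            recon = _clamp(recon + (step if bit else -step))
    return symbols


def decode(symbols, n, step=8):
    """Rebuild the image by replaying the same predictor along the path."""
    image = [[0] * n for _ in range(n)]
    recon = 0
    for d, (kind, value) in enumerate(symbols):
        if kind == "reset":
            recon = value
        else:
            recon = _clamp(recon + (step if value else -step))
        x, y = hilbert_d2xy(n, d)
        image[y][x] = recon
    return image


# Round trip on a flat 8 x 8 test image.
flat = [[100] * 8 for _ in range(8)]
decoded = decode(encode(flat), 8)
```

Because consecutive points on the Hilbert path are always spatial neighbours, the previous-pixel predictor never jumps across the image as it would at the end of each row in a raster scan, which is the discontinuity the abstract refers to.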
The properties of several multi-dimensional quantizers (VQ, tree and trellis coders) have been investigated. The multi-dimensional quantizers yield superior distortion performance for direct quantization of stationary sources. Incorporated in a predictive speech coding scheme, NFC, the use of multi-dimensional quantizers improves speech quality noticeably. A subjective test indicates that NFC with trellis coding is a promising candidate for speech coding at 16 kbit/s.
ISBN (print): 0780331923
This paper presents a novel wideband speech coding algorithm called transform predictive coding (TPC). The main emphasis is on low complexity. TPC uses short-term and long-term prediction to remove the redundancy in speech. The prediction residual is quantized in the frequency domain based on a calculated noise masking threshold. In its simplest form, the TPC coder uses only open-loop quantization and therefore has a low complexity. A 16 kb/s full-duplex, open-loop TPC coder takes only 22% of the CPU load on a 150 MHz SGI Indy workstation and about 34% on a 90 MHz Pentium PC. The speech quality of TPC is almost transparent at 32 kb/s, very good at 24 kb/s, and acceptable at 16 kb/s. In the second half of the paper, we report our recent progress in using closed-loop quantization techniques to improve TPC output speech quality.
ISBN (print): 0780329120
In existing predictive image coding systems, the prediction error is quantized according to the criterion of minimizing the mean square value of the coding error. With this quantization scheme, the reduction of the coding error is restricted by the number of quantization levels and the amplitude distribution of the prediction error, and it is hard to place more of the coding error in less visible areas. In this paper, we present a new quantization scheme to improve the performance of the predictive image coding technique. In the proposed predictive coding system, instead of the prediction error, a contrast measure is generated from the predicted value and the original value of the current image pixel, and this measure is then quantized based on an intuitive concept. Thus, a desirable weighting function is introduced into the coding algorithm such that the coding error is adapted by the pixel intensity and its contrast measure.