This paper develops error concealment methods for multiple description video coding (MDC) to adapt to error-prone packet networks. The three-loop slice group MDC approach of D. Wang et al. (2005) is used. MDC is well suited to multiple-channel environments and, in particular, can maintain acceptable quality when some of these channels fail completely (an on-off MDC environment) without any drift problem. Our MDC scheme, coupled with the proposed concealment approaches, proved suitable not only for the on-off MDC case (all data from one channel lost) but also for the case where only some packets are lost from one or both channels. Copying video from correctly received descriptions and reusing their motion vectors are combined for concealment before traditional methods are applied. Results are compared with the traditional error concealment method in the H.264 reference software and show significant improvements for both the balanced and unbalanced channel cases.
ISBN: 0780388348 (print)
In motion-compensated video coding schemes such as MPEG, an I frame is normally followed by several P frames and possibly B frames in a group of pictures (GOP). In error-prone environments, errors in earlier frames of a GOP may propagate to all following frames until the next I frame, which begins the next GOP. In this paper, we propose a novel GOP structure for robust transmission of MPEG video bitstreams. By selecting the optimal position of the I frame in a GOP, robustness can be achieved without any loss of coding efficiency. Experimental results demonstrate the robustness of the proposed GOP structure.
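The propagation effect described in this abstract can be illustrated with a toy model. The sketch below is not the proposed GOP structure, which the abstract does not detail. It assumes a conventional GOP with one leading I frame followed only by P frames, a single error, and no concealment, and the function names are hypothetical.

```python
def corrupted_frames(gop_size: int, hit_index: int) -> int:
    """Frames damaged when a single error hits frame `hit_index`.

    Assumption: frame 0 is the I frame and every later frame is a P frame
    predicted from its predecessor, so the error persists to the GOP end.
    """
    if not 0 <= hit_index < gop_size:
        raise ValueError("hit_index must lie inside the GOP")
    return gop_size - hit_index


def expected_corrupted(gop_size: int) -> float:
    """Average damage if the error position is uniform over the GOP."""
    total = sum(corrupted_frames(gop_size, k) for k in range(gop_size))
    return total / gop_size


if __name__ == "__main__":
    # An error in the I frame damages the whole GOP; one in the last frame
    # damages only that frame.
    print(corrupted_frames(12, 0), corrupted_frames(12, 11))
    print(expected_corrupted(12))
```

Under these assumptions the damage depends only on how far the hit frame is from the next I frame, which is why the position of the I frame matters.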
The wavelet-based video codec is capable of providing scalable bitstreams such that quality of service (QoS) among heterogeneous networks can be manipulated in a unified framework. The three-dimensional wavelet video coder (3D WVC) was developed to provide such a fully scalable bitstream. However, quality control or rate allocation was performed at the scale of one GOP, so the reconstructed pictures suffer temporal quality fluctuations. A theoretical coding model provided ratio parameters among subbands to smooth temporal picture quality, but it needs further adaptation to deal with a practical WVC. We propose to investigate the rate-distortion relation between the decomposed subbands (Psi) and the reconstructed pictures in one GOP (P), under which rate allocation for constant quality (RACQ) can be operated. Experiments show that the temporal quality fluctuations are 10-30 times smaller than with 3D-SPIHT and 5-10 times smaller than with the ratio parameter approach.
An alternative to scalable predictive coding of first-order Gauss-Markov processes is proposed in this paper. It is shown that conventional scalable predictive coding is inherently suboptimal. An alternative that achieves the rate-distortion performance of predictive coding for first-order Gauss-Markov processes is then proposed. The proposed approach is posed as a variant of the well-known Wyner-Ziv (1976) problem. By using coset codes with nested lattices, the paper proves that the proposed approach achieves the predictive coding bound asymptotically at all scales while simultaneously providing the functionality of scalable coding.
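To give a feel for coset codes with nested lattices, here is a minimal one-dimensional sketch. It is not the construction analysed in the paper. It assumes a fine scalar lattice with step `q`, a coarse lattice with step `M * q`, and decoder side information `y` that is close to the source sample `x`. All names and parameter values are hypothetical.

```python
def encode(x: float, q: float, M: int) -> int:
    """Quantise x on the fine lattice and send only its coset index (0..M-1)."""
    fine_index = round(x / q)
    return fine_index % M


def decode(coset: int, y: float, q: float, M: int) -> float:
    """Pick the member of the received coset that lies closest to y."""
    k = round((y / q - coset) / M)   # which coarse-lattice shift is nearest
    return (coset + M * k) * q


if __name__ == "__main__":
    q, M = 0.5, 8
    x, y = 10.3, 10.9                # y plays the role of a prediction of x
    index = encode(x, q, M)          # only log2(M) bits are transmitted
    x_hat = decode(index, y, q, M)
    print(index, x_hat)
```

Decoding recovers the fine-lattice point as long as the side information is within roughly half a coarse step of the quantised value. The encoder never needs to see `y`, which is the Wyner-Ziv setting the abstract refers to.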
ISBN: 9810475241 (print)
We present a new architecture called the Modular Neural Predictive Coding architecture (Modular NPC). This architecture is used for speech discriminant feature extraction (DFE). We present an application of the Modular NPC architecture to a phoneme recognition task. The phonemes, extracted from the DARPA-TIMIT speech database, are vowels, /b/-/d/-/g/ and /p/-/t/-/k/. Comparisons with coding methods (LPC, MFCC, PLP) are presented.
We present a predictive neural network called neural predictive coding (NPC). This model is used for nonlinear discriminant feature extraction applied to phoneme recognition. We validate the nonlinear prediction improvement of the NPC model. We also present a new extension of the NPC model: NPC-3. To evaluate the performance of the NPC-3 model, we carried out a study of DARPA-TIMIT phoneme recognition (in particular the /b/, /d/, /g/ and /p/, /t/, /q/ phonemes). Comparisons with traditional coding methods are presented. We also show how an adaptive constraint improves the recognition task.
In JPEG-LS, simple edge detection techniques are used in determining the predictive value of each pixel. These techniques only detect horizontal/vertical edges and have only been optimized for the prediction of pixels in the locality of such edges. Thus, JPEG-LS produces large prediction errors in the locality of diagonal edges. We propose a low complexity technique that accurately detects diagonal edges and efficiently predicts pixels, based on the information available within the standard predictive template of JPEG-LS. We show that the proposed technique outperforms JPEG-LS in terms of predicted mean squared error, by margins of up to 15%. (C) 2003 Elsevier B.V. All rights reserved.
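For context, the standard JPEG-LS predictor that the abstract refers to is the median edge detector (MED). The sketch below implements that baseline only, not the diagonal-edge technique proposed in the paper. The pixel values are made up for illustration.

```python
def med_predict(a: int, b: int, c: int) -> int:
    """JPEG-LS median edge detector.

    a = left neighbour, b = neighbour above, c = upper-left neighbour.
    """
    if c >= max(a, b):
        return min(a, b)      # edge detected: take the smaller neighbour
    if c <= min(a, b):
        return max(a, b)      # edge detected: take the larger neighbour
    return a + b - c          # smooth region: planar prediction


if __name__ == "__main__":
    # Vertical edge just left of the current pixel: the row above is bright
    # on this side, so the bright value is predicted.
    print(med_predict(a=10, b=200, c=10))
    # Diagonal edge: a, b and c are all dark while the current pixel is
    # bright (say 200), so the prediction of 10 misses by a wide margin.
    print(med_predict(a=10, b=10, c=10))
```

In the diagonal case the three neighbours used by MED all lie on the dark side of the edge, so the predictor has no way to anticipate the bright pixel. That is the weakness the abstract describes.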
Visual perception involves the grouping of individual elements into coherent patterns, such as object representations, that reduce the descriptive complexity of a visual scene. The computational and physiological bases of this perceptual grouping remain poorly understood. We discuss recent fMRI evidence from our laboratory, where we measured activity in a higher object processing area (LOC) and in primary visual cortex (V1) in response to visual elements that were either grouped into objects or randomly arranged. We observed significant activity increases in the LOC and concurrent reductions of activity in V1 when elements formed coherent shapes, suggesting that activity in early visual areas is reduced as a result of grouping processes performed in higher areas. In light of these results, we review related empirical findings of context-dependent changes in activity, recent neurophysiology research related to cortical feedback, and computational models that incorporate feedback operations. We suggest that feedback from high-level visual areas reduces activity in lower areas in order to simplify the description of a visual image, consistent with both predictive coding models of perception and probabilistic notions of 'explaining away.' (C) 2004 Elsevier Ltd. All rights reserved.
In this paper, we regard the sequence of returns as outputs from a parametric compound source. Utilizing the fact that the coding rate of the source shows the amount of information about the return, we describe l-learning algorithms based on the predictive coding idea for estimating an expected information gain concerning future information and give a convergence proof of the information gain. Using the information gain, we propose the ratio w of return loss to information gain as a new criterion to be used in probabilistic action-selection strategies. In experimental results, we found that our w-based strategy performs well compared with the conventional Q-based strategy.
We derive a recurrent neural network architecture of single cells in the primary visual cortex that dynamically improves a 2D Gabor wavelet based representation of an image by minimizing the corresponding reconstruction error via feedback connections. Furthermore, we demonstrate that the reconstruction error is a Lyapunov function of the proposed recurrent network. Our model of the primary visual cortex combines a modulatory feedforward strategy with a subtractive feedback correction to obtain an optimal coding. The fed-back error is used to dynamically improve the feedforward Gabor representation of the images, in the sense that the redundancy of the feedforward representation, which is due to the non-orthogonality of the Gabor wavelets, is dynamically corrected. The redundancy of the Gabor feature representation is therefore dynamically eliminated by improving the reconstruction capability of the internal representation. The dynamics thus introduce a nonlinear correction to the standard linear representation of Gabor filters and generate a more efficient predictive coding.
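The idea of correcting a redundant feedforward code by feeding back the reconstruction error can be sketched with a toy example. This is not the authors' network. Three non-orthogonal unit vectors in the plane stand in for Gabor wavelets, the update is plain gradient descent on the squared reconstruction error, and the step size and names are assumptions.

```python
import math

# Hypothetical stand-in for a non-orthogonal Gabor family: three unit
# vectors in R^2 at 0, 60 and 120 degrees (an overcomplete set).
ATOMS = [(1.0, 0.0), (0.5, math.sqrt(3) / 2), (-0.5, math.sqrt(3) / 2)]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def reconstruct(coeffs):
    """Linear reconstruction: weighted sum of the atoms."""
    return tuple(sum(c * g[d] for c, g in zip(coeffs, ATOMS)) for d in range(2))


def refine(image, step=0.5, iterations=8):
    """Start from feedforward projections, then correct them with the error."""
    coeffs = [dot(g, image) for g in ATOMS]          # feedforward pass
    energies = []
    for _ in range(iterations):
        recon = reconstruct(coeffs)
        error = tuple(x - r for x, r in zip(image, recon))
        energies.append(dot(error, error))           # squared error energy
        # Feedback: each coefficient moves along its atom's share of the error.
        coeffs = [c + step * dot(g, error) for c, g in zip(coeffs, ATOMS)]
    return coeffs, energies


if __name__ == "__main__":
    _, energies = refine((2.0, 1.0))
    print(energies)
```

With this step size the error energy shrinks at every iteration, which mirrors the role of the reconstruction error as a Lyapunov function in the abstract. The plain feedforward projections overshoot because the atoms overlap.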