ISBN (print): 9781424412730
Wyner-Ziv coding, also known as distributed video coding, is currently a very active research topic in video coding because of the new opportunities it opens. This paper applies distributed video coding principles to stereo video coding and proposes a practical solution for Wyner-Ziv stereo coding based on mask-based fusion of temporal and spatial side information. The architecture includes a low-complexity encoder and avoids any communication between the cameras/encoders. Although the rate-distortion (RD) performance depends strongly on the motion-based frame interpolation (MBFI) and disparity-based frame estimation (DBFE) solutions, first results show that the proposed approach is promising, though there are still issues to address.
Speech emotion recognition (SER) is a key technology to enable more natural human-machine communication. However, SER has long suffered from a lack of public large-scale labeled datasets. To circumvent this problem, we investigate how unsupervised representation learning on unlabeled datasets can benefit SER. We show that the contrastive predictive coding (CPC) method can learn salient representations from unlabeled datasets, which improves emotion recognition performance. In our experiments, this method achieved state-of-the-art concordance correlation coefficient (CCC) performance for all emotion primitives (activation, valence, and dominance) on IEMOCAP. Additionally, on the MSP-Podcast dataset, our method obtained considerable performance improvements compared to baselines.
The H.264/AVC standard employs predictive motion vector coding, using the median of three spatially neighboring motion vectors as the predictor. Although the median is effective in reducing redundancy, it does not always minimize the number of bits. To address this, a motion vector coding scheme known as MV competition has been reported, in which the selected optimal predicted motion vector (PMV) is signaled to the decoder. Although this scheme can use the optimal PMV, the bits spent indicating the choice to the decoder increase the bit rate. In this paper, we propose a new motion vector coding scheme that allows the use of an optimal PMV without spending additional bits to inform the decoder of the choice. Simulation results show that the proposed method gains 3.22% in BDBR on average and 0.13 dB in BDPSNR compared to H.264/AVC.
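The median predictor mentioned above can be illustrated with a minimal sketch. It assumes motion vectors are integer (x, y) pairs and that all three neighbors (left, top, top-right) are available; the function names are hypothetical, and the special cases of the standard (unavailable neighbors, partition-specific rules) are deliberately left out. It does not implement the scheme proposed in the abstract.

```python
from typing import Tuple

MotionVector = Tuple[int, int]  # (x, y) displacement; units are an assumption


def median3(a: int, b: int, c: int) -> int:
    """Return the median of three integers."""
    return max(min(a, b), min(max(a, b), c))


def median_pmv(left: MotionVector, top: MotionVector,
               top_right: MotionVector) -> MotionVector:
    """Component-wise median of three neighboring motion vectors."""
    return (median3(left[0], top[0], top_right[0]),
            median3(left[1], top[1], top_right[1]))


def mv_residual(mv: MotionVector, pmv: MotionVector) -> MotionVector:
    """Difference between the actual vector and its prediction.

    Only this residual would be entropy coded; the decoder forms the
    same prediction and adds the residual back.
    """
    return (mv[0] - pmv[0], mv[1] - pmv[1])


pmv = median_pmv((4, -2), (6, 0), (5, 3))
print(pmv, mv_residual((6, 1), pmv))
```

Because the median is computed from data the decoder already has, no side information is needed to identify the predictor, which is the property that MV competition gives up and the proposed scheme aims to recover.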
Two temporally scalable video coding techniques, temporal subband coding (TSB) and predictive coding, are evaluated both theoretically and in practice to provide comparisons of compression and visual quality at differing frame rates. Results demonstrate that TSB coding has a higher coding gain at full frame rate than predictive coding if both algorithms use either no motion compensation or motion compensation (MC) that can be modeled as an invertible pre-distortion of the video sequence. However, predictive coding outperforms TSB coding at full frame rate when both schemes use block-based MC. In addition, visual results for lower-frame-rate video using TSB coding are unacceptable due to significant distortions from temporal filtering. These results are demonstrated through both theoretical evaluation of rate-distortion based coding gains and simulated coding of real video sequences.
The modified discrete cosine transform (MDCT) is widely used in current perceptual audio coding schemes. This paper presents an integer approximation of this lapped transform, called integer MDCT, which is derived from the MDCT using the lifting scheme. This reversible integer transform inherits most of the attractive properties of the MDCT, exhibiting a good spectral representation of the audio signal, critical sampling, and overlapping, which makes the integer MDCT well suited both for lossless audio coding and for combined perceptual and lossless audio coding. A scalable system based on MPEG AAC and the integer MDCT is presented, providing a lossless enhancement of the perceptual audio coding scheme.
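The abstract does not spell out the construction, so the following is only a sketch of the general lifting idea under an assumption: a plane rotation is split into three lifting steps, and each step's increment is rounded. The function names and the choice of rounding are illustrative, not taken from the paper.

```python
import math
from typing import Tuple


def _round(value: float) -> int:
    """Round half up; any deterministic rounding works for reversibility."""
    return math.floor(value + 0.5)


def int_rotate(x: int, y: int, theta: float) -> Tuple[int, int]:
    """Integer approximation of a rotation by theta via three lifting steps.

    Assumes sin(theta) != 0. Without rounding, the three steps multiply
    out to [[cos, -sin], [sin, cos]].
    """
    p = (math.cos(theta) - 1.0) / math.sin(theta)
    s = math.sin(theta)
    x = x + _round(p * y)
    y = y + _round(s * x)
    x = x + _round(p * y)
    return x, y


def int_rotate_inverse(x: int, y: int, theta: float) -> Tuple[int, int]:
    """Exact inverse: undo the lifting steps in reverse order."""
    p = (math.cos(theta) - 1.0) / math.sin(theta)
    s = math.sin(theta)
    x = x - _round(p * y)
    y = y - _round(s * x)
    x = x - _round(p * y)
    return x, y


forward = int_rotate(100, -37, math.pi / 4)
print(forward, int_rotate_inverse(*forward, math.pi / 4))
```

Each lifting step changes one value by a rounded function of the other, so the inverse can subtract exactly the same rounded quantity. That is what makes the transform reversible on integers even though the rotation itself is only approximated.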
The compression efficiency of distributed video coding (DVC) suffers from the necessity of transmitting a large number of intra-coded key-frames. This paper describes a new 3D model-based DVC approach which reduces the key-frame frequency. The decoder first recovers a 3D model from the key-frames. It then predicts the intermediate frames by projecting the model onto 2D image planes and applying image-based rendering techniques. This paper also introduces a new quasi-DVC method relying on limited point tracking at the encoder. It greatly improves the prediction PSNR while only slightly increasing the encoder complexity. It also allows the encoder to adaptively select the key-frames based on the motion content of the video.
Steganography is the art and science of hiding secret data to provide safe communication between two parties, and it is a prominent branch of information hiding research. This paper presents a new steganographic method based on predictive coding that embeds the secret message in quantized error values via quantization index modulation (QIM). The proposed method is superior to previous methods in that it strikes a satisfactory balance among the criteria of greatest concern in steganography: imperceptibility, hiding capacity, compression ratio, and robustness against attacks. The performance of the proposed method is evaluated by several experiments on gray-level images with different textural properties. The new method is also compared with two renowned steganographic methods, namely Jsteg and steganography based on predictive coding (SBPC). The experimental results show that the proposed method has high visual quality and less histogram distortion, with a satisfactory compression ratio and embedding size.
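As a rough illustration of embedding bits in quantized prediction errors with QIM, here is a minimal sketch. It assumes a one-dimensional previous-sample predictor, one bit per sample, and two interleaved quantizers selected by the message bit. The paper's actual predictor, step size, and embedding rule are not given in the abstract, so every name and parameter here is hypothetical, and clipping to the valid pixel range is ignored.

```python
import math
from typing import List, Tuple


def qim_embed(error: float, bit: int, step: int) -> int:
    """Quantize `error` to the nearest index whose parity equals `bit`.

    The quantizer for bit b has reconstruction points (2k + b) * step.
    """
    k = math.floor((error / step - bit) / 2 + 0.5)
    return 2 * k + bit


def embed_row(pixels: List[int], bits: List[int], step: int) -> List[int]:
    """Predictively code one row, hiding one bit in each quantized error."""
    indices = []
    prediction = 0  # assumed initial prediction
    for pixel, bit in zip(pixels, bits):
        index = qim_embed(pixel - prediction, bit, step)
        indices.append(index)
        # Track the decoder-side reconstruction so errors do not accumulate.
        prediction += index * step
    return indices


def extract_row(indices: List[int], step: int) -> Tuple[List[int], List[int]]:
    """Recover the hidden bits and the reconstructed pixels."""
    bits, pixels, prediction = [], [], 0
    for index in indices:
        bits.append(index % 2)  # parity carries the message bit
        prediction += index * step
        pixels.append(prediction)
    return bits, pixels


row = [100, 102, 105, 103]
message = [1, 0, 1, 1]
print(extract_row(embed_row(row, message, step=2), step=2))
```

In this sketch the reconstruction error per sample is at most one quantization step, so the step size trades imperceptibility against compression, which is the kind of balance the abstract refers to.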
ISBN (print): 0780332598
This paper presents an object-based, layer-structured, very low bit rate video coding system. The system differs from conventional object-based video coding algorithms in three respects. First, it uses the elastic stretching technique to perform object-based predictive coding. Second, to reduce the coding cost of the contours, it applies contour matching and contour displacement to predict the contour in the next frame. Third, it uses arbitrary shape transform (AST) coding to reduce the bit rate of the prediction error in arbitrarily shaped regions.
Summary form only given. We propose a lossless delta compression algorithm (a variant of predictive coding) that attempts to predict the next point from previous points using higher-order polynomial extrapolation. In contrast to traditional predictive coding, our method takes into account varying (non-equidistant) steps in the domain (typically, time). To save space and guarantee lossless compression, the actual and predicted values are converted to 64-bit integers. The residual (the difference between the actual and predicted values) is computed as a difference of integers. The unnecessary bits of the residual are truncated, e.g., 1111110101 is replaced by 10101. The length of the bit sequence ($5_{10} = (000101)_2$) is prepended.
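The residual truncation step can be sketched as follows. The sketch assumes that the truncation keeps the shortest two's complement representation (which reproduces the 1111110101 to 10101 example) and that the length is stored in a 6-bit header, as the example suggests. The function names are hypothetical, and the abstract does not say how residuals longer than the header can describe are handled, so this sketch simply rejects them.

```python
def min_twos_complement_bits(value: int) -> int:
    """Smallest n such that -2**(n-1) <= value < 2**(n-1)."""
    if value >= 0:
        return value.bit_length() + 1
    return (~value).bit_length() + 1


def encode_residual(residual: int, header_bits: int = 6) -> str:
    """Return the length header followed by the truncated residual bits."""
    n = min_twos_complement_bits(residual)
    if n >= 1 << header_bits:
        # Assumption: out-of-range lengths are an error in this sketch.
        raise ValueError("residual too long for the length header")
    payload = format(residual & ((1 << n) - 1), f"0{n}b")
    return format(n, f"0{header_bits}b") + payload


def decode_residual(bits: str, header_bits: int = 6) -> int:
    """Inverse of encode_residual for a single encoded residual."""
    n = int(bits[:header_bits], 2)
    value = int(bits[header_bits:header_bits + n], 2)
    if value >= 1 << (n - 1):  # sign bit set: undo two's complement
        value -= 1 << n
    return value


encoded = encode_residual(-11)
print(encoded, decode_residual(encoded))  # 00010110101 -11
```

Here -11 is the value of 1111110101 read as a 10-bit two's complement number; its shortest representation is the 5-bit string 10101, preceded by the header 000101. Small residuals from a good predictor therefore cost only a few bits beyond the fixed header.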
ISBN (print): 9781424412730
In this paper we investigate the impact of transmission errors in H.264. Transmission errors propagate into subsequent frames due to motion prediction and result in degraded video quality. Our simulations show that H.264 exhibits non-fading behaviour. We propose a method that introduces a fading characteristic and can eliminate the error propagation after a few frames. We provide a detailed analysis of our results based on a comparison with MPEG-4 and the residual energy per frame.