This work presents a 3-D wavelet video coding algorithm. By analyzing the contribution of each biorthogonal wavelet basis to the reconstructed signal's energy, we weight each wavelet subband according to its basis energy. Based on the distribution of the weighted coefficients, we further discuss a 3-D wavelet tree structure named the balanced significance probability tree, which places coefficients with similar probabilities of being significant on the same layer. It is implemented using a hybrid of a spatial orientation tree and a temporal-domain block tree. Subsequently, a novel 3-D wavelet video coding algorithm is proposed based on the energy-weighted balanced significance probability tree. Experimental results illustrate that our algorithm consistently achieves good reconstruction quality for different classes of video sequences. Compared with the asymmetric 3-D orientation tree, the average peak signal-to-noise ratio (PSNR) gains of our algorithm are 1.24 dB, 2.54 dB, and 2.57 dB for the luminance (Y) and chrominance (U, V) components, respectively. Compared with the temporal-spatial orientation tree algorithm, our algorithm achieves 0.38 dB, 2.92 dB, and 2.39 dB higher PSNR for the Y, U, and V components, respectively. In addition, the proposed algorithm requires lower computational cost than either of these two algorithms.
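The PSNR figures quoted in this abstract follow the usual definition, which can be sketched as below. This is a minimal illustration that assumes 8-bit samples (peak value 255) and flat lists of sample values; the function name `psnr` is hypothetical and is not taken from the paper.

```python
import math


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists.

    Assumes 8-bit samples by default (peak = 255); this is an assumption,
    not something stated in the abstract.
    """
    if not original or len(original) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak * peak / mse)


if __name__ == "__main__":
    y_orig = [16, 32, 64, 128]
    y_reco = [17, 30, 65, 127]
    print(f"Y PSNR: {psnr(y_orig, y_reco):.2f} dB")
```

A PSNR gain between two codecs is then the difference of their PSNR values on the same component (Y, U, or V) at a comparable bit rate.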
Previous research has shown that downsampling prior to encoding and upsampling after decoding can improve the rate-distortion (R-D) performance compared with directly coding the original video using standard technologies, e.g., JPEG and H.264/AVC, especially at low bit rates. This paper proposes a practical algorithm to find the optimal downsampling ratio that balances the distortions caused by downsampling and coding, thus achieving the overall optimal R-D performance. Given the optimal sampling ratio, dedicated filters for down- and upsampling are also designed. Simulations show that this algorithm improves the R-D performance over a wide range of bit rates.
This paper describes a prototype video coding platform meant for the conception and testing of multimedia products such as next-generation videophones. The platform is largely based on ITU-T Recommendation H.263, with a number of additional object-oriented quality enhancement features which make it especially well suited for very low bit-rate coding of "head-and-shoulders" video material typical of real-time multimedia applications, video teleconferencing, and video telephony. These features consist of: 1) segmentation into objects of interest, 2) segmentation-based prefiltering, 3) model-assisted rate control, 4) adaptive vector quantization, and finally 5) segmentation-based postfiltering. In the spirit of Recommendation H.263, these enhancements are modular and can be selectively turned on or off, thereby enabling a wide variety of coding modes.
Motion compensated prediction is one of the essential methods for reducing temporal redundancy in inter coding. Its goal is to predict the current frame from the list of reference frames. Recent video coding standards commonly use interpolation filters to obtain sub-pixel samples when the best matching block lies at a fractional position in the reference frame. However, fixed filters cannot adapt to the variety of natural video content. Inspired by the success of Convolutional Neural Networks (CNNs) in super-resolution, we propose CNN-based fractional interpolation for the Luminance (Luma) and Chrominance (Chroma) components in motion compensated prediction to improve coding efficiency. Moreover, two syntax elements, which indicate the interpolation methods for the Luminance and Chrominance components, have been added to the bin string and are encoded by CABAC in regular mode. As a result, our proposal achieves BD-rate reductions of 2.9%, 0.3%, and 0.6% for the Y, U, and V components, respectively, under the low delay P configuration.
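For context, the kind of fixed interpolation filter that this abstract contrasts with its CNN approach can be sketched as follows. The tap values are those of the HEVC half-sample luma filter, used here only as a familiar example; the abstract does not name its baseline filter, and the function name `half_pel`, the 1-D row layout, the edge handling, and the 8-bit clipping are all assumptions made for illustration.

```python
# 8-tap half-sample luma filter taps as used in HEVC (they sum to 64).
# Chosen as a well-known example of a fixed filter; the abstract itself
# does not say which filter it compares against.
HALF_PEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)


def half_pel(row, i):
    """Interpolate the half-sample position between row[i] and row[i + 1].

    `row` is a list of 8-bit integer samples. Positions outside the row are
    handled by repeating the edge sample (an assumption for this sketch).
    """
    last = len(row) - 1
    acc = 0
    for k, tap in enumerate(HALF_PEL_TAPS):
        pos = min(max(i - 3 + k, 0), last)  # clamp to the valid range
        acc += tap * row[pos]
    # Round, divide by 64, and clip to the 8-bit range.
    return min(max((acc + 32) >> 6, 0), 255)


if __name__ == "__main__":
    ramp = [0, 10, 20, 30, 40, 50, 60, 70]
    print(half_pel(ramp, 3))  # value halfway between samples 30 and 40
```

Because the taps are the same for every block, such a filter cannot adapt to local content, which is the limitation the abstract sets out to address. Real codecs also keep higher intermediate precision than this one-stage sketch.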
In this paper, the parallelization of the H.261 video coding algorithm on the IBM SP2® multiprocessor system is described. The effects of parallelizing computations and communications in the spatial, temporal, and combined spatial-temporal domains are considered through the study of frame rate, speedup, and implementation efficiency, which are modeled and measured with respect to the number of nodes (n) and the parallel methods used. Four parallel algorithms were developed, of which the first two exploit the spatial parallelism in each frame, and the last two exploit both the temporal and spatial parallelism over a sequence of frames. The two spatial algorithms differ in that one utilizes a single communication master, while the other attempts to distribute communications across three masters. The spatial-temporal algorithms, on the other hand, use a pipeline structure to exploit the temporal parallelism together with either a single master or multiple masters. The best median speedup (frame rate) achieved was close to 15 [15 frames per second (fps)] for 552 x 240 video on 24 nodes, and 13 (37 fps) for QCIF video, by the spatial algorithm with distributed communications. For n < 10, the single-master spatial algorithm performs better, with efficiency up to 90%, while the multiple-master spatial algorithm is superior for n > 10, with efficiency up to 70%. The spatial-temporal algorithms achieved average speedup performance, but are the most scalable for large n.
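The speedup and efficiency figures in this abstract can be related through the textbook definitions sketched below. The abstract does not spell out its own formulas, so these definitions and the function names are assumptions.

```python
def speedup(serial_time, parallel_time):
    """Speedup = serial run time divided by parallel run time."""
    return serial_time / parallel_time


def efficiency(speedup_value, n_nodes):
    """Parallel efficiency = speedup divided by the number of nodes."""
    return speedup_value / n_nodes


if __name__ == "__main__":
    # A speedup close to 15 on 24 nodes, as quoted in the abstract,
    # corresponds to an efficiency of roughly 62% under these definitions.
    print(efficiency(15.0, 24))  # 0.625
```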
Humans cannot perceive pixel variations below a certain level. To characterize this limit, the concept of just-noticeable difference (JND) was proposed. JND measures the minimum amount by which a stimulus must change for the variation to be detectable by humans. However, JND characteristics were not considered in traditional perceptual measurements. In this paper, we provide a comprehensive survey of the latest JND-related studies. First, we provide an extensive overview of JND models, which comprise human visual system characteristics and masking effects. Next, we introduce the applications of JND models in perceptual quality evaluation and video compression coding, especially the application of machine-learning techniques to JND prediction. In addition to a thorough summary of the recent progress and applications of JND, we summarize some unsolved technical challenges. We believe that our overview and findings can provide some insights into the related issues and future research directions in video coding.
A new region-based video coding technique, which combines region segmentation and geometric motion estimation, is proposed in this paper. A region segmentation algorithm based on both histogram concavities and probabilistic relaxation is applied to extract the significant regions. The merging of regions in the segmentation depends on the attributes of the regions as well as on the motion vectors. A new approach based on the geometric features of objects and the scalable translation invariant rotation-to-shifting (STIRS) signatures is applied to the motion estimation of the global regions. Vector quantization (VQ) techniques are employed to solve the problems caused by nonrigid object motion. With these techniques, the critical regions can be extracted with good performance and relatively low complexity. Furthermore, an uncovered/overlapped motion compensation method is presented. The integrated algorithm is compared to Yokoyama's algorithm (Yokoyama et al., 1995, IEEE Trans. Circuits Syst. Video Technol. 5, 500-507). Our simulations show improved performance in terms of both SNR and computational requirements. For example, we achieve about a 15% reduction in computational time and about a 0.8 dB improvement in SNR for the "Miss America" sequence. The frame-to-frame SNR variation is also much smaller, which implies that the visual quality across frames is very stable. © 2000 Academic Press.
The number of regions and the length of contours are two basic constraints in segmentation-based motion-compensated video coding. This paper presents a coding scheme which focuses on region number reduction, contour coding, and displaced frame difference (DFD) compression. One of the important features of the proposed scheme is a spatiotemporal simplification algorithm based on morphological filters, with which an image can be segmented into a small number of regions. Another important feature of the scheme is a segmentation map sampling technique which reduces contour length by about 50% with a very small reconstruction error. Experimental results show that, using the proposed scheme, a high compression ratio can be achieved with a small coding error for video sequences such as Miss America and Foreman.
The H.264 video coding standard exhibits higher performance than other existing standards such as H.263 and MPEG-X. This improved performance is achieved mainly through multiple-mode motion estimation and compensation. Recent research has tried to reduce the computational time using predictive motion estimation, early zero-motion-vector detection, fast motion estimation, and fast mode decision. These approaches reduce the computational time substantially, at the expense of degrading image quality and/or increasing bitrates to a certain extent. In this paper, we use phase correlation to capture the motion information between the current and reference blocks and then devise an algorithm for direct motion estimation mode prediction, without excessive motion estimation. More computational time is saved by the direct mode decision and by exploiting the motion vector information available from phase correlation. The experimental results show that the proposed scheme outperforms the existing relevant fast algorithms in terms of both operating efficiency and video coding quality. More specifically, 82% to 92% of encoding time is saved compared to exhaustive mode selection (against 58% to 74% for the relevant state of the art), and this is achieved without jeopardizing image quality (in fact, there is some improvement over exhaustive mode selection at mid to high bit rates) and for a wide range of videos and bitrates (another advantage over the relevant state of the art).
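Phase correlation itself, as referred to in this abstract, can be sketched as below. This is a minimal illustration of the general technique on a small square block, using a naive DFT from the standard library so that it stays self-contained; it does not reproduce the paper's mode-prediction algorithm, and all function names are hypothetical.

```python
import cmath


def dft2(block, inverse=False):
    """Naive 2-D DFT of a square list of lists (adequate for small blocks)."""
    n = len(block)
    sign = 1 if inverse else -1
    out = [[0j] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            acc = 0j
            for y in range(n):
                for x in range(n):
                    angle = sign * 2.0 * cmath.pi * (u * y + v * x) / n
                    acc += block[y][x] * cmath.exp(1j * angle)
            out[u][v] = acc / (n * n) if inverse else acc
    return out


def circular_shift(block, dy, dx):
    """Return a copy of `block` shifted circularly by (dy, dx)."""
    n = len(block)
    out = [[0.0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            out[(y + dy) % n][(x + dx) % n] = block[y][x]
    return out


def phase_correlation_shift(ref, cur):
    """Estimate the (dy, dx) displacement of `cur` relative to `ref`."""
    n = len(ref)
    f_ref, f_cur = dft2(ref), dft2(cur)
    cross = [[0j] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            c = f_cur[u][v] * f_ref[u][v].conjugate()
            mag = abs(c)
            # Keep only the phase of the cross-power spectrum.
            cross[u][v] = c / mag if mag > 1e-12 else 0j
    surface = dft2(cross, inverse=True)
    # The peak of the correlation surface marks the displacement.
    _, dy, dx = max((surface[y][x].real, y, x) for y in range(n) for x in range(n))
    # Map indices above n/2 to negative displacements.
    if dy > n // 2:
        dy -= n
    if dx > n // 2:
        dx -= n
    return dy, dx


if __name__ == "__main__":
    ref = [[0.0] * 8 for _ in range(8)]
    ref[1][2], ref[4][5] = 1.0, 0.5
    cur = circular_shift(ref, 2, -1)
    print(phase_correlation_shift(ref, cur))  # (2, -1)
```

A practical encoder would use an FFT and windowed blocks instead of the naive transform shown here. The height and sharpness of the correlation peak give a cheap indication of how well a single translation explains the block, which is the kind of motion information the abstract proposes to exploit for mode prediction.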
The H.264/AVC video coding standard aims to enable significantly improved compression performance compared to all existing video coding standards. To achieve this, a robust rate-distortion optimization (RDO) technique is employed to select the best coding mode and reference frame for each macroblock. As a result, the complexity and computational load increase drastically. This paper presents a fast mode decision algorithm for H.264/AVC intraprediction based on local edge information. Prior to intraprediction, an edge map is created and a local edge direction histogram is then established for each subblock. Based on the distribution of the edge direction histogram, only a small subset of the intraprediction modes is chosen for RDO calculation. Experimental results show that the fast intraprediction mode decision scheme increases the speed of intracoding significantly with negligible loss of peak signal-to-noise ratio.
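The idea of an edge direction histogram driving a reduced mode search can be sketched as follows. This is only an illustration under stated assumptions: it uses a Sobel operator, four direction bins centered at 0, 45, 90, and 135 degrees, and a hypothetical `dominant_directions` helper. The abstract does not specify the operator, the bin layout, or the mapping from bins to H.264 intraprediction modes, so none of that is reproduced here.

```python
import math


def edge_direction_histogram(block, bins=4):
    """Accumulate Sobel edge strength into direction bins over [0, 180) degrees.

    `block` is a list of rows of pixel values. Bin i is centered at
    i * (180 / bins) degrees; 0 means a horizontal edge, 90 a vertical one.
    """
    height, width = len(block), len(block[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Sobel gradients (horizontal and vertical intensity change).
            gx = (block[y - 1][x + 1] + 2 * block[y][x + 1] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - 2 * block[y][x - 1] - block[y + 1][x - 1])
            gy = (block[y + 1][x - 1] + 2 * block[y + 1][x] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - 2 * block[y - 1][x] - block[y - 1][x + 1])
            strength = abs(gx) + abs(gy)
            if strength == 0:
                continue
            # The edge runs perpendicular to the gradient; fold into [0, 180).
            angle = (math.degrees(math.atan2(gy, gx)) + 90.0) % 180.0
            index = int(((angle + bin_width / 2.0) % 180.0) // bin_width)
            hist[index] += strength
    return hist


def dominant_directions(hist, keep=2):
    """Return the indices of the `keep` strongest direction bins."""
    return sorted(range(len(hist)), key=lambda i: -hist[i])[:keep]


if __name__ == "__main__":
    vertical_edge = [[0, 0, 255, 255] for _ in range(4)]
    hist = edge_direction_histogram(vertical_edge)
    print(hist, dominant_directions(hist, keep=1))
```

In a fast mode decision of the kind the abstract describes, only the prediction modes associated with the strongest bins would be passed on to the RDO calculation, instead of evaluating every mode.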