An encoder-dependent video cut detection algorithm is proposed. Based on the inertia property of natural videos, the proposed algorithm detects video cuts in the video-coding loop by making use of the intermediate results of video compression. Experiments show that the proposed algorithm detects cuts well compared with previous work and that, when it is integrated with the coder, both the content accessibility of the output code stream and the compression ratio improve. The authors regard this inertia-based algorithm as a step towards the integration of video compression and content-based video retrieval.
ISBN: (print) 9789811065712; 9789811065705
Current rate control schemes in MCTF-based wavelet video coding lack efficient GOP-level bit allocation due to the shortage of GOP rate-distortion (R-D) information and the limitations of real-time encoding. Exploiting the advantage of offline video encoding, a GOP-level bit allocation scheme based on GOP R-D data estimated from the wavelet decomposed temporal-spatial sub-band R-D is described. Each video sequence is encoded twice. In the first pass, the R-D information of the GOP is generated, and this is utilised in the second pass to implement GOP-level bit allocation appropriate for the available bandwidth and such that the video sequence is coded with near-constant quality. Experimental results demonstrate that the proposed technique achieves a smoother, and hence more visually acceptable, quality than existing methods.
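The two-pass idea in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not give the R-D model, so the sketch pretends the first pass yields a few hypothetical (rate, distortion) operating points per GOP, and the second pass picks the lowest common distortion target whose total rate fits the budget, which is one simple way to aim for near-constant quality.

```python
from typing import List, Tuple

# Hypothetical first-pass output: for each GOP, a list of
# (rate_bits, distortion) points sorted by increasing rate.
RDCurve = List[Tuple[int, float]]


def min_rate_for(curve: RDCurve, target_d: float) -> int:
    """Smallest rate on the curve whose distortion does not exceed target_d.

    Falls back to the highest available rate if the target is unreachable.
    """
    for rate, dist in curve:
        if dist <= target_d:
            return rate
    return curve[-1][0]


def allocate(curves: List[RDCurve], budget_bits: int) -> List[int]:
    """Second pass: choose per-GOP rates for one common distortion target."""
    # Candidate targets are the distortions observed in the first pass.
    candidates = sorted({d for curve in curves for _, d in curve})
    for target in candidates:  # ascending distortion, so best quality first
        rates = [min_rate_for(c, target) for c in curves]
        if sum(rates) <= budget_bits:
            return rates
    # Even the cheapest points exceed the budget: return them anyway.
    return [c[0][0] for c in curves]


# Made-up R-D points for an "easy" and a "hard" GOP.
curves = [
    [(100, 40.0), (200, 20.0), (400, 10.0)],
    [(100, 80.0), (200, 40.0), (400, 20.0)],
]
print(allocate(curves, 600))  # [200, 400]: both GOPs land at distortion 20
```

In this toy case the harder GOP receives twice the bits of the easier one so that both reach the same distortion, rather than each GOP receiving an equal share.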
The paper presents a multiple description (MD) video coder based on three-dimensional (3D) transforms. The coder has low computational complexity and high robustness to transmission errors and is targeted at mobile devices. The encoder represents the video sequence as a coarse sequence approximation (shaper), which is included in both descriptions, and a residual sequence (details), which is split between the two descriptions. The shaper is obtained by block-wise pruned 3D-DCT. The residual sequence is coded by 3D-DCT or a hybrid 3D-transform. The coding scheme is simple and yet outperforms some MD coders based on motion-compensated prediction, especially in the low-redundancy region.
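The shaper-plus-details structure can be sketched in a few lines. This is a heavily simplified illustration, not the coder from the paper: it works on a 1-D signal, uses the block mean as the shaper (the result of pruning a DCT down to its DC coefficient, up to scaling), and splits raw residual samples alternately between the two descriptions instead of coding them with a transform. The names `encode` and `decode` are hypothetical.

```python
from typing import List, Optional, Tuple

# One description = (shaper shared by both, its half of the residual).
Description = Tuple[List[float], List[float]]


def encode(signal: List[float], block: int = 4) -> Tuple[Description, Description]:
    """Build two descriptions: the shaper is duplicated, the residual is split."""
    shaper = []
    for i in range(0, len(signal), block):
        chunk = signal[i:i + block]
        shaper.append(sum(chunk) / len(chunk))  # coarse approximation
    residual = [s - shaper[i // block] for i, s in enumerate(signal)]
    # Even-indexed details go to description 1, odd-indexed to description 2.
    return (shaper, residual[0::2]), (shaper, residual[1::2])


def decode(d1: Optional[Description], d2: Optional[Description],
           n: int, block: int = 4) -> List[float]:
    """Reconstruct from whichever descriptions arrived (at least one)."""
    received = d1 if d1 is not None else d2
    if received is None:
        raise ValueError("at least one description is required")
    shaper = received[0]
    out = [shaper[i // block] for i in range(n)]
    if d1 is not None:
        for k, r in enumerate(d1[1]):
            out[2 * k] += r
    if d2 is not None:
        for k, r in enumerate(d2[1]):
            out[2 * k + 1] += r
    return out


signal = [1.0, 3.0, 5.0, 7.0, 2.0, 2.0, 6.0, 6.0]
d1, d2 = encode(signal)
print(decode(d1, d2, len(signal)))    # both descriptions: exact signal
print(decode(d1, None, len(signal)))  # one lost: shaper fills the gaps
```

Because the shaper travels in both descriptions, losing one description degrades the details but never leaves a sample without at least a coarse value. The cost is the redundancy of sending the shaper twice.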
The paper illustrates a method for affine warping-based motion compensation using the computing architecture of the H.263 advanced prediction (AP) mode, namely by changing the (constant) weighting matrices in the H.263 AP algorithm. In particular, we analytically derive a realization of affine image prediction based on a regular mesh in the presence of linear resampling. The work shows, by means of analytical derivation and simulation results, an intrinsic similarity between the affine motion model and the H.263 advanced prediction technique. The analysis offers a rational basis to explain the performance of H.263 AP as a coarse approximation of affine prediction. (C) 1998 Elsevier Science B.V. All rights reserved.
Exploiting spatial redundancy in images is responsible for a large gain in the performance of image and video compression. The main tool to achieve this is called intra-frame prediction. In most state-of-the-art video coders, intra prediction is applied in a block-wise fashion. Until now, angular prediction has been dominant, providing a low-complexity method covering a large variety of content. With deep learning, however, it is possible to create prediction methods that cover a wider range of content and can predict structures which traditional modes cannot predict accurately. Using the conditional autoencoder structure, we are able to train a single artificial neural network which performs multi-mode prediction. In this paper, we derive the approach from the general formulation of the intra-prediction problem and introduce two extensions for spatial mode prediction and for chroma prediction support. Moreover, we propose a novel latent-space-based cross-component prediction. We show the power of our prediction scheme with visual examples and report average gains of 1.13% in Bjontegaard delta rate in the luma component and 1.21% in the chroma component compared to VTM using only traditional modes.
A virtual SPIHT (VSPIHT) technique that is based on the SPIHT and virtual zero-tree concepts is proposed. It places virtually generated zero trees on top of SPIHT's zero trees to reduce the effective number of zero trees. The simulation results show that the proposed video coder is more efficient than SPIHT-based video coders, particularly at very low bit rates.
MPEG-2 is an extension of the MPEG-1 international standard for digital compression of audio and video signals. MPEG-1 was designed to code progressively scanned video at bit rates up to about 1.5 Mbit/s for applications such as CD-i (compact disc interactive). MPEG-2 is directed at broadcast formats at higher data rates; it provides extra algorithmic 'tools' for efficiently coding interlaced video, supports a wide range of bit rates and provides for multichannel surround sound coding. This tutorial paper introduces the principles used for compressing video according to the MPEG-2 standard, outlines the general structure of a video coder and decoder, and describes the subsets ('profiles') of the toolkit and the sets of constraints on parameter values ('levels') defined to date.
ISBN: (print) 0780331222
Color pictures are usually compressed in a luminance-chrominance coordinate space. We consider the problem of encoding the chrominance information for very low bit rate video coding systems aimed at bit rates in the range 8 to 40 kbps. The challenge is that the chrominance components typically get less than 10 to 20% of the total very low bit rate allocated for the video data. We found that it is sufficient to encode the chrominance information at 1/8 of the luminance resolution in both the horizontal and vertical directions. While, for many of the previous coding methods, the compression is performed independently for the luminance and chrominance coordinates, we propose a coding scheme which exploits the coded luminance data in coding and retrieving the chrominance components. The proposed video coder is an improved extension of an existing luminance-only coder so that color motion video can be coded at very low bit rates under fixed frame and bit rate constraints. It is based on a hybrid waveform coding technique with an implicit model-based component. Very good results were obtained for head-and-shoulders sequences even with chroma rates of less than 7% of the total very low bit rate. In addition, subjective tests indicate that the coded chrominance information improves the visual perception of noisy image features.
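To make the 1/8-resolution figure concrete, the following sketch reduces a chrominance plane by a factor of 8 in each direction. The abstract does not say how the reduction is done, so block averaging is an assumption, as is the 176 x 144 (QCIF) plane size used in the example. The function name `downsample` is hypothetical.

```python
from typing import List

Plane = List[List[float]]


def downsample(plane: Plane, factor: int = 8) -> Plane:
    """Block-average a plane by `factor` horizontally and vertically.

    Assumes the height and width are multiples of `factor`.
    """
    height, width = len(plane), len(plane[0])
    out = []
    for by in range(0, height, factor):
        row = []
        for bx in range(0, width, factor):
            block = [plane[y][x]
                     for y in range(by, by + factor)
                     for x in range(bx, bx + factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# Flat chroma plane on an assumed 176 x 144 luminance grid.
chroma = [[128.0] * 176 for _ in range(144)]
small = downsample(chroma)
print(len(small[0]), len(small))  # 22 18
```

Reducing both directions by 8 leaves 22 x 18 = 396 chroma samples per component instead of 25,344, a factor of 64 fewer.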
ISBN: (print) 9781479933587
This paper presents a low-power, high-speed 3D-DWT (three-dimensional discrete wavelet transform) architecture for image compression. With the recent expansion of multimedia applications and the need for delivering compressed bit streams over heterogeneous networks, scalability has become an important feature for video coders. The 3D discrete wavelet transform provides natural spatial resolution and frame rate scalability. Scalability is achieved in temporal and spatial resolution, as well as in quality. Compared with existing schemes, the advantages of this scheme are in terms of complexity, low power, high throughput, low latency and minimum storage requirement. The proposed architecture has been successfully implemented on a Xilinx Spartan 3E series field-programmable gate array, suitable for real-time compression.
ISBN: (print) 0818681837
This paper investigates adaptive dynamic rate-controlled video transmission for robust video communication in a packet wireless environment. The video coder comprises a wavelet transform (WT), multi-resolution motion estimation (MRME) and a robust zero-tree coder. The robust zero-tree coder is a modification of the basic coder proposed by Shapiro that adopts coefficient-level partitioning, so that erroneous parts can be sliced out at the receiver and the rest used effectively. Adaptive dynamic rate control is required to adapt the video communication to the channel conditions, providing higher channel protection when the channel is severe and improving the source rate and performance when conditions are favorable. The system is evaluated under narrowband and broadband channel conditions. Both objective and subjective results are shown at 64 kbps.