The Versatile Video Coding (VVC) standard introduces a block partitioning structure known as quadtree plus nested multi-type tree (QTMTT), which allows more flexible block partitioning than its predecessors, such as High Efficiency Video Coding (HEVC). Meanwhile, the partition search (PS) process, which finds the partitioning structure that minimizes the rate-distortion cost, becomes far more complicated for VVC than for HEVC. The PS process in the VVC reference software (VTM) is also not friendly to hardware implementation. We propose a partition map prediction method for fast block partitioning in VVC intra-frame encoding. The proposed method may replace PS entirely or be combined with PS partially, thereby achieving adjustable acceleration of VTM intra-frame encoding. Unlike previous methods for fast block partitioning, we represent a QTMTT-based block partitioning structure by a partition map, which consists of a quadtree (QT) depth map, several multi-type tree (MTT) depth maps, and several MTT direction maps. We then predict the optimal partition map from the pixels through a convolutional neural network (CNN). We propose a CNN structure, known as Down-Up-CNN, for the partition map prediction; the structure emulates the recursive nature of the PS process. Moreover, we design a post-processing algorithm that adjusts the network output partition map to obtain a standard-compliant block partitioning structure. The post-processing algorithm may also produce a partial partition tree; the PS process is then performed on the partial tree to obtain the full tree. Experimental results show that the proposed method achieves 1.61x to 8.64x encoding acceleration for the VTM-10.0 intra-frame encoder, with the ratio depending on how much PS is performed. In particular, at 3.89x encoding acceleration, the compression efficiency loss is 2.77% in BD-rate, which is a better tradeoff
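To make the idea of a QT depth map concrete, the following is a minimal sketch and not the authors' implementation. It assumes a quadtree given as nested lists (a leaf is `None`, a split node is a list of four children in top-left, top-right, bottom-left, bottom-right order) and a hypothetical map granularity of 8x8 samples; the function name and the example tree are illustrative only. MTT depth and direction maps are not covered.

```python
def qt_depth_map(tree, size, unit):
    """Return a (size/unit) x (size/unit) grid holding the quadtree depth of each unit."""
    n = size // unit
    depth_map = [[0] * n for _ in range(n)]

    def fill(node, x, y, span, depth):
        if node is None:
            # Leaf: every unit covered by this block gets the current depth.
            for r in range(y, y + span):
                for c in range(x, x + span):
                    depth_map[r][c] = depth
            return
        half = span // 2
        # Child order (assumed): top-left, top-right, bottom-left, bottom-right.
        offsets = [(0, 0), (half, 0), (0, half), (half, half)]
        for child, (dx, dy) in zip(node, offsets):
            fill(child, x + dx, y + dy, half, depth + 1)

    fill(tree, 0, 0, n, 0)
    return depth_map


# Hypothetical 32x32 block: split once, then split the top-left quadrant again.
example_tree = [[None, None, None, None], None, None, None]
example_map = qt_depth_map(example_tree, size=32, unit=8)
for row in example_map:
    print(row)
```

A map of this form is a fixed-size array, which is what makes it a natural prediction target for a CNN, in contrast to a variable-shape tree.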
ISBN: 9781728193205 (print)
With the growth of video technologies, super-resolution video, including 360-degree immersive video, has become a reality, driven by applications such as augmented/virtual/mixed reality that offer better interaction and a wide-angle view of a scene compared with traditional video with a narrow viewing angle. This new generation of video content is bandwidth-intensive because of its high resolution, and it demands a high bit rate and low-latency delivery, which creates transmission and storage bottlenecks. Traditional video coding schemes leave limited room for improving intra-frame coding efficiency because of the fixed size of the processing block. This paper presents a new approach for improving intra-frame coding of 360-degree video in the lossy mode of HEVC, especially for low-bit-rate transmission. Before applying traditional HEVC intra-prediction, the approach exploits the global redundancy of the entire frame by extracting common important information with a multi-level discrete wavelet transform. The paper demonstrates that encoding only the low-frequency information of a frame can outperform the HEVC standard at low bit rates. Experimental results indicate that the proposed intra-frame coding strategy achieves an average 54.07% BD-rate reduction and a 2.84 dB BD-PSNR gain in the low-bit-rate scenario compared with HEVC. It also reduces encoding time by about 66.84% on average. Moreover, the findings show that the existing HEVC block partitioning can be applied in the transform domain to better exploit information concentration, since HEVC was applied in the wavelet frequency domain.
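The low-frequency extraction step can be illustrated with a short sketch. The abstract does not name the wavelet, so this assumes a Haar wavelet and uses plain 2x2 averaging, which equals the Haar approximation (LL) subband up to a constant scale factor; the function names are hypothetical and frame dimensions are assumed divisible by two at every level.

```python
def haar_lowpass(frame):
    """One level of 2D Haar approximation: average each non-overlapping 2x2 block."""
    height, width = len(frame), len(frame[0])
    return [
        [
            (frame[r][c] + frame[r][c + 1] + frame[r + 1][c] + frame[r + 1][c + 1]) / 4.0
            for c in range(0, width - 1, 2)
        ]
        for r in range(0, height - 1, 2)
    ]


def multilevel_lowpass(frame, levels):
    """Apply the low-pass step repeatedly; each level halves both dimensions."""
    for _ in range(levels):
        frame = haar_lowpass(frame)
    return frame


# Toy 4x4 "frame"; two levels reduce it to a single low-frequency value.
toy_frame = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
level1 = multilevel_lowpass(toy_frame, 1)
level2 = multilevel_lowpass(toy_frame, 2)
print(level1, level2)
```

In the scheme described above, it is this reduced low-frequency image, rather than the full-resolution frame, that would be passed to the HEVC intra encoder.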
ISBN: 9781467399616 (print)
Intra-frame prediction in the High Efficiency Video Coding (HEVC) standard can be empirically improved by applying sets of recursive two-dimensional filters to the predicted values. However, this approach prevents, or significantly complicates, the parallel computation of pixel predictions. In this work we analyze why the recursive filters are effective and use the results to derive sets of non-recursive predictors that have superior performance. We present an extension to HEVC intra prediction that combines values predicted from non-filtered and filtered (smoothed) reference samples, depending on the prediction mode and block size. Simulations using the HEVC common test conditions show that an average bit rate reduction of 2.0% can be achieved compared to HEVC for All Intra (AI) configurations.
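The combination of filtered and unfiltered references can be sketched for the simplest case, a vertical prediction that copies the row of reference samples above the block. This is an illustration under assumptions, not the predictor from the paper: the [1, 2, 1] / 4 smoothing filter and the single scalar weight `weight_unfiltered` are stand-ins, whereas the abstract says the actual combination depends on prediction mode and block size.

```python
def smooth_121(ref):
    """Apply a [1, 2, 1] / 4 smoothing filter with rounding; end samples are unchanged."""
    out = list(ref)
    for i in range(1, len(ref) - 1):
        out[i] = (ref[i - 1] + 2 * ref[i] + ref[i + 1] + 2) // 4
    return out


def combined_vertical_prediction(top_ref, block_size, weight_unfiltered):
    """Blend vertical predictions made from unfiltered and smoothed references.

    weight_unfiltered is a hypothetical constant in [0, 1].
    """
    filtered = smooth_121(top_ref)
    row = [
        weight_unfiltered * top_ref[x] + (1 - weight_unfiltered) * filtered[x]
        for x in range(block_size)
    ]
    # Vertical mode: every row of the block repeats the blended reference row.
    return [list(row) for _ in range(block_size)]


top_reference = [100, 100, 140, 100]
prediction = combined_vertical_prediction(top_reference, block_size=4, weight_unfiltered=0.5)
print(prediction[0])
```

Because each predicted sample depends only on reference samples and not on previously predicted samples, all positions can be computed in parallel, which is the property the non-recursive predictors are meant to preserve.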
As a cutting-edge field of global research, 3D video technology faces the dual challenges of large data volumes and high processing complexity. Although the most recent video coding standard, VVC, surpasses HEVC in coding efficiency, dedicated research on 3D video coding remains relatively scarce. Building on existing research, this study aims to develop a VVC-based 3D video coding algorithm that lowers the complexity of the encoding procedure. We focus specifically on the depth maps in 3D video content and introduce an extreme forest model from machine learning to optimize intra-frame coding. This paper proposes a novel CU partitioning strategy implemented through a two-stage extreme forest model. First, the initial model predicts the CU partitioning type: no partition, quadtree partitioning, multi-type tree horizontal partitioning, or multi-type tree vertical partitioning. For the latter two cases, a second model further refines the partitioning into binary or ternary trees. Through this two-stage prediction mechanism, we bypass CU partitioning types with low probability, significantly reducing the coding complexity. The experimental results demonstrate that the proposed algorithm saves 47.46% in encoding time while maintaining coding quality, with only a 0.26% increase in Bjontegaard delta bitrate. This achievement provides an effective low-complexity solution for the 3D video coding field.
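The control flow of the two-stage decision can be sketched as follows. The two models are passed in as plain callables so that any classifier could be substituted; the label strings, the feature dictionary, and the toy stand-in models are all hypothetical and are not taken from the paper.

```python
def decide_partition(features, stage1_model, stage2_model):
    """Two-stage CU partition decision.

    stage1_model returns one of "none", "qt", "mtt_h", "mtt_v".
    stage2_model is consulted only for MTT splits and returns "binary" or "ternary".
    """
    kind = stage1_model(features)
    if kind in ("mtt_h", "mtt_v"):
        return kind, stage2_model(features, kind)
    return kind, None


# Toy stand-ins for the trained models (illustrative thresholds only).
def toy_stage1(features):
    if features["variance"] < 1.0:
        return "none"
    return "mtt_h" if features["horizontal_edges"] else "qt"


def toy_stage2(features, kind):
    return "ternary" if features["variance"] > 10.0 else "binary"


flat_cu = {"variance": 0.2, "horizontal_edges": False}
edge_cu = {"variance": 25.0, "horizontal_edges": True}
flat_decision = decide_partition(flat_cu, toy_stage1, toy_stage2)
edge_decision = decide_partition(edge_cu, toy_stage1, toy_stage2)
print(flat_decision, edge_decision)
```

The encoder would then evaluate only the predicted partition type instead of running the rate-distortion search over every candidate, which is where the time saving comes from.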
ISBN: 9783030366254; 9783030366247 (print)
Modern image and video compression technologies include both lossless compression methods, such as entropy coding, inter-frame and intra-frame coding, and lossy compression methods, such as discrete orthogonal transforms with quantization. All these techniques are actively applied in video codecs based on the H.264 and H.265 standards. The discrete wavelet transform (DWT) is one of the most promising discrete orthogonal transforms. Both intra-frame coding and wavelet decomposition of images reduce the volume of transmitted data, but they are not used together in video coding systems. Combining these methods seems to be a promising approach to visual data compression. The aim of this research is therefore to develop a new technique based on the combination of intra-frame coding and the DWT and to test the applicability of the proposed method to image compression tasks. The study evaluates the effectiveness of various implementations of the proposed algorithm, including context-based variants and variants using several levels of wavelet decomposition.
This article presents a performance analysis of Versatile Video Coding (VVC) intra-frame prediction. VVC is the next generation of video coding standards, developed to meet the demands of upcoming video applications. VVC brings several innovations and enhancements to intra-frame prediction to improve encoding efficiency. These improvements include larger block sizes, more flexible block partitioning, more angular intra-frame prediction modes, multiple transform selection, and a non-separable secondary transform, among others. This article provides a detailed description of these tools, discussing how they work together in the intra-frame coding flow to raise compression performance. Moreover, this article presents encoding complexity, encoding usage distribution, and rate-distortion-complexity analyses of the intra-frame prediction tools over different quantization scenarios. Based on these analyses, this article provides support for future work on VVC intra-frame coding, including complexity reduction, complexity control, and real-time hardware design.
To compress screen image sequences in real-time remote and interactive applications, a novel compression method is *** proposed method is named as *** employs hybrid coding schemes that consist of intra-frame and inter-frame coding *** intra-frame coding is a rate-distortion optimized adaptive block size that can also be used for the compression of a single screen *** inter-frame coding utilizes a hierarchical group of pictures (GOP) structure to improve system performance during random accesses and fast-backward *** results demonstrate that the proposed CABHG method has an approximately 47%-48% higher compression ratio and 46%-53% lower CPU utilization than professional screen image sequence codecs such as TechSmith Ensharpen codec and Sorenson 3 *** with general video codecs such as the H.264 codec, XviD MPEG-4 codec and Apple's Animation codec, CABHG also shows an 87%-88% higher compression ratio and 64%-81% lower CPU utilization than these general video codecs.
Classical video prediction methods exploit the intra-frame, inter-frame and multi-view similarities within video sequences directly and shallowly; the proposed video prediction methods instead transform the frame correlations indirectly and intensively into nonlinear mappings by using a general deep neural network (DNN) with a single output node. Traditional DNN-based video prediction algorithms forecast the next frame as a whole and coarsely, whereas the proposed algorithms predict each pixel of the future frame separately and precisely in order to achieve high prediction accuracy and low computation cost. First, general DNN-based prediction algorithms for intra-frame coding, inter-frame coding and multi-view coding are presented. Then, a general DNN-based prediction algorithm for unified video coding is proposed, which relies on the preceding three prediction algorithms. Simulation experiments show that the proposed methods outperform state-of-the-art High Efficiency Video Coding (HEVC) in peak signal-to-noise ratio (PSNR) and bits per pixel (BPP) under low-bitrate transmission. Experimental results also verify that the proposed general DNN architecture has higher prediction accuracy and lower computation load than conventional DNN architectures, and that the proposed methods are well suited to multi-view videos with small correlations and large disparities. (C) 2017 Elsevier B.V. All rights reserved.
In this paper, we propose an enhanced rate-distortion cost function, which combines the sum of absolute integer-transformed differences (SAITD) and a rate predictor for H.264/AVC intra 4x4 mode decision. To reduce the computation of the SAITD, we further develop a fast computation algorithm, which exploits the linearity of the transform and the fixed spatial relationship of predicted pixels in the intra modes. Simulation results show that the enhanced cost function achieves better coding performance with less computation than the cost function suggested in the H.264 reference software JM 6.1d.
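A direct (non-fast) computation of the SAITD for one 4x4 block can be sketched as below. It applies the H.264 4x4 forward integer core transform to the prediction residual and sums the absolute coefficients. The additive form of the cost, SAITD plus a Lagrange multiplier times the predicted rate, is an assumption about how the two terms are combined; the abstract does not give the formula, and `predicted_rate` and `lam` are hypothetical inputs.

```python
# H.264 4x4 forward integer core transform matrix.
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Multiply two 4x4 integer matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def saitd(original, predicted):
    """Sum of absolute integer-transformed differences for one 4x4 block."""
    diff = [[original[r][c] - predicted[r][c] for c in range(4)] for r in range(4)]
    coeffs = matmul(matmul(CF, diff), transpose(CF))
    return sum(abs(v) for row in coeffs for v in row)


def rd_cost(original, predicted, predicted_rate, lam):
    """Assumed additive cost: distortion term plus lam times a predicted rate."""
    return saitd(original, predicted) + lam * predicted_rate


flat_block = [[10] * 4 for _ in range(4)]
offset_block = [[11] * 4 for _ in range(4)]
print(saitd(offset_block, flat_block), rd_cost(offset_block, flat_block, 3, 2))
```

Because the transform is linear, the transform of a residual equals the transform of the original minus the transform of the prediction, which is the property the fast algorithm mentioned above relies on.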
This paper presents a 1080p 60 Hz intra-frame CODEC system for zero-delay AV streaming. For high-quality streaming, the proposed CODEC employs an RGB-domain inter-color compensation algorithm, previously presented by the authors of this paper, that uses the strong correlation between RGB color components. The proposed CODEC architecture is based on macroblock-level pipelining and parallel processing to handle the significant pixel rate of 1080p 60 Hz video in RGB 4:4:4 format, i.e., 3 Gbps. Since syntax processing is a bottleneck to supporting speeds of up to 100 Mbps in real time, the proposed design includes a high-performance context-adaptive variable length coding architecture that exploits the look-ahead technique. Also, the number of syntax symbols is adaptively restricted to accomplish zero end-to-end delay. Finally, by using MPEG-2 TS as the AV stream format, compatibility with general channel chips is guaranteed. Using TSMC 90 nm CMOS technology, the prototype chip is implemented with 1,208 K logic gates and 359 Kb of internal SRAM. The chip can achieve real-time encoding and decoding of 1080p 60 Hz video at 200 MHz execution speed under typical operating conditions.
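The 3 Gbps figure can be checked with a short calculation. This sketch assumes 8 bits per color component, which the abstract does not state explicitly, and counts active pixels only (no blanking intervals).

```python
def raw_pixel_rate_bps(width, height, frames_per_second, bits_per_component, components=3):
    """Uncompressed bit rate of a video stream with full-resolution color components."""
    return width * height * frames_per_second * bits_per_component * components


# 1080p at 60 Hz, RGB 4:4:4, assumed 8 bits per component.
rate_bps = raw_pixel_rate_bps(1920, 1080, 60, 8)
print(rate_bps, round(rate_bps / 1e9, 2))
```

The result is 2,985,984,000 bit/s, or roughly 3 Gbps, so a 100 Mbps output stream corresponds to a compression ratio of about 30:1 under these assumptions.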