In the recent Future Video Coding (FVC) standard developed by the Joint Video Exploration Team (JVET), the quad-tree plus binary-tree (QTBT) block partition module uses rectangular block shapes and additional square block sizes compared with the quad-tree (QT) block partitioning module of its predecessor, the High-Efficiency Video Coding (HEVC) standard. The block flexibility introduced by the QTBT module significantly improves compression performance, but it dramatically increases coding complexity because of the brute-force search required for Rate Distortion Optimization (RDO). To cope with this issue, it is necessary to consider the unique characteristics of QTBT in FVC. In this paper, we propose a fast QT partitioning algorithm based on a deep convolutional neural network (CNN) model that predicts the coding unit (CU) partition instead of running RDO, which considerably improves QTBT performance for intra-mode coding. Based on a suitably diversified database of CU partition patterns, the optimization process is set up with a three-level CNN structure developed to learn the split or non-split decision from the established database. Experimental results show that the proposed algorithm accelerates the QTBT block partition structure, reducing intra-mode encoding time by an average of 35% with a bit rate increase of 1.7%, which allows its application in practical scenarios.
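The idea of replacing the RDO search with a learned split or non-split decision can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and is not taken from the paper: `partition_qt`, the `predict_split` callback (a stand-in for the trained CNN), the 64x64 starting block, and the minimum size of 8. The sketch covers only the quad-tree recursion, not binary-tree splits.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int]  # (x, y, size) of a square coding unit


def partition_qt(
    x: int,
    y: int,
    size: int,
    predict_split: Callable[[int, int, int], bool],
    min_size: int = 8,
) -> List[Block]:
    """Recursively partition a square block using a predicted decision.

    predict_split is a hypothetical stand-in for a trained classifier; an
    RDO-based encoder would instead try both options and compare costs.
    """
    if size <= min_size or not predict_split(x, y, size):
        return [(x, y, size)]  # leaf: keep the block whole
    half = size // 2
    leaves: List[Block] = []
    for dy in (0, half):
        for dx in (0, half):
            leaves.extend(partition_qt(x + dx, y + dy, half, predict_split, min_size))
    return leaves


# Toy predictor (assumption): split only blocks larger than 32x32.
leaves = partition_qt(0, 0, 64, lambda x, y, size: size > 32)
```

With the toy predictor above, the 64x64 block is split once into four 32x32 leaves. Each call to the predictor replaces one exhaustive cost comparison, which is where the time saving would come from.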
ISBN: 9781538604625 (print)
In this paper, a novel fast coding unit depth decision algorithm based on a convolutional neural network is presented for JVET future video coding. JVET employs a quad-tree plus binary-tree (QTBT) block partitioning structure, which supports much more flexible coding unit partition shapes and significantly improves coding performance over the HEVC standard. However, the flexible partitioning structure also introduces tremendous computational complexity. To address this issue, we model the QTBT partition depth range as a multi-class classification problem and predict the depth range of a 32x32 block directly, rather than judging whether to split at each depth level. To the best of our knowledge, this is the first framework to formulate the QTBT partition range as a multi-class classification task optimized by an end-to-end learning model. For training, we design an objective function consisting of a class penalty term and an L2 hinge loss function, which leverages the characteristics of the category settings to further boost classification accuracy. Experimental results demonstrate the effectiveness of the proposed method, which achieves a 42.80% complexity reduction with only a 0.65% increase in Bjøntegaard Delta bitrate (BD-rate).
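The abstract does not give the form of the objective, so the following is only a sketch of one way a class penalty term could be combined with an L2 (squared) hinge loss. The function name `penalized_l2_hinge`, the margin of 1.0, and the choice of penalty (the distance between class indices, so that predicting a depth range far from the true one costs more) are all assumptions, not the paper's definition.

```python
from typing import Sequence


def penalized_l2_hinge(scores: Sequence[float], label: int, margin: float = 1.0) -> float:
    """Squared hinge loss over the wrong classes, weighted by a class penalty.

    Assumption: classes are ordered depth ranges, and the penalty for a wrong
    class j is |j - label|, so distant mistakes are punished more heavily.
    """
    total = 0.0
    for j, score in enumerate(scores):
        if j == label:
            continue
        penalty = abs(j - label)  # assumed class penalty term
        violation = max(0.0, margin + score - scores[label])
        total += penalty * violation ** 2  # L2 hinge: squared margin violation
    return total


# The true class wins by more than the margin: no loss.
loss_correct = penalized_l2_hinge([2.0, 0.0, 0.0], label=0)
# A distant wrong class scores highest: large loss.
loss_wrong = penalized_l2_hinge([0.0, 0.0, 1.0], label=0)
```

In the second call, class 1 contributes 1 * 1^2 = 1 and class 2 contributes 2 * 2^2 = 8, for a total of 9.0. The distant class dominates the loss because of both its higher score and its larger penalty.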