Cross-component prediction is an important intra-prediction tool in modern video coders. Existing methods for exploiting cross-component correlation include the cross-component linear model and its multi-model extension. These models are designed for camera-captured content. For screen content coding, where videos exhibit different signal characteristics, a cross-component prediction model tailored to those characteristics is desirable. As a pioneering work, we propose a discrete-mapping-based cross-component prediction model for screen content coding. Our model relies on the core observation that screen content videos typically comprise regions with a few distinct colors, where the luma value (almost always) uniquely conveys the chroma value. Based on this, the proposed method learns a discrete-mapping function from the available reconstructed luma-chroma pairs and uses this function to derive the chroma prediction from the co-located luma samples. To achieve higher accuracy, a multi-filter approach is employed to derive the co-located luma values. The proposed method achieves 2.61%, 3.51% and 3.92% Y, U and V bit-rate savings respectively over Enhanced Compression Model (ECM) 4.0, with negligible complexity, for text and graphics media under the all-intra configuration.
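The learned discrete mapping described in this abstract can be sketched as a simple look-up from reconstructed luma to chroma. The following Python snippet is a hypothetical simplification: the sample values, the fallback rule for unseen luma, and the function names are illustrative, not from the paper.

```python
def build_mapping(recon_luma, recon_chroma):
    """Learn a discrete luma -> chroma mapping from reconstructed pairs."""
    mapping = {}
    for y, c in zip(recon_luma, recon_chroma):
        # Screen content regions have few distinct colors, so a luma value
        # (almost always) uniquely determines the chroma value.
        mapping[y] = c
    return mapping

def predict_chroma(mapping, colocated_luma, fallback):
    """Predict chroma from co-located luma; use a fallback for unseen luma."""
    return [mapping.get(y, fallback) for y in colocated_luma]

# Toy example: two distinct colors (e.g. text on a flat background)
luma   = [16, 16, 235, 235, 16]
chroma = [128, 128, 90, 90, 128]
m = build_mapping(luma, chroma)
pred = predict_chroma(m, [235, 16, 50], fallback=128)
# pred == [90, 128, 128]  (50 was never seen, so the fallback is used)
```

The paper's multi-filter derivation of co-located luma values is omitted here; a real codec would also need a signalled fallback for luma values absent from the table.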
One of the design goals of the recently published international video coding standard, Versatile Video Coding (VVC/H.266), is efficient coding of computer-generated video content (commonly referred to as screen content), which exhibits different signal characteristics from the usual camera-captured video (commonly referred to as natural content). VVC can perform the transform in multiple different ways, including skipping the transform itself, and selecting the best among the many combinatory options demands much computation. In this paper, we investigate a machine-learning-based early transform skip mode decision (ML-TSM) that determines whether or not to skip the transform at an early stage through a simple classification employing key features designed to reflect the characteristics of TSM blocks well. Compared with the VVC reference software 14.0, the proposed scheme is verified to reduce computational complexity by 11% and 4%, with a Bjontegaard delta bitrate (BDBR) increase of 0.34% and 0.23%, under the all-intra (AI) and random-access (RA) configurations, respectively.
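The early-decision idea can be illustrated with a toy sketch in Python. The features, thresholds, and function names below are hypothetical stand-ins for the paper's learned classifier, chosen only to show the shape of such a decision.

```python
def block_features(residual):
    """Simple features meant to reflect TSM-friendly blocks: screen content
    residuals tend to be sparse with a few large spikes."""
    nonzero_ratio = sum(1 for r in residual if r != 0) / len(residual)
    peak = max(abs(r) for r in residual)
    return nonzero_ratio, peak

def early_tsm_decision(residual, sparsity_thr=0.5, peak_thr=20):
    """Return True to choose transform skip early, bypassing the full
    rate-distortion comparison of all transform options."""
    nonzero_ratio, peak = block_features(residual)
    return nonzero_ratio < sparsity_thr and peak > peak_thr

# Sparse, spiky residual typical of text/graphics vs. a smooth natural one
print(early_tsm_decision([0, 0, 40, 0, 0, 0, -35, 0]))  # True
print(early_tsm_decision([3, 4, 5, 4, 3, 4, 5, 4]))     # False
```

In the actual scheme the decision is made by a trained model over carefully designed features, not fixed thresholds; this sketch only conveys the "classify early, skip the expensive search" structure.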
String Prediction (SP) is a very efficient screen content coding (SCC) tool. In SP, the self-referencing string plays an important role in improving coding efficiency. However, a general self-referencing string suffers from very low pixel-copying throughput and is therefore prohibited in the non-self-referencing-based SP adopted in the third-generation Audio Video coding Standard (AVS3). To overcome this problem and bring back the coding gain of self-referencing strings, a line-based self-referencing string (LSRS) enabled SP technique is proposed. Moreover, to keep the pixel-copying throughput and coding complexity of LSRS-enabled SP the same as non-self-referencing-based SP, an unbroken-line decomposition algorithm is presented to decompose an LSRS into multiple non-self-referencing strings. In this way, an LSRS can be treated in the same way as a non-self-referencing string, with the best trade-off between coding efficiency and complexity. Compared with non-self-referencing-based SP, using the AVS3 reference software HPM, for twelve SCC common test condition YUV test sequences in the text-and-graphics-with-motion and mixed-content categories, the proposed LSRS technique achieves average Y BD-rate reductions of 0.81% and 0.59% and maximum Y BD-rate reductions of 2.04% and 1.31% for the All Intra and Low Delay configurations, respectively, with almost no additional encoding and decoding complexity. The proposed LSRS-enabled SP technique has been adopted in AVS3.
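The decomposition idea can be illustrated in one dimension. This Python sketch is a hypothetical simplification of the paper's unbroken-line decomposition: when the copy offset is shorter than the string, splitting the copy into offset-sized chunks makes every chunk read only already-written pixels, so no chunk is self-referencing.

```python
def decompose(offset, length):
    """Split a self-referencing copy into (start, chunk_length) pieces,
    each at most `offset` long, so none reads unwritten pixels."""
    pieces, done = [], 0
    while done < length:
        chunk = min(offset, length - done)
        pieces.append((done, chunk))
        done += chunk
    return pieces

def copy_string(buf, dst, offset, length):
    """Copy `length` pixels from position dst-offset using the pieces."""
    for start, chunk in decompose(offset, length):
        src = dst + start - offset
        buf[dst + start: dst + start + chunk] = buf[src: src + chunk]

buf = [7, 9] + [0] * 6          # two decoded pixels, six still to fill
copy_string(buf, dst=2, offset=2, length=6)
# buf == [7, 9, 7, 9, 7, 9, 7, 9]  (the 2-pixel pattern is replicated)
```

The real algorithm works on lines within a coding unit and keeps throughput identical to non-self-referencing SP; the chunking above only shows why the decomposed copies are safe to execute independently.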
ISBN:
(Print) 9798350344868; 9798350344851
Intra Block Copy (IBC) and Intra Template Matching Prediction (IntraTMP) are two efficient algorithms for exploiting correlations within the same picture. A Block Vector (BV) represents the displacement between the current block and its reference within the same picture. The luma BV information can be employed to aid chroma coding efficiently. Based on this feature, an adaptive chroma prediction is proposed that derives the BV of a chroma block from the luma. Two strategies are designed to improve coding performance: a multiple-position check and template-based BV refinement. Compared with the Enhanced Compression Model (ECM) for beyond-VVC exploration, BD-rate savings of 0.43%, 0.35%, and 0.60% for the Y, Cb, and Cr components are achieved for Class F, and savings of 2.23%, 2.31%, and 2.93% for Class TGM. We also integrated the proposed method into the VVC Test Model (VTM), where a similar coding improvement is observed. Due to its coding gain and low complexity, the proposed method has been adopted into the beyond-VVC exploration and integrated into the latest version of ECM.
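The two strategies can be sketched together in Python. Everything below is a hypothetical simplification: the sampled positions, the 4:2:0 scaling, and the toy template cost stand in for the actual multiple-position check and template-based refinement.

```python
def derive_chroma_bv(luma_bv_at, positions, template_cost):
    """Collect luma BVs at several positions of the co-located luma block,
    scale them for 4:2:0 chroma, and keep the template-matching winner."""
    candidates = []
    for pos in positions:
        bv = luma_bv_at(pos)
        if bv is not None:          # position may carry no BV (e.g. intra)
            candidates.append((bv[0] // 2, bv[1] // 2))
    # Template-based refinement: keep the cheapest candidate
    return min(candidates, key=template_cost) if candidates else None

# Toy example: luma BVs sampled at three positions; the SAD-like cost
# below is only a stand-in for real template matching.
bvs = {"center": (10, 0), "topleft": (4, -4), "bottomright": None}
best = derive_chroma_bv(bvs.get, ["center", "topleft", "bottomright"],
                        template_cost=lambda bv: abs(bv[0]) + abs(bv[1]))
# best == (2, -2): the top-left candidate has the lowest cost
```

In the adopted tool, refinement searches around each candidate rather than merely ranking them; the sketch shows only the candidate-gathering and cost-selection structure.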
In recent years, computer-generated texts, graphics, and animations have drawn more attention than ever. These types of media, also known as screen content, have become increasingly popular due to their widespread applications. To address the need for efficient coding of such content, several coding tools have been developed and have made great advances in coding efficiency. The inclusion of screen content coding features in several recently developed video coding standards (namely, HEVC SCC, VVC, AVS3, AV1 and EVC) demonstrates the importance of supporting such features. This paper provides an overview and comparative study of screen content coding technologies, as well as discussions on the performance and complexity of the tools developed in these standards.
Screen content has become a popular image type, driven by the growing market for transferring display screens between devices, especially mobile devices. Due to the ultra-high-quality displays featured in most of today's mobile devices, lossless screen content coding (SCC) is usually required or preferred. Mobile devices also require ultra-low power consumption in all tasks, including SCC. To address these issues, this paper proposes an ultra-low-complexity technique based on string matching for high-efficiency lossless SCC. The technique covers three major coding phases: fast searching, prediction, and entropy coding. Condensed hash table (CHT) based fast searching is proposed to speed up the reference string searching process. Coplanar prediction (CP) and predictor-dependent residual (PDR) are presented to first efficiently predict an unmatchable pixel using multiple neighboring pixels and then further reduce the entropy of the prediction residuals. To achieve a good trade-off between coding complexity and efficiency, a 4-bit-aligned variable-length code (4bVLC) and a byte-aligned multi-variable-length code (BMVLC) are proposed to code the prediction residuals and the three string-matching parameters, respectively. For 184 commonly used screen content images, compared with x265 and PNG in the default configuration and lossless mode, the proposed technique achieves 35.67% fewer total compressed bytes with only 0.96% of the encoding and 1.54% of the decoding runtime, and 10.04% fewer total compressed bytes with only 6.83% of the encoding and 24.32% of the decoding runtime, respectively. The proposed technique also outperforms x265 and PNG in all other configurations. For the twelve HEVC-SCC CTC images, compared with PNG in the fast, default and slow configurations and x265 in the ultrafast and default configurations, the proposed technique shows a significant advantage with both high coding efficiency and ultra-low coding complexity.
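Hash-accelerated reference string searching of the kind described here can be sketched in Python. This is a hypothetical simplification: the tuple hash, window handling, and function names below are illustrative and not the paper's condensed hash table design.

```python
def build_hash(pixels, k=3):
    """Map each k-pixel tuple in the decoded prefix to its positions."""
    table = {}
    for i in range(len(pixels) - k + 1):
        table.setdefault(tuple(pixels[i:i + k]), []).append(i)
    return table

def longest_match(pixels, pos, table, k=3):
    """Find the longest reference string starting before `pos` that
    matches the pixels at `pos`; returns (length, start)."""
    best = (0, -1)
    for start in table.get(tuple(pixels[pos:pos + k]), []):
        if start >= pos:            # only already-decoded positions
            break
        n = 0
        while pos + n < len(pixels) and pixels[start + n] == pixels[pos + n]:
            n += 1
        best = max(best, (n, start))
    return best

pix = [1, 2, 3, 4, 1, 2, 3, 4, 5]
table = build_hash(pix[:4])         # index only the decoded prefix
length, start = longest_match(pix, 4, table)
# length == 4, start == 0: the first four pixels repeat
```

A production coder would bound the candidate list and use a rolling hash; the sketch shows only the hash-then-extend search pattern that makes string matching fast.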
The Joint Collaborative Team on Video Coding (JCT-VC) has been working on an emerging standard for screen content coding (SCC) as an extension of the High Efficiency Video Coding (HEVC) standard, known as HEVC-SCC. The two powerful coding mechanisms used in HEVC-SCC are intra block copy (IBC) and palette coding (PLT). These techniques achieve the best coding efficiency at the expense of extremely high computational complexity. Therefore, we propose a new technique to minimize computational complexity by skipping undesired modes while retaining coding efficiency. A fast intra mode decision approach is suggested based on efficient CU classification. Our proposed solution categorizes a CU as either a natural content block (NCB) or a screen content block (SCB). Two classifiers are used for the classification process: a neural network (NN) classifier and an AdaBoost classifier based on a boosted decision stump algorithm. The two classifiers predict the CU type individually, and the final classification depends on both of them. The experimental results reveal that the suggested technique significantly decreases encoding time without sacrificing coding efficiency. The suggested framework achieves a 26.13% encoding time reduction on average with just a 0.81% increase in Bjontegaard Delta bit-rate (BD-Rate). Furthermore, it saves 51.5% of encoding time on average for a set of NC sequences recommended for standard HEVC tests, with minimal performance degradation. The proposed strategy has been merged with an existing methodology to accelerate the process even further.
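The dual-classifier decision can be sketched in Python. The single feature, weights, and thresholds below are hypothetical; the paper's classifiers are trained models over richer features, and this sketch only shows the "agree or stay conservative" combination.

```python
import math

def decision_stump(feature, threshold=0.3):
    """AdaBoost-style decision stump on one feature, e.g. the ratio of
    distinct colors in the CU (low for screen content)."""
    return "SCB" if feature < threshold else "NCB"

def tiny_nn(feature, w=4.0, b=-1.2):
    """One-neuron stand-in for the NN classifier: a sigmoid score."""
    score = 1.0 / (1.0 + math.exp(-(w * feature + b)))
    return "NCB" if score > 0.5 else "SCB"

def classify_cu(feature):
    """Accept a label only when both classifiers agree; otherwise keep
    every mode enabled rather than risk skipping the right one."""
    a, b = decision_stump(feature), tiny_nn(feature)
    return a if a == b else "UNDECIDED"

print(classify_cu(0.8))  # NCB: SCC-only modes (IBC/PLT) can be skipped
print(classify_cu(0.1))  # SCB: keep IBC/PLT in the mode search
```

Requiring agreement is a common way to trade a little speed-up for safety: the expensive full mode search is skipped only when both predictors concur.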
Driven by growing applications that use computer screens as interfaces for daily remote interactions, almost all current video coding standards include screen content coding (SCC) tools. Recently, an efficient SCC tool called intra string copy (ISC) was adopted in the third generation of the Audio Video coding Standard in China (AVS3). ISC has two coding unit (CU) level sub-modes: the fully-matching-string and partially-matching-string based string prediction (FPSP) sub-mode, and the equal-value-string, unit-basis-vector-string, and unmatched-pixel-string based string prediction (EUSP) sub-mode. To further improve the coding efficiency of SCC, this paper proposes four enhancement techniques for ISC (EISC): CU partition improvements, point vector (PV) relocation and reactivation, line-based overlapping string prediction, and an optimized coding method for string length in the EUSP sub-mode. Compared with the latest AVS3 reference software HPM with EISC disabled, using the AVS3 SCC common test condition and YUV test sequences in the text-and-graphics-with-motion and mixed-content categories, the proposed technique achieves an average Y BD-rate reduction of 2.39% and 1.49% for the all intra (AI) and low-delay B (LDB) configurations, respectively, with low additional encoding complexity and almost no additional decoding complexity. All proposed ISC enhancement techniques have been adopted in AVS3.
ISBN:
(Print) 9781665492577
Current video coding schemes such as VVC and ECM employ separate palette coding for the luma and chroma components under the dual-tree structure, ignoring cross-component correlations. Although linear and multi-model linear models exist to capture cross-component correlations, such models are not tailored to screen content sequences. To address this, we propose a novel cross-component prediction model for screen content sequences. The proposed method builds on the core observation that regions of screen content sequences comprise a few distinct colors, where the luma value (almost always) uniquely conveys the chroma values. In light of this observation, the proposed method derives the chroma prediction from a discrete mapping function between luma and chroma values. Specifically, the method simply remembers the reconstructed luma values and their corresponding chroma values in a look-up table and employs this look-up table for cross-component prediction of the current chroma block. To achieve higher accuracy, a multi-filter approach is employed to derive the co-located luma values. For an example configuration, the proposed method achieves 1.37%, 1.08% and 1.68% Y, U and V bit-rate savings respectively over ECM 3.1 for text and graphics media under the all-intra configuration, demonstrating its efficacy.
ISBN:
(Print) 9781665492577
In recent years, screen content video has become increasingly popular in several major video applications, such as video recording and video conferencing. Because screen content videos are produced artificially rather than captured by camera sensors, they have unique features for which dedicated coding tools have been developed, achieving significant compression efficiency gains. In recognition of the popularity of screen content applications, this paper proposes an open video dataset for screen content to support the development of screen content coding technologies. The proposed dataset consists of 12 typical screen-content-type video clips that are publicly available. In addition, to better understand the characteristics of the proposed dataset, several major screen content coding tools in AOMedia Video 1 (AV1) have been evaluated on this dataset and analyzed in this paper.