ISBN: 9781728165820 (print)
Context-based Adaptive Binary Arithmetic Coding (CABAC) is the only compute-intensive task in the High Efficiency Video Coding (HEVC) standard that does not contain significant data-level parallelism. As a result, it is often a throughput bottleneck for the overall decoding process, especially for high-quality videos. High-level parallelization techniques are therefore needed to meet the throughput requirements of CABAC decoding. HEVC specifies multiple high-level parallelization tools, among which wavefront parallel processing (WPP) incurs only small losses in coding efficiency. However, its parallel efficiency suffers from the ramp-up and ramp-down in the number of active parallel threads within a frame. This is a serious problem for systems that cannot process multiple frames at the same time due to performance or memory constraints (e.g., mobile devices), and also for low-delay applications such as video conferencing. To address this issue, we present three improved WPP implementations for HEVC CABAC decoding. They differ in the granularity at which dependency checks are performed. The improvement comes from increased parallel efficiency of the WPP implementation while using the same number of threads as conventional WPP. The proposed implementations achieve speedups of up to 1.83x with very little implementation overhead.
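The ramp-up and ramp-down effect described above can be illustrated with a small sketch. It is not the implementation from the paper: it models conventional WPP only, and it assumes one thread per CTU row, a unit cost for every CTU, and the usual two-CTU lag between consecutive rows. The function names `wpp_makespan` and `wpp_efficiency` and the parameter `lag` are hypothetical.

```python
def wpp_makespan(rows, cols, lag=2):
    """Finish time of the last CTU under conventional WPP.

    Assumptions (illustrative only): one thread per CTU row, every CTU
    costs one time unit, and CTU (r, c) may start once its left neighbour
    and CTU (r - 1, c + lag - 1) in the row above are finished.
    """
    finish = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            left = finish[r][c - 1] if c > 0 else 0
            if r > 0:
                # Clip the dependency to the last CTU of the row above.
                dep_col = min(c + lag - 1, cols - 1)
                above = finish[r - 1][dep_col]
            else:
                above = 0
            finish[r][c] = max(left, above) + 1
    return finish[rows - 1][cols - 1]


def wpp_efficiency(rows, cols, lag=2):
    """Speedup over sequential decoding divided by the thread count (rows)."""
    speedup = (rows * cols) / wpp_makespan(rows, cols, lag)
    return speedup / rows


if __name__ == "__main__":
    for rows, cols in [(4, 8), (17, 30), (34, 60)]:
        print(rows, cols, wpp_makespan(rows, cols),
              round(wpp_efficiency(rows, cols), 3))
```

Under these assumptions, and for frames at least two CTUs wide, the last CTU finishes at time `cols + 2 * (rows - 1)`, so the threads are fully occupied only in the middle of the frame. The efficiency falls well below 1 once the number of rows is comparable to the number of columns.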
To meet the high computational demand of achieving superior coding efficiency, and to exploit the parallelism of parallel processing architectures, the emerging High Efficiency Video Coding (HEVC) standard has been designed to be more parallelizable than previous video coding standards. However, it is still desirable to design an efficient parallel HEVC encoder that fully exploits the parallelism of increasingly powerful multicore platforms, especially when considering the amount of parallelism, the scalability of parallelization, and the coding efficiency. In this work, a performance model of HEVC encoding is first introduced to investigate the speedup and the limitations of wavefront parallel processing (WPP) under various conditions. Then, a collaborative scheduling-based parallel solution (CSPS) for HEVC encoding is proposed, which includes adaptive parallel mode decision, asynchronous frame-level pixel interpolation, and multigrained task scheduling. The goal of the proposed CSPS is to overcome the disadvantages of WPP and further improve the parallelization of HEVC encoding on multicore platforms. Extensive experimental results demonstrate the efficiency of the proposed CSPS for parallelizing HEVC encoding, as the computing resources of multicore architectures can be fully utilized.
Thus far, Multiview High-Efficiency Video Coding (MV-HEVC) decompression on a personal computer (PC) or workstation can only be performed on the central processing unit (CPU). Because MV-HEVC is much more complex than High-Efficiency Video Coding (HEVC), decompressors need higher parallelism to decompress in real time. Therefore, this study presents a parallel method based on MV-HEVC. Complete inter-view parallelism is realized according to the dependencies between MV-HEVC views, and no search range is required to avoid data dependences between frames. Based on the dependencies of each task in MV-HEVC, an advanced wavefront parallel processing method is proposed to achieve higher intra-frame parallelism. The parallel structure of the proposed method is compatible with that of the single-instruction multiple-data acceleration method. The results show that the proposed method can decompress MV-HEVC in real time with 20 threads for 1088p video with three views. © 2020 Elsevier Inc. All rights reserved.
ISBN: 9789811381386; 9789811381379 (print)
To transmit video data efficiently while satisfying channel bandwidth and transmission delay constraints, bit rate control of the video encoding process is required. For ultra-high-definition video, traditional coding algorithms involve a large amount of calculation and high computational complexity; thus, parallel coding methods such as inter-frame parallel coding and wavefront parallel processing (WPP) have been proposed. However, rate control for parallel coding is a difficult problem, especially intra-frame rate control under the WPP coding mode, so this paper proposes a bit rate control algorithm at the macroblock layer. A comparison of the PSNR, the encoding speed, and the VBV (Video Buffering Verifier) buffer status of the video sequences shows that the proposed algorithm has a lower computational cost and a faster coding speed than traditional algorithms.
ISBN: 9781479999897 (print)
Although wavefront parallel processing (WPP) as specified in the HEVC standard and various inter-frame WPP algorithms can achieve comparatively high parallelism, their scalability is still very limited due to the dependencies introduced by spatial and temporal prediction in HEVC. In this paper, we propose three types of three-dimensional WPP (3D-WPP) algorithms that can significantly improve the parallelism while achieving good tradeoffs between implementation complexity, determinism, and rate-distortion (RD) performance. Experimental results show that the proposed algorithms achieve speedups of up to 2.8x compared with existing inter-frame WPP methods. While the Simple 3D-WPP and Static 3D-WPP algorithms may introduce a BD-rate loss of between 0 and 4.9% compared with existing algorithms, the more complex Dynamic 3D-WPP algorithm achieves better parallelism with virtually no loss in coding performance.
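The spatial and temporal dependencies mentioned above can be made concrete with a minimal readiness check for a generic inter-frame WPP scheduler. It is a sketch under stated assumptions, not any of the 3D-WPP algorithms from the paper: the function `ctu_ready`, the `progress` table, and the parameter `ref_rows` (how many CTU rows below the current one must be complete in the reference frame, standing in for a vertical motion range) are all hypothetical, and each frame is assumed to reference only the frame directly before it.

```python
def ctu_ready(progress, frame, row, col, cols, ref_rows=1):
    """Return True if CTU (row, col) of `frame` may start now.

    progress[f][r] is the number of CTUs already finished in row r of
    frame f. All names and the temporal rule are illustrative assumptions.
    """
    rows = len(progress[frame])
    # CTUs inside a row run in order, so `col` must be the next one.
    if progress[frame][row] != col:
        return False
    # Spatial WPP rule: the top-right CTU of the row above must be done
    # (clipped to the row width for the last column).
    if row > 0 and progress[frame][row - 1] < min(col + 2, cols):
        return False
    # Assumed temporal rule: the previous frame must have completed all
    # rows down to `row + ref_rows`, clipped to the frame height.
    if frame > 0:
        last_ref_row = min(row + ref_rows, rows - 1)
        for r in range(last_ref_row + 1):
            if progress[frame - 1][r] < cols:
                return False
    return True


if __name__ == "__main__":
    # Two frames of 3 rows x 4 CTUs; frame 0 is partly done.
    progress = [[4, 4, 2], [0, 0, 0]]
    print(ctu_ready(progress, 1, 0, 0, cols=4))  # frame 1 may begin
    print(ctu_ready(progress, 1, 1, 0, cols=4))  # blocked by the row above
```

A check of this kind lets threads that would otherwise idle during the ramp-down of one frame begin work on the next, which is the general idea behind extending the wavefront across frames.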