Three-dimensional (3D) video technology has gained wide popularity in recent years owing to its many applications, particularly in the television and cinema industries. Three-dimensional television (3DTV) and free-viewpoint television (FTV) are two well-known applications that provide the end user with a realistic, high-quality 3D display. In both applications, multiple views captured from different viewpoints are rendered simultaneously to give the viewer a sensation of depth. FTV requires a large number of views, but transmitting this massive amount of data is challenging because of bandwidth limitations. Multiview video-plus-depth (MVD) is the most popular format: in addition to color images, it carries the corresponding depth information, which represents the scene geometry. Combined with depth image-based rendering (DIBR), the MVD format enables the generation of views at novel viewpoints. In this paper, we introduce a panorama-based representation of MVD data with an efficient keyframe-based disocclusion handling technique. The panorama view for a stereo pair with depth is constructed from the left view and the newly appearing region of the right view, which is not visible from the left viewpoint. The disocclusions that appear in the right view when it is obtained by DIBR of the left view are collected in a special frame called the keyframe. On the decoder side, the left view is obtained by simply cropping the panorama view. The right view is obtained through DIBR of the left view combined with the appearing region from the panorama view, and the disocclusions in this warped view are filled from the keyframe. The panorama view with the additional keyframes and the corresponding depth map are compressed using the standard HEVC codec. Experimental evaluations on standard MVD sequences show that the proposed scheme achieves excellent video quality while saving considerable bit rate compared to HEVC simulcast.
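The DIBR and keyframe steps described in this abstract can be illustrated with a minimal sketch. It assumes rectified views, so that warping reduces to a horizontal shift by an integer per-pixel disparity, and it works on a single scanline. The names `warp_row` and `fill_from_keyframe`, and the convention that a larger disparity means a closer pixel, are assumptions made for illustration and are not taken from the paper.

```python
def warp_row(colors, disparities):
    """Forward-warp one scanline of the left view toward the right view.

    Assumes rectified cameras: a pixel at column x with disparity d lands
    at column x - d. Returns the warped row and the columns that received
    no pixel (the disocclusions).
    """
    width = len(colors)
    warped = [None] * width
    best = [-1] * width  # disparity of the pixel currently stored per column
    for x, (color, d) in enumerate(zip(colors, disparities)):
        tx = x - d
        if 0 <= tx < width and d > best[tx]:
            # When two pixels compete for a column, the closer one wins.
            warped[tx] = color
            best[tx] = d
    holes = [x for x, color in enumerate(warped) if color is None]
    return warped, holes


def fill_from_keyframe(warped, holes, keyframe_row):
    """Fill the disoccluded columns with samples stored in the keyframe."""
    filled = list(warped)
    for x in holes:
        filled[x] = keyframe_row[x]
    return filled


row, holes = warp_row(list("abcdef"), [0, 0, 2, 2, 0, 0])
print(row, holes)
print(fill_from_keyframe(row, holes, ["k0", "k1", "k2", "k3", "k4", "k5"]))
```

In this toy row the two foreground pixels shift left by two columns and overwrite the background there, leaving two empty columns that only the keyframe can supply.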
This paper designs a novel method to reduce the coding complexity of the 3D-HEVC encoder by exploiting the properties of human visual perception. Two vision-oriented edge detections are proposed: for colour texture, the authors adopt the just-noticeable distortion (JND) model; for the depth map, they combine Sample Adaptive Offset (SAO) with the just-noticeable depth difference (JNDD) model. The authors also analyse the properties of the colour texture and the depth map to classify each coding tree unit (CTU) as a complex-edge CTU, a moderate-edge CTU or a homogeneous CTU. Fast mode decisions and early termination criteria are then applied to each type of CTU according to its characteristics. For CTUs with more edge information in particular, the proposed projection-based fast mode decision and residual-based early termination preserve important colour texture while speeding up the coding. The proposed vision-oriented algorithm reduces the overall average coding time by 31.981% with only a 1.580% BD-Bitrate increase. Experimental results show that the proposed algorithm provides considerable time savings while maintaining video quality, and that it outperforms previous work.
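The three-way CTU classification can be sketched as follows. This is only an illustration under stated assumptions: a plain neighbor-difference test stands in for the JND- and JNDD-based edge detectors, and the function names and all three thresholds (`jnd_threshold`, `moderate`, `complex_`) are hypothetical values, not the ones used by the authors.

```python
def edge_ratio(block, jnd_threshold):
    """Fraction of pixels whose right or lower neighbor differs by more
    than jnd_threshold (a stand-in for a perceptual edge detector)."""
    rows, cols = len(block), len(block[0])
    edges = 0
    for y in range(rows):
        for x in range(cols):
            right = abs(block[y][x] - block[y][x + 1]) if x + 1 < cols else 0
            down = abs(block[y][x] - block[y + 1][x]) if y + 1 < rows else 0
            if max(right, down) > jnd_threshold:
                edges += 1
    return edges / (rows * cols)


def classify_ctu(block, jnd_threshold=10, moderate=0.05, complex_=0.20):
    """Label a block by its edge density; all thresholds are hypothetical."""
    ratio = edge_ratio(block, jnd_threshold)
    if ratio >= complex_:
        return "complex-edge"
    if ratio >= moderate:
        return "moderate-edge"
    return "homogeneous"


flat = [[128] * 8 for _ in range(8)]
step = [[0] * 4 + [255] * 4 for _ in range(8)]
checker = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]
print(classify_ctu(flat), classify_ctu(step), classify_ctu(checker))
```

An encoder following the abstract would then select a different fast mode decision and early termination rule for each label.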
Efficient lossless coding of a texture image and its corresponding depth map is important for accurate view synthesis in 3D applications. In this paper, a novel HEVC-based three-layer texture and depth coding method is proposed for lossless synthesis in 3D video coding. The proposed method performs lossy and lossless texture coding in the first and second layers, respectively. The quantization parameter (QP) of the first, lossy coding layer is adaptively selected for each block, based on a relationship between the first and second layers. In the third layer, lossy depth coding is performed using synthesis-based depth coding and a texture-based depth intra prediction mode. The synthesis-based depth coding technique is lossy but guarantees zero synthesis distortion. The texture-based depth intra prediction mode predicts the depth from the associated texture information. Experimental results demonstrate that the proposed method achieves higher coding performance than conventional lossless coding methods.
Recently, region-based 3D video coding has been proposed. However, existing view synthesis distortion estimation (VSDE) methods operate at the frame level. To guide the rate-distortion optimization process of region-based 3D video coding schemes, this paper proposes the first pixel-level VSDE (PL-VSDE) method. We first define the pixel-level view synthesis distortion. To estimate it, we then develop a backward prediction method that starts from the pixels of interest (POIs) in the virtual view and finds their corresponding pixels in the reference view via a coarse-to-fine approach; we call it the coarse-to-fine backward prediction (CFBP) method. The CFBP fully accounts for the details of 3D warping, the rounding operation and the warping competition in view synthesis, which improves the accuracy of the prediction. In addition, a table-lookup method and a warping property are introduced to speed up the CFBP. After integrating the CFBP into the PL-VSDE, we can estimate the view synthesis distortion at the pixel level. Our method processes each pixel independently, which makes it well suited to parallel processing. Experimental results demonstrate that the proposed method has significant advantages in both accuracy and efficiency over state-of-the-art frame-level VSDE methods.
In multi-view video plus depth 3D video coding, the texture image and the depth map are coded jointly. The texture image is used for display and as the reference image for synthesizing the virtual view. The depth map provides the scene geometry information and is used to synthesize the virtual view at the terminal through the depth-image-based rendering technique. The distortion of the compressed texture image and depth map propagates to the synthesized virtual view. Besides the coding efficiency of the texture image and depth map, the bit allocation between them also has a great effect on the quality of the synthesized virtual view. Several methods have been proposed for bit allocation between the texture image and the depth map, but most of them allocate a fixed target bitrate based on a virtual view distortion model to achieve optimal synthesized view quality, and the modeling process adds complexity. In practical applications, video sequences have different contents, and a fixed bit ratio cannot achieve optimal performance. In this paper, we propose an adaptive bit allocation algorithm for 3D video coding. We first present a model to estimate the synthesized virtual view distortion, and then adjust the bit ratio between adjacent views and between the texture image and the depth map at the Group of Pictures level, based on the fluctuation of the virtual view quality. The bit ratio is adjusted to achieve the optimal virtual view quality for different video contents. Experimental results demonstrate that the proposed algorithm allocates bits so as to achieve optimal virtual view quality under different target bitrates and for different video contents, and that its computational complexity is extremely low.
3D video in the multi-view video plus depth (MVD) format consists of a color texture image and a gray depth map. The depth map provides the scene geometry information and is used to synthesize the virtual views through the depth-image-based rendering (DIBR) technique. The quality of the synthesized virtual views depends on the quality of both the texture image and the depth map, so bit allocation between the two is very important in 3D video coding. In this paper, we propose to allocate bits optimally between the texture image and the depth map by adjusting the Lagrangian multiplier (LM) in depth map coding. The LM is adjusted based on the difference between the texture image quantization parameter (QP) and the depth map QP. Experimental results show that the proposed method achieves optimal 3D video coding performance for different sequences under different bitrates, and that its complexity is extremely low.
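The abstract does not give the adjustment rule, so the following is only a sketch of what QP-difference-driven scaling of the depth Lagrangian multiplier could look like. `base_lambda` uses the commonly cited relation between QP and lambda from H.264/HEVC reference encoders, with the constant 0.85 taken as an assumption. The scaling in `depth_lambda` and its `strength` parameter are hypothetical placeholders, not the authors' formula.

```python
def base_lambda(qp, alpha=0.85):
    """Commonly cited QP-to-lambda relation: alpha * 2^((QP - 12) / 3)."""
    return alpha * 2.0 ** ((qp - 12) / 3.0)


def depth_lambda(depth_qp, texture_qp, strength=0.5):
    """Hypothetical rule: scale the depth lambda by the texture/depth QP gap.

    With strength > 0, a depth QP above the texture QP gives a scale below 1,
    so depth mode decisions weight distortion more heavily than rate.
    """
    scale = 2.0 ** (strength * (texture_qp - depth_qp) / 3.0)
    return base_lambda(depth_qp) * scale


for tex_qp, dep_qp in [(30, 30), (30, 36), (30, 24)]:
    print(tex_qp, dep_qp, round(depth_lambda(dep_qp, tex_qp), 3))
```

The point of the sketch is only that the adjustment costs one multiplication per depth QP, which is consistent with the low complexity the abstract reports.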
Stereoscopic images and video are synthesized from virtual views, which are produced from a real view and its depth map. However, this process involves a large amount of data, which not only increases the transmission load but also must be stored in the display devices, which therefore need very large storage capacities. To solve these problems, the Joint Video Team has developed the 3D video coding (3DVC) standard, which reduces the redundant data of multiview video and its depth map by exploiting the correlation between the real view and its corresponding depth map. When depth data in the 3DVC bitstream are lost during transmission, the decoded depth map is incomplete. The quality of the decoded depth map is then reduced, and the quality of the stereoscopic video decreases as well. To improve the reconstructed quality of a damaged macroblock, this paper proposes an adaptive error concealment selection scheme that uses fuzzy reasoning to determine the most suitable error concealment method. The temporal and spatial correlations of the damaged macroblock, which are defined by the motion vectors of neighboring blocks and their depth information, are used for the adaptive selection. Simulation results show that the proposed algorithm can reduce the concealing time by at least 0.15-2.86 frame/sec and can improve quality by about 0.13-0.17 dB. The proposed algorithm is suitable for 3DVC video transmission.
ISBN: 9781509048472 (print)
Coding efficiency can be enhanced through rate-distortion optimization (RDO), which provides a trade-off between bit rate and distortion. In this paper, we propose Structural SIMilarity (SSIM) based RDO to improve 3D video coding. The SSIM index is a quality metric that gives a better approximation of visual quality. Most of the existing literature on 3D video coding employs the sum of squared errors (SSE) as the measure of distortion, which does not always correlate with visual quality. To close this gap, SSIM-based RDO is implemented in this paper. The Lagrange multiplier is modified to obtain the optimum rate along with a reduction in distortion, which improves the perceptual quality of the video. The entropy of the macroblock (MB) is also considered in the scaling of the Lagrange multiplier to increase RDO performance. The proposed algorithm is implemented in the 3DV-ATM reference software. Experimental results show an improvement in the perceptual quality of the synthesized sequences with a bitrate reduction of 6-15%.
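The idea of replacing SSE with an SSIM-derived distortion in the rate-distortion cost can be sketched as below. The SSIM formula and its constants (0.01 and 0.03 times the dynamic range, squared) are the standard ones. Using 1 - SSIM as the distortion term, computing SSIM over a single window, and the function names are assumptions for illustration; the entropy-based scaling of the Lagrange multiplier mentioned in the abstract is not modeled.

```python
def ssim(x, y, dynamic_range=255):
    """Single-window SSIM between two equally sized lists of samples."""
    n = len(x)
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx * mx + my * my + c1) * (vx + vy + c2))


def rd_cost(original, reconstructed, bits, lam):
    """J = D + lambda * R, with D = 1 - SSIM (an assumed distortion term)."""
    return (1.0 - ssim(original, reconstructed)) + lam * bits


def best_mode(original, candidates, lam):
    """Return the name of the (name, reconstruction, bits) candidate with
    the lowest rate-distortion cost."""
    return min(candidates, key=lambda c: rd_cost(original, c[1], c[2], lam))[0]


block = [10, 20, 30, 40]
modes = [("skip", [25, 25, 25, 25], 1), ("intra", [10, 20, 30, 40], 40)]
print(best_mode(block, modes, lam=0.001), best_mode(block, modes, lam=0.1))
```

With a small multiplier the structurally faithful but expensive candidate wins; with a large one the cheap candidate wins, which is the trade-off the modified multiplier is meant to steer.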
ISBN: 9781467386609 (print)
The just noticeable depth difference (JNDD) model represents the depth sensitivity of the human visual system (HVS). Human eyes have different depth sensitivity for different distances and binocular disparity information. The JNDD model indicates that humans can easily perceive a depth change when an object is close to the screen plane, but are less sensitive to changes in a distant object. If a depth change is below the JNDD threshold, there is visual redundancy, and the video can be processed with a higher compression ratio. In this paper, we apply Silva's JNDD model to H.265/HEVC for color image coding. We propose a perception region division algorithm (PRDA), which divides the depth map into four regions of different depth. The quantization parameter is then adjusted according to a visual perception threshold. Experimental results show that our color image coding scheme improves the image quality without increasing the bitrate.
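A depth-driven QP adjustment of the kind described here can be sketched as follows. The abstract does not state the region boundaries or the QP offsets, so `REGION_BOUNDS` and `QP_OFFSETS` are hypothetical placeholders, as is the convention that an 8-bit depth value of 0 is far and 255 is near. Only the clamp to the 0 to 51 QP range reflects HEVC itself (8-bit video).

```python
REGION_BOUNDS = (64, 128, 192)  # hypothetical splits of the 8-bit depth range
QP_OFFSETS = (2, 1, 0, -1)      # hypothetical per-region offsets, far to near


def depth_region(depth_value):
    """Index 0..3 of the region an 8-bit depth value falls into."""
    for index, bound in enumerate(REGION_BOUNDS):
        if depth_value < bound:
            return index
    return len(REGION_BOUNDS)


def block_qp(base_qp, depth_block):
    """Adjust the QP of a color block from the mean of its co-located
    depth samples, clamped to the HEVC range for 8-bit video."""
    samples = [value for row in depth_block for value in row]
    mean_depth = sum(samples) / len(samples)
    qp = base_qp + QP_OFFSETS[depth_region(mean_depth)]
    return max(0, min(51, qp))


print(block_qp(32, [[10, 20], [30, 40]]))      # far block, coarser quantization
print(block_qp(32, [[200, 220], [240, 250]]))  # near block, finer quantization
```

The table-driven form keeps the region split and the offsets easy to swap for values derived from an actual JNDD threshold.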
The future of novel 3D display technologies largely depends on the design of efficient techniques for 3D video representation and coding. Recently, multiple view plus depth video formats have attracted considerable research effort, since they enable intermediate view estimation and allow 3D video sequences to be represented and compressed efficiently. In this paper, we present spatiotemporal occlusion compensation with panorama view (STOP), a novel 3D video coding technique based on the creation of a panorama view and on occlusion coding in terms of spatiotemporal offsets. The panorama picture represents most of the visual information acquired from multiple views using a single virtual view characterized by a larger field of view. By encoding the panorama video with state-of-the-art HEVC and representing occlusions with simple spatiotemporal ancillary information, STOP achieves a high compression ratio and good visual quality, with results that are competitive with other techniques. Moreover, STOP enables free-viewpoint 3D TV applications while allowing legacy displays to receive a two-dimensional service using a standard video codec and simple cropping operations.