Three-dimensional (3D) video technology has gained immense popularity in recent times due to its numerous applications, particularly in the television and cinema industry. Three-dimensional television (3DTV) and free-viewpoint television (FTV) are two well-known applications that provide the end user with a realistic, high-quality 3D display. In both applications, multiple views captured from different viewpoints are rendered simultaneously to give the viewer a sensation of depth. A large number of views are needed to enable FTV. However, transmitting this massive amount of data is challenging due to bandwidth limitations. Multiview video-plus-depth (MVD) is the most popular format, in which, in addition to color images, corresponding depth information representing the scene geometry is also available. With the help of depth image-based rendering (DIBR), the MVD format enables the generation of views at novel viewpoints. In this paper, we introduce a panorama-based representation of MVD data with an efficient keyframe-based disocclusion handling technique. The panorama view for a stereo pair with depth is constructed from the left view and the newly appearing region of the right view, which is not visible from the left viewpoint. The disocclusions that appear in the right view when it is obtained by DIBR of the left view are collected in a special frame called the keyframe. On the decoder side, the left view is obtained by a simple crop of the panorama view. The right view is obtained through DIBR of the left view combined with the appearing region from the panorama view. The disocclusions in this warped view are filled from the keyframe. The panorama view with additional keyframes and the corresponding depth map are compressed using the standard HEVC codec. Experimental evaluations performed on standard MVD sequences show that the proposed scheme achieves excellent video quality while saving considerable bit rate compared to HEVC simulcast.
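The decoder-side steps described in this abstract (warp the left view by DIBR, then fill the disocclusions from the keyframe) can be illustrated with a minimal one-scanline sketch. It is not the authors' implementation: it assumes rectified views, integer disparities already derived from the depth map, and a keyframe aligned pixel for pixel with the right view. The function names `warp_row` and `fill_from_keyframe` are hypothetical, and handling of the appearing region from the panorama is left out.

```python
def warp_row(color, disparity):
    """Forward-warp one scanline from the left view to the right view.

    color:     list of pixel values of the left view
    disparity: list of non-negative integer disparities (assumed given)
    Returns (warped, holes), where holes[x] is True for disoccluded pixels.
    """
    n = len(color)
    warped = [None] * n
    best = [-1] * n  # largest disparity (nearest surface) written so far
    for x in range(n):
        d = disparity[x]
        xr = x - d  # rectified stereo: a left-view pixel moves left by d
        if 0 <= xr < n and d > best[xr]:
            warped[xr] = color[x]  # nearer surfaces overwrite farther ones
            best[xr] = d
    holes = [v is None for v in warped]
    return warped, holes


def fill_from_keyframe(warped, holes, keyframe):
    """Fill disoccluded pixels with co-located keyframe pixels (assumed aligned)."""
    return [k if h else w for w, h, k in zip(warped, holes, keyframe)]


# Toy scanline: a foreground object (disparity 2) in front of a flat background.
left = [10, 20, 30, 40, 50, 60]
disp = [0, 0, 2, 2, 0, 0]
warped, holes = warp_row(left, disp)
right = fill_from_keyframe(warped, holes, [0, 0, 31, 41, 0, 0])
```

In the toy scanline the foreground shifts two pixels to the left, covers two background pixels, and uncovers two positions that the left view never saw; only those two positions are read from the keyframe.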
ISBN: 9781467386609 (print)
The just noticeable depth difference (JNDD) model represents the depth sensitivity of the human visual system (HVS). Human eyes have different depth sensitivity for different distances and binocular disparity information. The JNDD model indicates that humans easily perceive a depth change when an object is close to the screen plane, but are less sensitive to changes in a distant object. If a depth change is below the JNDD threshold, there is visual redundancy, and the video can be processed with a higher compression ratio. In this paper, we apply Silva's JNDD model to H.265/HEVC for color image coding. We propose a perception region division algorithm (PRDA), through which we divide the depth map into four regions with different depths. Then, the quantization parameter is adjusted depending on a visual perception threshold. Experimental results show that our color image coding scheme improves the image quality without increasing the bitrate.
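The two steps named in this abstract (dividing the depth map into four regions, then adjusting the quantization parameter per region) can be sketched as follows. The abstract gives neither the region boundaries nor the QP adjustments, so the values below are placeholders chosen purely for illustration: `boundaries` splits the 8-bit depth range into four equal bands and `offsets` is a hypothetical per-region QP offset table. Neither is taken from the paper or from Silva's model.

```python
def divide_regions(depth_map, boundaries=(64, 128, 192)):
    """Label each 8-bit depth value with a region index in 0..3.

    boundaries are illustrative placeholders, not the paper's thresholds.
    """
    return [[sum(v >= b for b in boundaries) for v in row] for row in depth_map]


def block_qp(base_qp, region, offsets=(0, 1, 2, 3)):
    """Return a QP for a block in the given region.

    offsets is a hypothetical table; the result is clamped to 0..51,
    the QP range of 8-bit HEVC.
    """
    return max(0, min(51, base_qp + offsets[region]))


depth = [[0, 63, 64],
         [127, 128, 255]]
regions = divide_regions(depth)
qps = [[block_qp(32, r) for r in row] for row in regions]
```

A real encoder would pick the offset for each region from the JNDD threshold that applies to it, so that quantization is coarsened only where the model predicts the change will not be perceived.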
ISBN: 9780819482341 (print)
An overview of existing and upcoming 3D video coding standards is given. Various 3D video formats are available, each with individual pros and cons. The 3D video formats can be separated into two classes: video-only formats (such as stereo and multiview video) and depth-enhanced formats (such as video plus depth and multiview video plus depth). Since all these formats consist of at least two video sequences and possibly additional depth data, efficient compression is essential for the success of 3D video applications and technologies. For the video-only formats, the H.264 family of coding standards already provides efficient and widely established compression algorithms: H.264/AVC simulcast, H.264/AVC stereo SEI message, and H.264/MVC. For the depth-enhanced formats, standardized coding algorithms are currently being developed. New and specially adapted coding approaches are necessary, as the depth or disparity information included in these formats has significantly different characteristics than video and is not displayed directly, but used for rendering. Motivated by evolving market needs, MPEG has started an activity to develop a generic 3D video standard within the 3DVC ad-hoc group. Key features of the standard are efficient and flexible compression of depth-enhanced 3D video representations and decoupling of content creation and display requirements.
The emergence of three-dimensional (3D) video applications based on Depth Image Based Rendering (DIBR) has increased bandwidth requirements, because depth information must also be transmitted. This additional bandwidth requirement needs to be tackled to enable the widespread adoption of 3D video applications based on DIBR. Exploiting the visual correlations between the color image and the depth image in depth image coding reduces the bandwidth required to transmit the additional depth information. In this paper, an object-based depth image coding technique is presented that is suitable for low bit rate 3D-TV applications based on DIBR. The proposed method achieves up to 50% bit rate reduction at low bit rates.
3D video coding for transmission exploits Disparity Estimation (DE) to remove the inter-view redundancies present within both the texture and the depth map multi-view videos. Good estimation accuracy can be achieved by partitioning the macro-block into smaller sub-block partitions. However, the DE process must be performed on each individual sub-block to determine the optimal mode and its disparity vectors in terms of rate-distortion efficiency. This vector estimation process is heavy on computational resources; thus, the computational cost of coding becomes proportional to the number of search points and the inter-view modes tested during rate-distortion optimization. In this paper, a solution that exploits the available depth map data, together with the multi-view geometry, is proposed to identify a better DE search area, such that the number of search points can be reduced. It also exploits the number of different depth levels present within the current macro-block to determine which modes can be used for DE, further reducing its computations. Simulation results demonstrate that this can save up to 95% of the encoding time, with little influence on the coding efficiency of the texture and the depth map multi-view video coding. This makes 3D video coding more practical for consumer devices, which tend to have limited computational power.
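The idea of bounding the DE search area from the depth map can be sketched with the depth-to-disparity relation commonly used for 8-bit MVD depth maps with rectified, parallel cameras. This is an illustration under assumptions, not the paper's algorithm: `focal`, `baseline`, `z_near` and `z_far` are assumed camera parameters, `margin` is a hypothetical safety margin, and the rule in `allowed_partitions` (test only the 16x16 mode when the block holds a single depth level) is an invented stand-in for the paper's mode selection, which the abstract does not detail.

```python
import math


def depth_to_disparity(v, focal, baseline, z_near, z_far):
    """Map an 8-bit depth level v to a disparity in pixels.

    Assumes the usual MVD convention: v = 255 is nearest, v = 0 is farthest.
    """
    inv_z = (v / 255.0) * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far
    return focal * baseline * inv_z


def search_window(block_depths, focal, baseline, z_near, z_far, margin=2):
    """Horizontal search range (in pixels) implied by a block's depth levels."""
    d_min = depth_to_disparity(min(block_depths), focal, baseline, z_near, z_far)
    d_max = depth_to_disparity(max(block_depths), focal, baseline, z_near, z_far)
    return math.floor(d_min) - margin, math.ceil(d_max) + margin


def allowed_partitions(block_depths):
    """Hypothetical rule: a block with one depth level skips sub-block modes."""
    if len(set(block_depths)) == 1:
        return ["16x16"]
    return ["16x16", "16x8", "8x16", "8x8"]


# Example with made-up camera parameters.
lo, hi = search_window([120, 120, 124], focal=1024, baseline=0.5,
                       z_near=2.0, z_far=8.0)
modes = allowed_partitions([120, 120, 124])
```

A block whose depth levels are nearly uniform yields a window only a few pixels wide, which is where the reduction in search points would come from.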
In multi-view video plus depth 3D video coding, the texture image and the depth map are coded jointly. The texture image is used for display and as the reference image for synthesizing the virtual view. The depth map provides the scene geometry information and is used to synthesize the virtual view at the terminal through the Depth-Image Based Rendering technique. The distortion of the compressed texture image and depth map is propagated to the synthesized virtual view. Besides the coding efficiency of the texture image and depth map, the bit allocation between them also has a great effect on the quality of the synthesized virtual view. Several methods have been proposed for bit allocation between the texture image and the depth map, but most of them allocate a fixed target bitrate based on a virtual view distortion model to achieve optimal synthesized virtual view quality, and the modeling process brings extra complexity. In practical applications, video sequences have different contents, and a fixed bit ratio cannot achieve optimal performance. In this paper, we propose an adaptive bit allocation algorithm for 3D video coding. First, we present a model to estimate the synthesized virtual view distortion, and then adjust the bit ratio between adjacent views and between the texture image and the depth map at the Group of Pictures level based on the virtual view quality fluctuation. We adjust the bit ratio to achieve the optimal virtual view quality for different video contents. Experimental results demonstrate that the proposed algorithm can allocate bits to achieve optimal virtual view quality under different target bitrates and for different video contents, and that its computational complexity is extremely low.
Stereoscopic images and video are synthesized from virtual views, which are produced from a real view and its depth map. However, this process involves a large amount of data, which not only increases the transmission load but also must be stored within the display devices, which therefore must possess huge storage capacities. To solve these problems, the Joint Video Team has developed the 3D video coding (3DVC) standard to reduce the redundant data for multiview video and its depth map through the correlation between the real view and its corresponding depth map. When depth data in the 3DVC bitstream are lost during transmission, the decoded depth map is incomplete. The quality of the decoded depth map is then reduced, and the quality of the stereoscopic video also decreases. To increase the reconstructed quality of the damaged macro-block, this paper proposes an adaptive error concealment selection that determines the most suitable error concealment method based on fuzzy reasoning. The temporal and spatial correlations of the damaged macro-block, which are defined by the motion vectors of neighboring blocks and their depth information, are used for the adaptive selection. According to the simulation results, the proposed algorithm can reduce the concealing time by at least 0.15-2.86 frame/sec and can improve quality by about 0.13-0.17 dB. The proposed algorithm is suitable for 3DVC video transmission.
Efficient lossless coding of a texture image and its corresponding depth map is important to perform accurate view synthesis in 3D applications. In this paper, a novel HEVC-based three-layer texture and depth coding method is proposed for lossless synthesis in 3D video coding. The proposed method performs lossy and lossless texture coding in the first and second layers, respectively. A quantization parameter (QP) in the first lossy coding layer is adaptively selected for each block, based on a relationship between the first and second layers. In the third layer, the lossy depth coding is performed by using synthesis-based depth coding and a texture-based depth intra prediction mode. The synthesis-based depth coding technique adopts lossy coding but guarantees zero synthesis distortion. The texture-based depth intra prediction mode performs the depth prediction by using the associated texture information. Experimental results demonstrate that the proposed method obtains higher coding performance than conventional lossless coding methods.
3D video services are emerging in various application domains including cinema, TV broadcasting, Blu-ray discs, streaming and smartphones. A majority of the 3D video content on the market is still based on stereo video, which is typically coded with the multiview video coding (MVC) extension of the Advanced Video Coding (H.264/AVC) standard or as frame-compatible stereoscopic video. However, 3D video technologies face challenges as well as opportunities in supporting more demanding application scenarios, such as immersive 3D telepresence with numerous views and 3D perception adaptation for heterogeneous 3D devices and/or user preferences. The Multiview Video plus Depth (MVD) format enables depth-image-based rendering (DIBR) of additional viewpoints on the decoding side and hence helps in such advanced application scenarios. This paper reviews the MVC + D standard, which specifies an MVC-compatible MVD coding format. (C) 2013 Elsevier Inc. All rights reserved.
The future of novel 3D display technologies largely depends on the design of efficient techniques for 3D video representation and coding. Recently, multiple view plus depth video formats have attracted many research efforts, since they enable intermediate view estimation and allow 3D video sequences to be represented and compressed efficiently. In this paper, we present spatiotemporal occlusion compensation with panorama view (STOP), a novel 3D video coding technique based on the creation of a panorama view and occlusion coding in terms of spatiotemporal offsets. The panorama picture represents most of the visual information acquired from multiple views using a single virtual view, characterized by a larger field of view. By encoding the panorama video with state-of-the-art HEVC and representing occlusions with simple spatiotemporal ancillary information, STOP achieves a high compression ratio and good visual quality, with results that are competitive with other techniques. Moreover, STOP enables free-viewpoint 3D TV applications whilst allowing legacy displays to get a two-dimensional service using a standard video codec and simple cropping operations.