ISBN: 9780769544298 (print)
This paper analyzes various motion vector predictors and proposes a method for selecting among them in 3D video coding. In a ubiquitous multimedia system, video compression is an important element because the available bandwidth is very limited. In the proposed method, several motion vector predictors compete with each other under a slightly modified rate-distortion criterion. Spatial, temporal, and inter-view predictors are considered, to reduce spatial, temporal, and inter-view redundancies, respectively. The proposed method increases motion vector coding efficiency by selecting the best predictor among them. Accordingly, the overall bitrate is reduced by 5.2% on average, and by up to 6.1%, compared with the reference software JMVC 6.0 in terms of the Bjontegaard metric.
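The predictor competition described in this abstract can be illustrated with a minimal sketch. It is not the authors' criterion: the abstract does not specify how the rate-distortion criterion was modified, so the sketch assumes that distortion is identical for a fixed motion vector and compares rate only (signed Exp-Golomb bits for the motion vector difference plus an assumed fixed cost for signalling the chosen predictor). The names `mvd_bits`, `select_predictor`, and `index_bits` are hypothetical.

```python
from typing import Dict, Tuple

MV = Tuple[int, int]


def se_len(value: int) -> int:
    """Length in bits of a signed Exp-Golomb code word for `value`."""
    code_num = 2 * abs(value) - (1 if value > 0 else 0)
    return 2 * (code_num + 1).bit_length() - 1


def mvd_bits(mv: MV, pred: MV) -> int:
    """Bits needed to code the motion vector difference mv - pred."""
    return se_len(mv[0] - pred[0]) + se_len(mv[1] - pred[1])


def select_predictor(mv: MV, candidates: Dict[str, MV],
                     index_bits: int = 2) -> Tuple[str, int]:
    """Return the candidate with the lowest rate and that rate.

    Assumption: distortion does not depend on the predictor, so the
    competition reduces to comparing rate. `index_bits` is an assumed
    fixed cost for signalling which predictor was chosen.
    """
    best_name, best_cost = None, None
    for name, pred in candidates.items():
        cost = mvd_bits(mv, pred) + index_bits
        if best_cost is None or cost < best_cost:
            best_name, best_cost = name, cost
    return best_name, best_cost


if __name__ == "__main__":
    candidates = {"spatial": (0, 0), "temporal": (4, -3), "inter_view": (9, 2)}
    print(select_predictor((5, -3), candidates))
```

In this toy case the temporal candidate wins because it leaves the smallest motion vector difference to code.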
ISBN: 9781457713033 (print)
Multimedia research continually investigates new ways of improving the immersive experience of users. One current solution is to design systems that offer a high level of interactivity, such as multiview content navigation, where the point of view can be changed while watching a video sequence (e.g., free viewpoint television, gaming, etc.). The coding algorithm designed for the transmission of such media streams must be adapted to these novel decoder needs. However, video plus depth data transmission is usually performed by treating the information flows as two sequences encoded with MVC schemes. Although this achieves good compression performance, the coding approach is not appropriate for interactive applications, since decoding a frame often requires the prior transmission and decoding of several reference frames. Moreover, the techniques recently developed to improve interactivity are generally implemented at the decoder, which increases its computational complexity requirements. In this paper, we propose a novel coding scheme for video plus depth sequences that is adapted to user navigation; contrary to several common approaches, the additional complexity is added on the encoder side so that the decoder stays simple. We further propose to limit the additional bandwidth imposed by interactivity requirements by designing a rate allocation algorithm that builds on a model of user behavior. A first version of our novel coding architecture is evaluated in terms of rate-distortion performance, where it is shown to offer high interactivity at a reasonable bandwidth cost.
ISBN: 9783642253454 (print)
Multiview video coding is one of the key techniques for realizing a 3D video system. MPEG started a standardization activity on 3DVC (3D video coding) in 2007. 3DVC is based on multiview video coding. MPEG finalized the standard for multiview video coding (MVC) based on H.264/AVC in 2008. However, High Efficiency Video Coding (HEVC), a 2D video coding standard under development, outperforms MVC even though it does not employ inter-view prediction. Thus, we designed a new multiview video coding method based on HEVC. Inter-view prediction was added to HEVC, and some coding tools were refined to suit MVC. The encoded bitstreams are assembled into one bitstream, which is decoded into multiview video at the decoder. Experimental results confirm that the proposed HEVC-based MVC performs much better than H.264/AVC, MVC, and HEVC. It achieves about 59.95% bit savings compared with JMVC simulcast at the same quality.
Depth Image Based Rendering (DIBR) is a technique that generates virtual views for multiview video applications from 3D video represented in color plus depth format. The depth map is not viewed by end users; instead, it helps to generate different views as required by the application. Therefore, depth maps need to be compressed in a way that minimizes distortion in the rendered views. Doing so makes it possible to generate high quality virtual views from compressed depth maps. This paper presents two mode selection techniques based on genetic algorithms for encoding depth maps. In the proposed techniques, the encoding modes are chosen so that the distortion in the rendered view is minimized. Simulation results show that the proposed techniques improve the objective quality of the rendered virtual views by up to 2 dB over the Lagrangian optimization based mode selection technique, which considers distortion only in the depth map.
A technique is proposed to minimize distortion in synthesized virtual views when encoding depth maps that are used in Depth Image Based Rendering (DIBR) applications. Depth maps are not viewed by end users, but are used for virtual view generation. Therefore, it is important to compress depth maps in a way that minimizes distortion in the views rendered with them. Doing so makes it possible to generate high quality virtual views from compressed depth maps. First, an error model that approximates the rendering distortion caused by disparity changes is proposed. This error model is then used at the encoding mode selection stage of depth map coding. Experimental results show average bit rate savings of 19% to 76% compared with a mode selection method based on minimizing only the pixel errors of the depth map. Furthermore, encoding depth maps with the proposed technique improves the overall visual quality of rendered views.
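The idea of driving mode selection with a rendering-oriented error measure can be sketched as follows. This is only an illustration under stated assumptions, not the paper's error model: it assumes 8-bit depth values quantized linearly in inverse depth between `z_near` and `z_far`, rectified cameras with focal length `focal_px` and baseline `baseline`, and it uses the sum of squared disparity errors as a stand-in for rendering distortion inside a Lagrangian cost. All function and parameter names are hypothetical.

```python
from typing import Dict, List, Tuple

Camera = Tuple[float, float, float, float]  # focal_px, baseline, z_near, z_far


def disparity_error(depth_error: float, focal_px: float, baseline: float,
                    z_near: float, z_far: float) -> float:
    """Disparity change (in pixels) caused by an error in an 8-bit depth value."""
    return focal_px * baseline * (depth_error / 255.0) * (1.0 / z_near - 1.0 / z_far)


def rendering_cost(depth_errors: List[float], bits: int, lam: float,
                   camera: Camera) -> float:
    """Lagrangian cost J = D + lam * R with D measured in the disparity domain."""
    distortion = sum(disparity_error(e, *camera) ** 2 for e in depth_errors)
    return distortion + lam * bits


def select_mode(modes: Dict[str, Tuple[List[float], int]], lam: float,
                camera: Camera) -> str:
    """Pick the mode whose (depth errors, bits) pair gives the lowest cost."""
    return min(modes, key=lambda m: rendering_cost(modes[m][0], modes[m][1], lam, camera))


if __name__ == "__main__":
    camera = (1000.0, 0.05, 1.0, 100.0)
    modes = {"skip": ([4, 4, 4, 4], 1), "intra": ([1, 0, 1, 0], 40)}
    print(select_mode(modes, 0.01, camera), select_mode(modes, 1.0, camera))
```

With a small Lagrange multiplier the more accurate mode is preferred; with a large one the cheaper mode wins, which is the usual rate-distortion trade-off.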
ISBN: 9781424463794 (print)
The emergence of three-dimensional (3D) video applications based on Depth Image Based Rendering (DIBR) has brought new dimensions to the video transmission problem, because additional depth information must be transmitted to the receiver. Until the transmission problem of 3D video is adequately addressed, consumer applications based on 3D video will not gain much popularity. Exploiting the unique correlations that exist between color images and their corresponding depth images will lead to more error resilient video encoding schemes for 3D video. In this paper, we present an error resilient 3D video communication scheme that exploits the correlation of motion vectors in color and depth video streams. The presented method achieves gains of up to 0.8 dB for color sequences and up to 0.7 dB for depth sequences over error prone communication channels.
This paper presents the architecture of an acquisition, compression, and rendering system for 3D-TV and free-viewpoint video applications. We show that the proposed system yields two distinct advantages. First, it achieves efficient compression of 3D/multi-view video by extending a standard H.264 encoder such that near backward compatibility is retained. Second, it can efficiently compress both 3D-TV and free-viewpoint multi-view video datasets using a single system architecture.
ISBN: 9780819466037 (print)
A 3D video stream is typically obtained from a set of synchronized cameras that simultaneously capture the same scene (multiview video). This technology enables applications such as free-viewpoint video, which allows the viewer to select a preferred viewpoint, or 3D TV, where the depth of the scene can be perceived using a special display. Because the user-selected view does not always correspond to a camera position, it may be necessary to synthesize a virtual camera view. To synthesize such a virtual view, we have adopted a depth-image-based rendering technique that employs one depth map for each camera. Consequently, remote rendering of the 3D video requires a compression technique for texture and depth data. This paper presents a predictive-coding algorithm for the compression of depth images across multiple views. The presented algorithm provides (a) improved coding efficiency for depth images over block-based motion-compensation encoders (H.264), and (b) random access to different views for fast rendering. The proposed depth-prediction technique works by synthesizing/computing the depth of 3D points based on the reference depth image. The attractiveness of the depth-prediction algorithm is that predicting depth data avoids an independent transmission of depth for each view, while simplifying view interpolation by synthesizing depth images for arbitrary viewpoints. We present experimental results for several multiview depth sequences, which show a quality improvement of up to 1.8 dB compared with H.264 compression.
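Predicting the depth image of one view from a reference depth image can be sketched for the simplest possible geometry. The sketch below is an assumption-laden illustration, not the paper's algorithm: it assumes rectified, parallel cameras (so warping reduces to a horizontal shift by the disparity `focal_px * baseline / z`), a target camera positioned so that points shift left, metric depth values, and nearest-pixel rounding. The function name `warp_depth_row` and its parameters are hypothetical.

```python
from typing import List, Optional


def warp_depth_row(depth_row: List[float], focal_px: float,
                   baseline: float) -> List[Optional[float]]:
    """Forward-warp one scanline of metric depth into a neighbouring view.

    Pixels that receive no sample stay None (holes). When several source
    pixels land on the same target pixel, the nearest surface (smallest
    depth) wins, which acts as a one-row z-buffer.
    """
    width = len(depth_row)
    warped: List[Optional[float]] = [None] * width
    for x, z in enumerate(depth_row):
        if z <= 0:
            continue  # skip invalid depth samples
        disparity = focal_px * baseline / z
        xt = int(round(x - disparity))
        if 0 <= xt < width and (warped[xt] is None or z < warped[xt]):
            warped[xt] = z
    return warped


if __name__ == "__main__":
    print(warp_depth_row([10.0, 10.0, 5.0, 10.0], focal_px=10.0, baseline=1.0))
```

An encoder built on this idea would code only the residual between the warped prediction and the actual depth image of the target view, and would need a separate strategy for filling the holes.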
ISBN: 0819460958 (print)
This paper describes our experience making a short stereoscopic movie visualizing the development of structure in the universe during the 13.7 billion years from the Big Bang to the present day. Aimed at a general audience for the Royal Society's 2005 Summer Science Exhibition, the movie illustrates how the latest cosmological theories based on dark matter and dark energy are capable of producing structures as complex as spiral galaxies, and allows the viewer to directly compare observations from the real universe with theoretical results. 3D is an inherent feature of the cosmology data sets, and stereoscopic visualization provides a natural way to present the images to the viewer, in addition to allowing researchers to visualize these vast, complex data sets. The movie was presented using passive, linearly polarized projection onto a 2 m wide screen, but it was also required to play back on a Sharp RD3D display and in anaglyph projection at venues without dedicated stereoscopic display equipment. Additionally, lenticular prints were made from key images in the movie. We discuss the following technical challenges during the stereoscopic production process: 1) controlling the depth presentation, 2) editing the stereoscopic sequences, and 3) generating compressed movies in display-specific formats. We conclude that generating high quality stereoscopic movie content using desktop tools and equipment is feasible. This does require careful quality control and manual intervention, but we believe these overheads are worthwhile when presenting inherently 3D data, as the result is significantly increased impact and better understanding of complex 3D scenes.