360 degrees video is very popular due to its 360 degrees views of a scene. Although 360 degrees videos are also compressed by a hybrid coding framework like 2D video, its high resolution and serious shape deformation ...
详细信息
360 degrees video is very popular due to its 360 degrees views of a scene. Although 360 degrees videos are also compressed by a hybrid coding framework like 2D video, its high resolution and serious shape deformation affect coding efficiency. In equirectangular projection (ERP) format of 360 degrees videos, if an object moves from equator regions to pole regions or vice versa, large deformation will be introduced and motion estimation cannot find the best-matched part. To solve the above problem, the authors propose to generate a better reference frame for the current to be encoded frame. First, they project the frame prior to the current one from ERP to the sphere and rotate it at an appropriate angle depending on motion vectors. Subsequently, they insert this generated frame to the rear of the reference queue and let the encoder work as usual. The advantage is that the inserted frame has a more similar shape deformation as the current frame, which greatly helps motion estimation and makes full use of 360 degrees video characters. Their method is simple and friendly compatible with the existing compression standard. Experiments prove that their method achieves 1.57% Bjontegaard Delta (BD)-gain compared with standard high efficiency video coding.
Recently deep learning has been introduced to the field of image compression. In this paper, we present a hybrid coding framework that combines entropy coding, deep learning, and traditional codingframework. In the b...
详细信息
Recently deep learning has been introduced to the field of image compression. In this paper, we present a hybrid coding framework that combines entropy coding, deep learning, and traditional codingframework. In the base layer of the encoding, we use convolutional neural networks to lear n the latent representation and importance map of the original image respectively. The importance map is then used to guide the bit allocation of the latent representation . A context model is also developed to help the entropy coding after the masked quantization. Another network is used to get a coarse reconstruction of the image in the base layer. The residual between the input and the coarse reconstruction is then obtained and encoded by the traditional BPG codec as the enhancement layer of the bit stream. We only need to train a basic model and the proposed scheme can realize image compression at different bit rates, thanks to the use of the traditional codec. Experimental results using the Kodak, Urban100 and BSD100 datasets show that the proposed scheme outperforms many deep learning-based methods and traditional codecs including BPG in MS-SSIM metric across a wide range of bit rates. It also exceeds some latest hybrid schemes in RGB444 domain on Kodak dataset in both PSN R and MS-SSIM metrics.
暂无评论