Based on the measured latitude and longitude, users can freely view different perspectives of the omnidirectional image. Typically, omnidirectional images are represented in the equirectangular projection (ERP) format. Although ERP images suffer from distortion and redundancy due to oversampling, which makes traditional codecs inefficient, they maintain visual consistency and are compatible with deep learning-based image processing tools. This has led to the emergence of end-to-end omnidirectional image compression methods based on the ERP format. However, transform coding, a key component in learned planar image compression, has not yet been fully explored in learned omnidirectional image compression. In this paper, we propose a transform coding method with adaptive latitude-aware and importance-activated features for omnidirectional image compression. Specifically, the adaptive latitude-aware mechanism comprises two modules. The first module, termed the Adaptive Latitude-Aware Module (ALAM), employs rectangular dilated convolutional kernels of multiple sizes to perceive distortion redundancy across different latitudes, followed by latitude-adaptive weighting to select the optimal features for each latitude. The second module, named the Multi-scale Convolutional Gated Feedforward Network (MCGFN), fully exploits local contextual information while suppressing the feature redundancy induced by the diverse dilated convolutions in the first module. To further reduce ERP redundancy, we design an importance-activated spatial feature transform module that regulates latent representations to allocate more bits to significant regions. Experimental results demonstrate that the proposed method outperforms the VVC standard and existing learning-based omnidirectional image compression approaches at medium-to-high bitrates while maintaining low computational complexity.
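To make the ERP oversampling mentioned above concrete, the following minimal Python sketch computes, for each image row, the latitude of the row center, the relative sphere area that one pixel covers (cos of the latitude), and the resulting horizontal stretch (1 / cos of the latitude). This is only an illustration of the standard ERP geometry, not the proposed ALAM or any part of the authors' implementation; the function names are hypothetical, and the sketch assumes rows are sampled at pixel centers with uniform latitude spacing.

```python
import math


def erp_row_latitudes(height):
    """Latitude (radians) of each ERP row center, from near +pi/2 (top) to near -pi/2 (bottom)."""
    # Assumption: row i is sampled at its center, (i + 0.5) / height of the way down.
    return [(0.5 - (i + 0.5) / height) * math.pi for i in range(height)]


def row_area_weights(height):
    """cos(latitude) per row: relative sphere area covered by one ERP pixel in that row."""
    return [math.cos(lat) for lat in erp_row_latitudes(height)]


def horizontal_oversampling(height):
    """1 / cos(latitude) per row: horizontal stretch relative to a row on the equator."""
    # Row centers never sit exactly on a pole, so cos(latitude) is strictly positive.
    return [1.0 / w for w in row_area_weights(height)]


if __name__ == "__main__":
    H = 512  # hypothetical ERP image height
    weights = row_area_weights(H)
    stretch = horizontal_oversampling(H)
    print(f"stretch near the equator: {stretch[H // 2]:.3f}")
    print(f"stretch at the top row:   {stretch[0]:.1f}")
    print(f"mean area weight:         {sum(weights) / H:.4f}")
```

The mean area weight approaches 2/π (about 0.64) as the height grows, meaning that roughly a third of ERP samples are redundant relative to uniform sampling on the sphere. The redundancy is concentrated near the poles, which is the kind of latitude dependence that a latitude-aware transform is meant to account for.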