ISBN: (print) 9781510679344; 9781510679351
With the advent of learned image compression, numerous models have been developed. These models use non-linear transforms that are learned during training: an image is transformed into a latent space, quantized, and entropy coded. At the decoder, the quantized latent is recovered and transformed back to image space through a synthesis transform. In this work, we present an analysis of the energy distribution across channels. In our prior work, we demonstrated the features captured by the analysis transform, which can provide insights into the bitrate distribution across channels. Building on that, we extend our findings with quantitative measurements. We consider various learned image codecs based on the variational autoencoder framework and compare them with the Karhunen-Loève transform (KLT) in terms of energy compaction. We also measure how close the learned transforms are to the KLT, in order to study the relationship between the design of classical codecs and that of learned codecs.
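To make the notion of energy compaction concrete, the following is a minimal sketch, not the measurement procedure of the paper. It assumes synthetic correlated vectors as a stand-in for image blocks or latent channels; the names `klt_basis` and `energy_compaction` are hypothetical. The KLT basis is taken as the eigenvectors of the sample covariance, and the cumulative energy curve after the KLT is compared with that of the untransformed data.

```python
import numpy as np

def klt_basis(samples):
    """Eigenvectors of the sample covariance, sorted by descending eigenvalue."""
    centered = samples - samples.mean(axis=0)
    cov = centered.T @ centered / (len(samples) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order], eigvals[order]

def energy_compaction(coeffs):
    """Cumulative energy fraction, channels sorted from most to least energetic."""
    energy = np.sort((coeffs ** 2).sum(axis=0))[::-1]
    return np.cumsum(energy) / energy.sum()

# Synthetic data (assumption): 8 independent sources with unequal variance,
# mixed by a random matrix so that the observed channels are correlated.
rng = np.random.default_rng(0)
scales = np.array([5.0, 3.0, 2.0, 1.0, 0.5, 0.3, 0.2, 0.1])
mixing = rng.normal(size=(8, 8))
samples = (rng.normal(size=(5000, 8)) * scales) @ mixing

centered = samples - samples.mean(axis=0)
basis, variances = klt_basis(samples)
klt_coeffs = centered @ basis  # decorrelated coefficients

curve_raw = energy_compaction(centered)
curve_klt = energy_compaction(klt_coeffs)
print("energy fraction in the top 2 channels, raw:", curve_raw[1])
print("energy fraction in the top 2 channels, KLT:", curve_klt[1])
```

For a learned codec, the same curve could be computed on the latent channels produced by the analysis transform and compared with `curve_klt`; how the paper normalizes or groups channels is not stated in the abstract.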
ISBN: (print) 9781728198354
Deep learning based image compression has gained a lot of momentum in recent times. To enable a method that is suitable for image compression and can subsequently be extended to video compression, we propose a novel deep learning model architecture in which the task of image compression is divided into two sub-tasks: learning structural information from the luminance channel and color from the chrominance channels. The model has two separate branches to process the luminance and chrominance components. The color difference metric CIEDE2000 is employed in the loss function to optimize the model for color fidelity. We demonstrate the benefits of our approach and compare its performance with that of other codecs. Additionally, we visualize and analyze the latent channel impulse responses.
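The split into a luminance input and a chrominance input can be sketched as follows. This is an illustration only: the abstract does not state which color space the model uses, so the full-range BT.601 YCbCr conversion here is an assumption, and the function names `rgb_to_ycbcr` and `ycbcr_to_rgb` are hypothetical. The two returned arrays would feed the two branches; the network itself and the CIEDE2000 loss term are not shown.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Split an RGB image (values in [0, 1]) into luma and zero-centered chroma.

    Assumes full-range BT.601 coefficients; the paper's color space may differ.
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = (b - y) / 1.772
    cr = (r - y) / 1.402
    return y, np.stack([cb, cr], axis=-1)

def ycbcr_to_rgb(y, chroma):
    """Inverse of rgb_to_ycbcr."""
    cb, cr = chroma[..., 0], chroma[..., 1]
    r = y + 1.402 * cr
    b = y + 1.772 * cb
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return np.stack([r, g, b], axis=-1)

rng = np.random.default_rng(0)
image = rng.random((4, 4, 3))          # hypothetical stand-in for an input image
luma, chroma = rgb_to_ycbcr(image)     # inputs for the luminance and chrominance branches
restored = ycbcr_to_rgb(luma, chroma)  # lossless round trip when nothing is compressed
```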
Deep-learned variational auto-encoders (VAEs) have shown remarkable capabilities for lossy image compression. These neural networks typically employ non-linear convolutional layers to find a compressible representation of the input image. Advanced techniques such as vector quantization, context-adaptive arithmetic coding, and variable-rate compression have been implemented in these auto-encoders. Notably, these networks rely on an end-to-end approach, which fundamentally differs from hybrid, block-based video coding systems. As a result, signal-dependent encoder optimizations have not yet been thoroughly investigated for VAEs, even though rate-distortion optimized encoding heavily determines the compression performance of state-of-the-art video codecs. Designing such optimizations for non-linear, multi-layered networks requires understanding the relationship between the quantization, the bit allocation of the features, and the distortion. Therefore, this paper examines the rate-distortion performance of a variable-rate VAE. In particular, it demonstrates that the trained encoder network typically finds features with a near-optimal bit allocation across the channels. Furthermore, the relationship between distortion and quantization is approximated by a higher-order polynomial whose coefficients can be robustly estimated. Based on these considerations, the authors investigate an encoding algorithm for the Lagrange optimization, which significantly improves the coding efficiency.
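The two ingredients named above, a polynomial model of distortion as a function of quantization and a Lagrangian choice among candidates, can be illustrated with a toy example. This is a sketch under stated assumptions, not the authors' algorithm: a single synthetic Laplacian-distributed latent channel, uniform scalar quantization, rate estimated by empirical entropy, a cubic polynomial (the paper's order is not given), and a hypothetical multiplier `lam`.

```python
import numpy as np

def quantize(x, step):
    """Uniform scalar quantization with the given step size."""
    return np.round(x / step) * step

def empirical_entropy_bits(symbols):
    """Zeroth-order entropy in bits per symbol, used here as a rate estimate."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Synthetic latent channel (assumption): Laplacian-distributed samples.
rng = np.random.default_rng(0)
latent = rng.laplace(scale=2.0, size=20000)

# Measure distortion (MSE) and rate for a grid of quantization step sizes.
steps = np.linspace(0.25, 4.0, 16)
dist = np.array([np.mean((latent - quantize(latent, s)) ** 2) for s in steps])
rate = np.array([empirical_entropy_bits(np.round(latent / s).astype(int)) for s in steps])

# Approximate distortion as a polynomial in the step size (degree 3 is an assumption).
coeffs = np.polyfit(steps, dist, deg=3)
dist_fit = np.polyval(coeffs, steps)

# Lagrangian selection: pick the step size that minimizes J = D + lam * R.
lam = 0.5  # hypothetical Lagrange multiplier
cost = dist_fit + lam * rate
best_step = steps[np.argmin(cost)]
print("selected step size:", best_step)
```

Larger values of `lam` penalize rate more strongly and push the selection toward coarser quantization. An encoder-side optimization for a real VAE would evaluate distortion after the synthesis transform rather than directly in the latent domain.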