The combination of Hamiltonian formalism and neural networks plays an important role in dealing with chaotic systems. To address motion control under unknown physical quantities and incomplete observation sets, a trajectory prediction model based on a conditional Hamiltonian generating network (CHGN) for incomplete observation image sequences is proposed. CHGN is composed of a conditional variational autoencoder (CVAE), a Hamiltonian neural network (HNN), and a Velocity-Verlet integrator. The CVAE encoder converts a short continuous sequence of observation images into target motion-state features represented by generalized coordinates and generalized momenta, and generates the trajectory prediction image at a specified time. The HNN learns the latent Hamiltonian, capturing more of the chaotic system's dynamics to realize state cognition. The Velocity-Verlet integrator predicts the motion state at any moment from the Hamiltonian learned by the HNN at the current moment. The motion state and the specified time are then fed to the CVAE decoder, which generates the target prediction image from the latent motion space. Experimental results show that CHGN can accurately predict target trajectories over a long horizon from incomplete short-term image sequences, and achieves the lowest mean squared error (MSE) on three physical-system datasets compared with existing deep learning methods.
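The Velocity-Verlet step used for rollout can be sketched on a toy separable Hamiltonian. Here an analytic H(q, p) = p²/2 + q²/2 (harmonic oscillator) stands in for the learned HNN, and the gradient dH/dq would come from autodiff in the real model; all names are illustrative:

```python
def hamiltonian(q, p):
    # Toy separable Hamiltonian H(q, p) = p^2/2 + q^2/2; in CHGN this
    # role is played by the learned HNN.
    return 0.5 * p * p + 0.5 * q * q

def dH_dq(q):
    return q  # analytic gradient of the toy H; an HNN supplies this via autodiff

def velocity_verlet_step(q, p, dt):
    # Half-step momentum, full-step position, half-step momentum.
    p_half = p - 0.5 * dt * dH_dq(q)
    q_new = q + dt * p_half              # dH/dp = p for unit mass
    p_new = p_half - 0.5 * dt * dH_dq(q_new)
    return q_new, p_new

q, p = 1.0, 0.0
e0 = hamiltonian(q, p)
for _ in range(1000):
    q, p = velocity_verlet_step(q, p, dt=0.01)
print(abs(hamiltonian(q, p) - e0) < 1e-4)  # symplectic: energy nearly conserved
```

Because the integrator is symplectic, the energy error stays bounded over long rollouts instead of drifting, which is what makes long-horizon trajectory prediction from a learned Hamiltonian stable.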
ISBN:
(Print) 9781665432870
Recently, deep learning has demonstrated impressive performance in image compression, and methods that match and even outperform conventional codecs continue to emerge. However, most of them must train and deploy separate networks for rate adaptation, which is impractical and costly in memory and power consumption, especially over broad bitrate ranges. Furthermore, methods that consider the semantically important structure of the image are extremely sparse, leading to non-optimized bit allocation for the eye-catching foreground details that must be preserved for almost all computer vision applications. To this end, we establish an end-to-end multi-rate deep semantic image compression scheme with a quantized conditional autoencoder. It includes two neural networks, for semantic analysis and image compression, respectively. The semantic analysis network extracts the essential semantic regions of the input image and calculates the Semantic-Important Structural SIMilarity (SI-SSIM) index for each of them. The compression network is then trained to optimize a multi-term loss based on SI-SSIM and conditioned on the activation bitwidths. The performance of our model is evaluated on the JPEG AI dataset with objective and perceptual quality metrics. The results show that our method outperforms the JPEG, JPEG 2000, and HEVC intra baselines and is competitive with VVC intra.
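The idea of a semantically weighted rate-distortion objective can be illustrated with a minimal sketch. The SI-SSIM formula is not reproduced here; a simple mask-weighted squared error stands in for it, and the function names and the foreground weight are assumptions, not the paper's definitions:

```python
import numpy as np

def semantic_weighted_distortion(orig, recon, mask, fg_weight=4.0):
    # Hypothetical sketch: up-weight squared error inside salient regions
    # (mask == 1) so foreground detail dominates the distortion term --
    # the spirit of an SI-SSIM-style loss, not its actual formula.
    w = 1.0 + (fg_weight - 1.0) * mask
    return float(np.mean(w * (orig - recon) ** 2))

def rate_distortion_loss(bits_per_pixel, orig, recon, mask, lam=100.0):
    # Standard R + lambda * D trade-off using the semantic-weighted distortion.
    return bits_per_pixel + lam * semantic_weighted_distortion(orig, recon, mask)

orig = np.zeros((4, 4))
recon = np.full((4, 4), 0.1)          # uniform reconstruction error
fg = semantic_weighted_distortion(orig, recon, np.ones((4, 4)))
bg = semantic_weighted_distortion(orig, recon, np.zeros((4, 4)))
print(fg > bg)  # True: the same error costs more inside semantic regions
```

Training the compression network against such a loss steers the bit allocation toward the masked foreground, which is the effect the abstract describes.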
ISBN:
(Print) 9781665492577
The rise of variational autoencoders for image and video compression has opened the door to many elaborate coding techniques. One example is conditional interframe coding: instead of transmitting the residual between the original frame and the predicted frame (often obtained by motion compensation), the current frame is transmitted under the condition of knowing the prediction signal. In practice, conditional coding can be implemented straightforwardly with a conditional autoencoder, which has also shown good results in recent works. In this paper, we provide an information-theoretical analysis of conditional coding for inter frames and show in which cases gains over traditional residual coding can be expected. We also examine information bottlenecks that can occur in the prediction-signal path of practical video coders, due to the network structure, the data-processing theorem, or quantization. We demonstrate that conditional coding has theoretical benefits over residual coding, but that these benefits can be quickly canceled by even small information bottlenecks in the prediction signal.
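The core information-theoretic claim, H(X | Y) ≤ H(X − Y), holds because X − Y is a function of (X, Y), so H(X | Y) = H(X − Y | Y) ≤ H(X − Y). It can be checked numerically on a toy discrete source; the joint distribution below is illustrative, not taken from the paper:

```python
from math import log2

# Toy joint distribution p(x, y): y is the prediction, x the current sample.
p = {(0, 0): 0.45, (1, 0): 0.05, (0, 1): 0.20, (1, 1): 0.30}

def H(dist):
    # Shannon entropy in bits of a probability table.
    return -sum(q * log2(q) for q in dist.values() if q > 0)

# Residual coding cost: entropy of the difference D = X - Y.
pd = {}
for (x, y), q in p.items():
    pd[x - y] = pd.get(x - y, 0.0) + q

# Conditional coding cost: H(X | Y) = H(X, Y) - H(Y).
py = {}
for (x, y), q in p.items():
    py[y] = py.get(y, 0.0) + q
h_cond = H(p) - H(py)

print(f"H(X - Y) = {H(pd):.3f} bits")   # residual coding
print(f"H(X | Y) = {h_cond:.3f} bits")  # conditional coding, never larger
```

For this source the residual needs about 0.99 bits while conditional coding needs about 0.72, showing where the theoretical gain comes from; a bottleneck on Y shrinks exactly this gap.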
ISBN:
(Digital) 9781665496209
ISBN:
(Print) 9781665496209
This paper introduces AIVC, an end-to-end neural video codec. It is based on two conditional autoencoders, MNet and CNet, for motion compensation and coding. AIVC learns to compress videos under any coding configuration through a single end-to-end rate-distortion optimization. Furthermore, it offers performance competitive with the recent video coder HEVC under several established test conditions. A comprehensive ablation study evaluates the benefits of the different modules composing AIVC. The implementation is made available at https://***/AIVC/.
ISBN:
(Print) 9781665492577
Recently, image compression codecs based on neural networks (NN) have outperformed state-of-the-art classic ones such as BPG, an image format based on HEVC intra. However, a typical NN codec has high complexity and limited options for parallel data processing. In this work, we propose a conditional separation principle that aims to improve parallelization and lower the computational requirements of an NN codec, and present a Conditional Color Separation (CCS) codec that follows this principle. The color components of an image are split into primary and non-primary ones, and each component is processed separately by jointly trained networks. Our approach allows parallel processing of each component, flexibility in selecting channel numbers, and an overall complexity reduction. The CCS codec uses over 40% less memory, encodes 2x faster, and decodes 22% faster, with only a 4% BD-rate loss in RGB PSNR compared to our baseline model over BPG.
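The separation principle can be sketched as splitting the color components and running a sub-codec on each in parallel. The choice of green as the primary component and the toy quantizer are assumptions for illustration, not the paper's trained networks:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def sub_codec(channels):
    # Stand-in for a learned per-component encoder; the real CCS codec uses
    # jointly trained networks here. This placeholder is a toy 4-bit quantizer.
    return np.round(channels * 15) / 15

def ccs_encode(rgb):
    # Hypothetical split: one primary component (green, an assumption here)
    # and the two non-primary ones, processed by separate workers in parallel.
    parts = [rgb[..., 1:2], rgb[..., [0, 2]]]
    with ThreadPoolExecutor(max_workers=2) as pool:
        primary, non_primary = pool.map(sub_codec, parts)
    recon = np.empty_like(rgb)
    recon[..., 1:2] = primary
    recon[..., [0, 2]] = non_primary
    return recon

rgb = np.random.default_rng(0).random((8, 8, 3))
recon = ccs_encode(rgb)
print(recon.shape)  # (8, 8, 3)
```

Because each component has its own network, the two branches can run concurrently and be sized with different channel counts, which is the source of the memory and speed savings the abstract reports.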