ISBN:
(Print) 9798331529543; 9798331529550
Recent advancements in learned image compression methods have demonstrated superior rate-distortion performance and remarkable potential compared to traditional compression techniques. However, the core operation of quantization, inherent to lossy image compression, introduces errors that can degrade the quality of the reconstructed image. To address this challenge, we propose a novel Quantization Error Compensator (QEC), which leverages spatial context within latent representations and hyperprior information to effectively mitigate the impact of quantization error. Moreover, we propose a tailored quantization error optimization training strategy to further improve rate-distortion performance. Notably, QEC serves as a lightweight, plug-and-play module, offering high flexibility and seamless integration into various learned image compression methods. Extensive experimental results consistently demonstrate significant coding efficiency improvements achievable by incorporating the proposed QEC into state-of-the-art methods, with a slight increase in runtime.
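The effect the QEC targets can be illustrated with a toy statistical sketch: here latents are modeled as Laplacian (a common assumption in learned codecs, not taken from the paper) and the "compensator" is just a per-bin mean-residual lookup learned from held-out data, rather than the paper's neural module conditioned on spatial context and the hyperprior. It only shows why compensating the rounding residual reduces MSE.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical Laplacian-distributed latents (illustrative assumption).
y_train = rng.laplace(scale=1.5, size=50_000)
y_test = rng.laplace(scale=1.5, size=50_000)

# "Train" the compensator: for each quantization bin, record the mean
# rounding residual y - round(y) observed on training data.
q_train = np.round(y_train)
offset = {k: float(np.mean(y_train[q_train == k] - k)) for k in np.unique(q_train)}

# At "test" time, add the learned per-bin offset back after rounding.
q_test = np.round(y_test)
compensated = q_test + np.array([offset.get(k, 0.0) for k in q_test])

mse_raw = float(np.mean((y_test - q_test) ** 2))
mse_comp = float(np.mean((y_test - compensated) ** 2))
assert mse_comp < mse_raw  # compensation reduces quantization MSE
```

Because the Laplacian density decays within each nonzero bin, the conditional mean of the residual is nonzero there, which is exactly the bias a compensator can remove.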
ISBN:
(Print) 9781728185514
Learning-based image deraining methods have achieved remarkable success in the past few decades. Currently, most deraining architectures are developed by human experts, which is a laborious and error-prone process. In this paper, we present a study on employing neural architecture search (NAS) to automatically design deraining architectures, dubbed AutoDerain. Specifically, we first propose a U-shaped deraining architecture, which mainly consists of residual squeeze-and-excitation blocks (RSEBs). Then, we define a search space, where we search for the convolutional types and the use of the squeeze-and-excitation block. Considering that differentiable architecture search is memory-intensive, we propose a memory-efficient differentiable architecture search scheme (MDARTS). In light of the success of training binary neural networks, MDARTS optimizes architecture parameters through the proximal gradient, which consumes only the same GPU memory as training a single deraining model. Experimental results demonstrate that the architecture designed by MDARTS is superior to manually designed derainers.
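The proximal-gradient idea can be sketched in a few lines: continuous architecture parameters take a gradient step, a proximal projection keeps them in the box [0, 1], and a hard binarization selects one candidate operation, so only one op's activations need to live in memory. The quadratic surrogate loss and its "ideal" mixing weights below are illustrative assumptions, not MDARTS's actual search objective.

```python
import numpy as np

# Hypothetical "ideal" mixing weights over 4 candidate ops (illustrative).
target = np.array([0.1, 0.2, 0.9, 0.3])
alpha = np.full(4, 0.25)   # continuous architecture parameters

for _ in range(100):
    grad = alpha - target                           # grad of 0.5*||alpha - target||^2
    alpha = np.clip(alpha - 0.1 * grad, 0.0, 1.0)   # gradient step + box projection

# Proximal-style hard binarization: keep a single active operation.
binary = np.zeros_like(alpha)
binary[np.argmax(alpha)] = 1.0

assert np.argmax(binary) == 2   # the best-scoring candidate op is selected
```

The point of the binarization is memory: with a one-hot selection, the supernet never has to hold every candidate op's feature maps at once.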
ISBN:
(Print) 9781728185514
In stereo image super-resolution (SR), it is equally important to utilize intra-view and cross-view information. However, most existing methods only focus on the exploration of cross-view information and neglect the full mining of intra-view information, which limits their reconstruction performance. Since single image SR (SISR) methods are powerful in intra-view information exploitation, we propose to introduce a knowledge distillation strategy to transfer the knowledge of an SISR network (teacher network) to a stereo image SR network (student network). With the help of the teacher network, the student network can more easily learn intra-view information. Specifically, we propose pixel-wise distillation as the implementation method, which not only improves the intra-view information extraction ability of the student network, but also ensures the effective learning of cross-view information. Moreover, we propose a lightweight student network named Adaptive Residual Feature Aggregation network (ARFAnet). Its main unit, the ARFA module, can aggregate informative residual features and produce more representative features for image reconstruction. Experimental results demonstrate that our teacher-student network achieves state-of-the-art performance on all benchmark datasets.
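Pixel-wise distillation in its simplest form is a dense match between teacher and student feature maps. The sketch below uses a plain MSE match on same-shaped feature tensors; the paper's exact layer choice and loss weighting are not reproduced here.

```python
import numpy as np

def pixelwise_distill_loss(f_student, f_teacher):
    """Mean squared error between student and teacher feature maps,
    computed at every spatial position (pixel-wise distillation)."""
    assert f_student.shape == f_teacher.shape
    return float(np.mean((f_student - f_teacher) ** 2))

rng = np.random.default_rng(0)
f_t = rng.normal(size=(1, 8, 16, 16))            # teacher (SISR) features
f_s = f_t + 0.1 * rng.normal(size=f_t.shape)     # student close to teacher
f_bad = rng.normal(size=f_t.shape)               # unrelated features

# A student that mimics the teacher incurs a much smaller distillation loss.
assert pixelwise_distill_loss(f_s, f_t) < pixelwise_distill_loss(f_bad, f_t)
```

In training, this term would be added to the usual reconstruction loss so the student absorbs intra-view knowledge while its own SR objective handles cross-view learning.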
ISBN:
(Print) 0819444111
Uniform scalar quantizers are widely used in image coding. They are known to be optimal entropy-constrained scalar quantizers under the high-resolution assumption. In this paper, we focus on the design of nearly uniform scalar quantizers for high-performance coding of wavelet coefficients at any bitrate. Some codecs, such as JPEG2000, use uniform scalar quantizers whose zero quantization bin (deadzone) is twice the size of the other quantization bins. We address the problem of deadzone size optimization using rate-distortion considerations. The advantage of the proposed method is that the quantizer design is adapted to both the source statistics and the compression ratio. Our method is based on statistics of the wavelet coefficient distribution. It provides experimental gains of up to 0.19 dB.
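A deadzone uniform scalar quantizer can be written directly: every bin has width `step` except the zero bin, whose width is `dz * step`. Setting `dz = 2` reproduces the JPEG2000-style deadzone of twice the nominal bin size mentioned above; the paper's method would instead tune the deadzone size from the coefficient statistics and target rate (the reconstruction point `delta` is the usual midpoint default, an assumption here).

```python
import numpy as np

def deadzone_quantize(x, step, dz=2.0):
    """Quantize with a zero bin of width dz*step and unit bins of width step."""
    half = dz * step / 2.0
    mag = np.abs(x)
    q = np.where(mag < half, 0.0, np.floor((mag - half) / step) + 1.0)
    return np.sign(x) * q

def deadzone_dequantize(q, step, dz=2.0, delta=0.5):
    """Reconstruct delta of the way into each nonzero bin (midpoint default)."""
    half = dz * step / 2.0
    return np.sign(q) * (half + (np.abs(q) - 1.0 + delta) * step)

x = np.array([0.9, 1.7, -2.4, 0.0])
q = deadzone_quantize(x, step=1.0)
r = deadzone_dequantize(q, step=1.0)
assert np.allclose(q, [0.0, 1.0, -2.0, 0.0])
assert np.allclose(r, [0.0, 1.5, -2.5, 0.0])
```

With `dz = 2` and `step = 1`, the value 0.9 falls inside the deadzone and maps to zero, while 1.7 lands in the bin [1, 2) and reconstructs at its midpoint 1.5.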
ISBN:
(Print) 9781728180687
In this paper, we propose an optimized model based on the visual attention mechanism (VAM) for no-reference stereoscopic image quality assessment (SIQA). A CNN model is designed around a dual attention mechanism (DAM), which includes a channel attention mechanism and a spatial attention mechanism. The channel attention mechanism assigns high weights to features that contribute strongly to the final quality and low weights to features that contribute little. The spatial attention mechanism considers the interior of a feature map, assigning different weights to different areas according to their importance within the feature. In addition, a data selection strategy is designed for the CNN model. Following the VAM, visual saliency guides data selection, and a certain proportion of salient patches is used to fine-tune the network. The same operation is performed on the test set, which removes data redundancy and improves algorithm performance. Experimental results on two public databases show that the proposed model is superior to state-of-the-art SIQA methods. Cross-database validation shows the high generalization ability and effectiveness of our model.
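The two attention branches can be sketched in a CBAM-like form: channel attention weights each channel from a pooled summary, and spatial attention weights each position from a channel-wise summary. This sketch omits the learned MLP/conv layers a real DAM would contain; the pooling-plus-sigmoid gating is an illustrative stand-in, not the paper's exact design.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(feat):
    """One weight per channel from its global average, squashed to (0, 1)."""
    w = sigmoid(feat.mean(axis=(1, 2)))      # shape (C,)
    return feat * w[:, None, None]

def spatial_attention(feat):
    """One weight per spatial position from the channel-wise mean."""
    w = sigmoid(feat.mean(axis=0))           # shape (H, W)
    return feat * w[None, :, :]

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4, 4))               # features as (C, H, W)
out = spatial_attention(channel_attention(x))

assert out.shape == x.shape
# Gating weights lie in (0, 1), so attention can only attenuate magnitudes here.
assert np.all(np.abs(out) <= np.abs(x))
```

Chaining the two branches is the usual DAM/CBAM ordering: reweight channels first, then reweight spatial positions within the reweighted map.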
ISBN:
(Print) 9781665475921
This paper applies Time-Frequency Analysis (TFA) techniques from signal processing to computer vision tasks. Our main idea is as follows: build a simple network architecture without stacking two or more convolutional neural networks (CNNs), analyze hidden features with the Discrete Wavelet Transform (DWT), and feed them into filters as weights via convolutions, transformers, or other methods. This does not require a network with two or more stages; instead, we apply TFA directly within the CNN to build a one-stage network. Networks built in this way retain strong performance while requiring fewer computing resources. In this paper, we mainly apply the DWT within a CNN to image inpainting, and the results show that our model works stably in the frequency domain to achieve free-form image inpainting.
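The DWT step feeding such a network can be illustrated with a one-level 2-D Haar transform, written from scratch here for self-containment; a real model would typically use a library such as PyWavelets or a fixed-weight convolution layer instead.

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar DWT: returns (LL, LH, HL, HH) subbands."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # row-pair averages
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # row-pair differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0      # low-low (approximation)
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0      # horizontal detail
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0      # vertical detail
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0      # diagonal detail
    return ll, lh, hl, hh

img = np.arange(16, dtype=float).reshape(4, 4)
ll, lh, hl, hh = haar_dwt2(img)
assert ll.shape == (2, 2)                     # each subband is half-resolution

# A constant image carries all of its energy in the low-pass band.
c = np.full((4, 4), 5.0)
_, lh2, hl2, hh2 = haar_dwt2(c)
assert np.allclose(lh2, 0) and np.allclose(hl2, 0) and np.allclose(hh2, 0)
```

The detail subbands isolate edge-like structure, which is what makes them useful as frequency-domain guidance for inpainting masks and filter weights.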
ISBN:
(Print) 9781728185514
With the emergence of various machine-to-machine and machine-to-human deep learning tasks, the amount of deep feature data is increasing. Deep product quantization is widely applied in deep feature retrieval tasks and has achieved good accuracy. However, it does not primarily target compression, and its output is a fixed-length quantization index, which is not suitable for subsequent compression. In this paper, we propose an entropy-based deep product quantization algorithm for deep feature compression. First, it introduces entropy into the hard and soft quantization strategies, adapting to the codebook optimization and codeword determination operations in the training and testing processes, respectively. Second, entropy-related loss functions are designed to adjust the distribution of the quantization indices so that it accommodates the subsequent entropy coding module. Experimental results on retrieval tasks show that the proposed method can generally be combined with deep product quantization and its extended schemes, and achieves better compression performance under near-lossless conditions.
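The entropy-aware soft quantization idea can be sketched as follows: features get differentiable soft assignments to codewords, and the entropy of the average index distribution plays the role of the rate term an entropy loss would penalize. The codebook, temperature, and data below are illustrative assumptions, not the paper's training setup.

```python
import numpy as np

def soft_assign(x, codebook, tau=0.25):
    """Softmax over negative squared distances -> differentiable assignments."""
    d2 = (x[:, None] - codebook[None, :]) ** 2
    logits = -d2 / tau
    w = np.exp(logits - logits.max(axis=1, keepdims=True))
    return w / w.sum(axis=1, keepdims=True)

def index_entropy(assign):
    """Entropy (bits) of the mean assignment distribution; a skewed index
    histogram costs fewer bits under a subsequent entropy coder."""
    p = assign.mean(axis=0)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
codebook = np.array([-1.0, 0.0, 1.0])
spread = rng.uniform(-1, 1, size=1000)      # features spread over codewords
peaked = rng.normal(0.0, 0.05, size=1000)   # features clustered on one codeword

# Clustered features yield a lower-entropy (cheaper-to-code) index stream.
assert index_entropy(soft_assign(peaked, codebook)) < index_entropy(soft_assign(spread, codebook))
```

Minimizing such an entropy term during codebook training is what lets the fixed-length quantization indices become compressible by the entropy coding module downstream.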
ISBN:
(Print) 9798331529543; 9798331529550
Depth estimation of light field images is a crucial technique in various applications, including 3D reconstruction, autonomous driving, and object tracking. However, current deep-learning methods ignore the geometric information of the light field image and are limited in learning repetitive textures, which leads to inaccurate depth estimates. This paper proposes a light field depth estimation network that fuses multi-scale semantic information with geometric information to address poor adaptation to repeated texture regions. The core of the network is the semantic and geometric information fusion (SGI) module, which can adaptively combine semantic and geometric information to improve the efficiency of cost aggregation. Furthermore, the SGI module establishes a direct link between feature extraction and cost aggregation, providing feedback that guides more efficient feature extraction. Experimental results on the light field synthesis dataset HCI 4D demonstrate that the method achieves high accuracy and generalisation performance.
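One generic way to "adaptively combine" two information streams is gated fusion: a learned gate decides, per position, how much semantic versus geometric evidence to trust. The SGI module's actual design is not specified in the abstract; the gate parameters below are illustrative placeholders.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_fusion(sem, geo, w_sem, w_geo, b):
    """Per-position gate g in (0, 1) mixes the two feature maps."""
    g = sigmoid(w_sem * sem + w_geo * geo + b)
    return g * sem + (1.0 - g) * geo

sem = np.array([[0.2, 0.8], [0.5, 0.1]])   # toy semantic features
geo = np.array([[0.9, 0.1], [0.4, 0.6]])   # toy geometric features
out = gated_fusion(sem, geo, w_sem=1.0, w_geo=1.0, b=0.0)

assert out.shape == sem.shape
# A convex combination: each fused value lies between the two inputs.
assert np.all(out >= np.minimum(sem, geo)) and np.all(out <= np.maximum(sem, geo))
```

Because the gate is differentiable, gradients from the cost-aggregation stage flow back through it into feature extraction, which is the kind of direct feedback link the abstract describes.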
ISBN:
(Print) 9798331529543; 9798331529550
Recent advancements in neural compression have surpassed traditional codecs in PSNR and MS-SSIM measurements. However, at low bit-rates, these methods can introduce visually displeasing artifacts, such as blurring, color shifting, and texture loss, thereby compromising the perceptual quality of images. To address these issues, this study presents an enhanced neural compression method designed for optimal visual fidelity. We train our model with a sophisticated semantic ensemble loss, integrating Charbonnier loss, perceptual loss, style loss, and a non-binary adversarial loss, to enhance the perceptual quality of image reconstructions. Additionally, we implement a latent refinement process to generate content-aware latent codes. These codes adhere to bit-rate constraints and prioritize bit allocation to regions of greater importance. Our empirical findings demonstrate that this approach significantly improves the statistical fidelity of neural image compression.
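Of the terms in the ensemble loss, the Charbonnier loss is simple enough to state exactly: it is a smooth, robust variant of L1 that avoids the non-differentiable kink at zero. The epsilon value below is a common default, assumed here rather than taken from the paper.

```python
import numpy as np

def charbonnier(pred, target, eps=1e-3):
    """Charbonnier loss: mean of sqrt((pred - target)^2 + eps^2)."""
    return float(np.mean(np.sqrt((pred - target) ** 2 + eps ** 2)))

x = np.zeros((4, 4))
y = np.full((4, 4), 0.5)

# Zero error still yields the eps floor, keeping gradients well-behaved.
assert abs(charbonnier(x, x) - 1e-3) < 1e-9
# For a uniform error of 0.5, the loss is sqrt(0.25 + eps^2) ~ 0.5.
assert abs(charbonnier(x, y) - np.sqrt(0.25 + 1e-6)) < 1e-9
```

In the full training objective, this pixel term would be weighted against the perceptual, style, and adversarial terms; those weights are tuning choices the abstract does not specify.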
ISBN:
(Print) 9781728185514
Synthetic DNA has recently received much attention as an alternative long-term archival medium due to its high density and durability. However, most current work has primarily focused on using DNA as a precise storage medium. In this work, we take an alternate view of DNA. Using neural-network-based compression techniques, we transform images into a latent-space representation, which we then store on DNA. By doing so, we turn DNA into an approximate image storage medium, as images generated back from DNA are only approximate representations of the originals. Using several datasets, we investigate the storage benefits of approximation and study the impact of DNA storage errors (substitutions, indels, bias) on the quality of approximation. In doing so, we demonstrate the feasibility and potential of viewing DNA as an approximate storage medium.
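The simplest of the error types studied, substitutions, can be simulated with a toy channel over the ACGT alphabet. The error rate and sequence here are illustrative; a real study would also model indels and synthesis/sequencing bias.

```python
import random

def substitute(seq, rate, rng):
    """Replace each base with a different random base at probability `rate`."""
    bases = "ACGT"
    out = []
    for b in seq:
        if rng.random() < rate:
            out.append(rng.choice([c for c in bases if c != b]))
        else:
            out.append(b)
    return "".join(out)

rng = random.Random(0)
seq = "ACGT" * 50                  # 200-nucleotide toy strand
noisy = substitute(seq, 0.2, rng)

assert len(noisy) == len(seq)      # substitutions preserve sequence length
assert 0 < sum(a != b for a, b in zip(seq, noisy)) < len(seq)
```

Running the compressed latent codes through such a channel before decoding is how one would measure the graceful degradation that makes DNA viable as an approximate, rather than exact, image store.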