ISBN: 9781728185514 (print)
Neural compression has benefited from technological advances such as convolutional neural networks (CNNs) to achieve advanced bitrates, especially in image compression. In neural image compression, an encoder and a decoder can run in parallel on a GPU, so the speed is relatively fast. However, the conventional entropy coding for neural image compression requires serialized iterations in which the probability distribution is estimated by multi-layer CNNs and entropy coding is processed on a CPU. Therefore, the total compression and decompression speed is slow. We propose a fast, practical, GPU-intensive entropy coding framework that consistently executes entropy coding on a GPU through highly parallelized tensor operations, as well as an encoder, decoder, and entropy estimator with an improved network architecture. We experimentally evaluated the speed and rate-distortion performance of the proposed framework and found that we could significantly increase the speed while maintaining the bitrate advantage of neural image compression.
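As a rough illustration of why the entropy estimator matters, the ideal code length of a symbol sequence is the sum of -log2 p(symbol) under the estimated distribution. The sketch below is not the proposed framework; the function name `estimated_bits` and the toy probability tables are assumptions made for illustration. Each term depends only on its own symbol and distribution, so the terms can in principle be evaluated in parallel.

```python
import math


def estimated_bits(symbols, pmfs):
    """Ideal code length in bits for `symbols`.

    symbols: list of integer symbol indices.
    pmfs: one probability table per position (hypothetical estimator output);
          pmfs[i][s] is the estimated probability of symbol s at position i.
    """
    # Each term is independent of the others, so a tensor library could
    # evaluate all of them at once; a plain loop is used here for clarity.
    return sum(-math.log2(pmf[s]) for s, pmf in zip(symbols, pmfs))


# Toy example: two positions with different estimated distributions.
toy_pmfs = [[0.5, 0.5], [0.25, 0.75]]
print(estimated_bits([0, 0], toy_pmfs))  # 1 bit + 2 bits
```

A sharper estimate (higher probability on the symbol that actually occurs) lowers the bit count, which is why estimator quality drives the bitrate.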
ISBN: 9781728180687 (print)
Reflection removal is a long-standing problem in computer vision. In this paper, we consider the reflection removal problem for stereoscopic images. By exploiting the depth information of stereoscopic images, a new background edge estimation algorithm based on the Wasserstein Generative Adversarial Network (WGAN) is proposed to distinguish the edges of the background image from the reflection. The background edges are then used to reconstruct the background image. We compare the proposed approach with the state-of-the-art reflection removal methods. Results show that the proposed approach can outperform the traditional single-image based methods and is comparable to the multiple-image based approach while having a much simpler imaging hardware requirement.
ISBN: 9781665475921 (print)
In this paper, considering the retinal structure of the human eye and the composition characteristics of screen content images (SCIs), a multi-pathway convolutional neural network (CNN) with picture-text competition is proposed for SCI quality assessment. According to the visual mechanism of the human retina, we design a retinal structure simulation module, which uses multiple parallel convolution pathways to simulate the parallel transmission of visual signals by bipolar cells and uses a multi-pathway feature fusion (MPFF) module to allocate the weight for each channel to simulate horizontal cells' regulation of the information transmission. In addition, we design an adaptive feature extraction and competition (AFEC) module to directly extract the features of textual and pictorial regions and distribute the weights. Furthermore, the attention module, which combines deformable convolution and channel attention, can accurately extract image edge features and reduce information redundancy. Experimental results show that the proposed method is superior to the mainstream methods.
ISBN: 9781728185514 (print)
In recent years, with the popularization of 3D technology, stereoscopic image quality assessment (SIQA) has attracted extensive attention. In this paper, we propose a two-stage binocular fusion network for SIQA, which takes binocular fusion, binocular rivalry and binocular suppression into account to imitate the complex binocular visual mechanism in the human brain. In addition, to extract spatial saliency features of the left view, the right view, and the fusion view, saliency generating layers (SGLs) are applied in the network. The SGLs apply multi-scale dilated convolution to emphasize essential spatial information of the input features. Experimental results on four public stereoscopic image databases demonstrate that the proposed method outperforms the state-of-the-art SIQA methods on both symmetrically and asymmetrically distorted stereoscopic images.
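To make "multi-scale dilated convolution" concrete, the following is a minimal 1-D sketch in plain Python. It is not the SGL from the paper: the function names, the kernel, and the choice of dilation rates are all assumptions for illustration. The same kernel is applied with different gaps between its taps, so each scale sees a different receptive field.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1-D dilated convolution (cross-correlation form, no padding)."""
    # Distance covered by the kernel once gaps of size `dilation` are inserted.
    span = (len(kernel) - 1) * dilation
    return [
        sum(w * signal[i + k * dilation] for k, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


def multi_scale_features(signal, kernel, dilations=(1, 2)):
    """Apply the same kernel at several dilation rates (hypothetical helper)."""
    return {d: dilated_conv1d(signal, kernel, d) for d in dilations}


features = multi_scale_features([1, 2, 3, 4, 5], [1, 1, 1])
print(features)  # {1: [6, 9, 12], 2: [9]}
```

With dilation 1 the three taps cover three neighbouring samples; with dilation 2 they cover five, without adding parameters. A real network would typically pad the outputs to a common size before combining them.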
ISBN: 9781728185514 (print)
In this paper, we propose an optimized dual-stream convolutional neural network (CNN) considering binocular disparity and fusion compensation for no-reference stereoscopic image quality assessment (SIQA). Different from previous methods, we extract both disparity and fusion features from multiple levels to simulate hierarchical processing of the stereoscopic images in the human brain. Given that ocular dominance plays an important role in quality evaluation, a fusion weights assignment module (FWAM) is proposed to assign weights that guide the fusion of the left and the right features respectively. Experimental results on four public stereoscopic image databases show that the proposed method is superior to the state-of-the-art SIQA methods on both symmetrically and asymmetrically distorted stereoscopic images.
ISBN: 9781728180687 (print)
This paper presents a novel near-infrared (NIR) image colorization approach for the Grand Challenge held by the 2020 IEEE International Conference on Visual Communications and Image Processing (VCIP). A Cycle-Consistent Generative Adversarial Network (CycleGAN) with cross-scale dense connections is developed to learn the color translation from the NIR domain to the RGB domain based on both paired and unpaired data. Due to the limited number of paired NIR-RGB images, data augmentation via cropping, scaling, contrast and mirroring operations has been adopted to increase the variations of the NIR domain. An alternating training strategy has been designed, such that CycleGAN can efficiently and alternately learn the explicit pixel-level mappings from the paired NIR-RGB data, as well as the implicit domain mappings from the unpaired ones. Based on the validation data, we have evaluated our method and compared it with the conventional CycleGAN method in terms of peak signal-to-noise ratio (PSNR), structural similarity (SSIM) and angular error (AE). The experimental results validate the proposed colorization framework.
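Two of the metrics named above have simple closed forms, sketched here in plain Python. This uses the common textbook definitions (PSNR from mean squared error, AE as the angle between two RGB vectors); the exact evaluation protocol of the challenge is not given in the abstract, so the function names, the 8-bit peak value of 255, and the per-pixel form of AE are assumptions.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)


def angular_error_deg(rgb_a, rgb_b):
    """Angle in degrees between two RGB vectors (assumes neither is all zero)."""
    dot = sum(a * b for a, b in zip(rgb_a, rgb_b))
    norm_a = math.sqrt(sum(a * a for a in rgb_a))
    norm_b = math.sqrt(sum(b * b for b in rgb_b))
    # Clamp to [-1, 1] to guard against floating-point overshoot before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cosine))


print(psnr([10, 20, 30], [12, 18, 33]))
print(angular_error_deg((255, 0, 0), (0, 255, 0)))
```

PSNR measures pixel-level fidelity, while AE ignores brightness and penalizes only a change in color direction, so the two are complementary for judging colorization.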
ISBN: 9781728185514 (print)
With the development of stereoscopic imaging technology, stereoscopic image quality assessment (SIQA) has become increasingly important, and designing a method in line with human visual perception remains challenging due to the complex relationship between binocular views. In this article, firstly, a convolutional neural network (CNN) based on the visual pathway of the human visual system (HVS) is built, which simulates different parts of the visual pathway such as the optic chiasm, lateral geniculate nucleus (LGN), and visual cortex. Secondly, the two pathways of our method simulate the 'what' and 'where' visual pathways respectively, which are endowed with different feature extraction capabilities. Finally, we find a different application for 3D convolution, employing it to fuse the information from the left and right views, rather than just extracting temporal features in video. The experimental results show that our proposed method is more in line with subjective scores and generalizes well.
ISBN: 9781728185514 (print)
Underwater images suffer from low contrast, color distortion and visibility degradation due to light scattering and attenuation. Over the past few years, the importance of underwater image enhancement has increased because of ocean engineering and underwater robotics. Existing underwater image enhancement methods are based on various assumptions. However, it is almost impossible to define appropriate assumptions for underwater images due to their diversity. Therefore, these methods are only effective for specific types of underwater images. Recently, underwater image enhancement algorithms using CNNs and GANs have been proposed, but they are not as advanced as other image processing methods due to the lack of suitable training data sets and the complexity of the issues. To solve these problems, we propose a novel underwater image enhancement method which combines a residual feature attention block with a novel combination of multi-scale and multi-patch structures. The multi-patch network extracts local features to adapt to various underwater images, which are often non-homogeneous. In addition, our network includes a multi-scale network, which is often effective for image restoration. Experimental results show that our proposed method outperforms the conventional method for various types of images.
ISBN: 9781728185514 (print)
The ever higher quality and wide diffusion of fake images have spawned a quest for reliable forensic tools. Many GAN image detectors have been proposed recently. In real-world scenarios, however, most of them show limited robustness and generalization ability. Moreover, they often rely on side information not available at test time, that is, they are not universal. We investigate these problems and propose a new GAN image detector based on a limited sub-sampling architecture and a suitable contrastive learning paradigm. Experiments carried out in challenging conditions show the proposed method to be a first step towards universal GAN image detection, while also ensuring good robustness to common image impairments and good generalization to unseen architectures.
ISBN: 9781728185514 (print)
An image anomaly localization method based on the successive subspace learning (SSL) framework, called AnomalyHop, is proposed in this work. AnomalyHop consists of three modules: 1) feature extraction via successive subspace learning (SSL), 2) normality feature distribution modeling via Gaussian models, and 3) anomaly map generation and fusion. Compared with state-of-the-art image anomaly localization methods based on deep neural networks (DNNs), AnomalyHop is mathematically transparent, easy to train, and fast at inference. In addition, its area under the ROC curve (ROC-AUC) performance on the MVTec AD dataset is 95.9%, which is among the best of several benchmarking methods.
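The second and third modules can be pictured with a small sketch: fit a Gaussian to features of normal samples, then score a test feature by its distance from that Gaussian. This is not the AnomalyHop implementation. The diagonal covariance, the Mahalanobis-style score, the `eps` regularizer, and the function names are all assumptions chosen to keep the example dependency-free.

```python
import math


def fit_diagonal_gaussian(normal_features, eps=1e-6):
    """Per-dimension mean and variance of features from normal samples.

    eps is a small hypothetical regularizer that keeps variances positive.
    """
    n = len(normal_features)
    dims = len(normal_features[0])
    mean = [sum(f[d] for f in normal_features) / n for d in range(dims)]
    var = [
        sum((f[d] - mean[d]) ** 2 for f in normal_features) / n + eps
        for d in range(dims)
    ]
    return mean, var


def anomaly_score(feature, mean, var):
    """Mahalanobis distance under a diagonal covariance; larger is more anomalous."""
    return math.sqrt(sum((x - m) ** 2 / v for x, m, v in zip(feature, mean, var)))


mean, var = fit_diagonal_gaussian([[0.0, 0.0], [2.0, 2.0]])
print(anomaly_score([1.0, 1.0], mean, var))  # at the mean: score 0
print(anomaly_score([4.0, 5.0], mean, var))  # far from the mean: high score
```

Computing this score at every spatial location of a feature map would yield an anomaly map; how the maps from different stages are fused is not specified in the abstract.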