ISBN (Print): 9781728180687
Colorization of near-infrared (NIR) images is a challenging problem due to the different material properties at infrared wavelengths, which reduce the correlation with visible images. In this paper, we study how graph-convolutional neural networks allow exploiting a more powerful inductive bias than standard CNNs, in the form of non-local self-similarity. Its impact is evaluated by showing how training with only mean squared error as the loss leads to poor results with a standard CNN, while the graph-convolutional network produces significantly sharper and more realistic colorizations.
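The non-local self-similarity prior named in this abstract can be illustrated by aggregating features over a k-nearest-neighbour graph. The sketch below is a hypothetical NumPy illustration of that idea (function name, `k`, and the per-patch feature layout are assumptions, not the paper's graph-convolution operator):

```python
import numpy as np

def knn_aggregate(feats, k=4):
    """Non-local aggregation over a k-nearest-neighbour graph.

    Each row of `feats` is one feature vector (e.g. one per patch).
    Every vector is replaced by the average of its k closest
    neighbours anywhere in the image -- the self-similarity prior
    that a purely local CNN kernel cannot express.
    """
    # Pairwise squared Euclidean distances between all feature vectors
    d = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d, np.inf)          # exclude self-matches
    idx = np.argsort(d, axis=1)[:, :k]   # k nearest neighbours per node
    return feats[idx].mean(axis=1)       # average neighbour features
```

Unlike a fixed convolution kernel, the neighbourhood here is data-dependent, so similar patches far apart in the image can share information.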
ISBN (Print): 9781728185514
This paper presents a deep learning-based audio-in-image watermarking scheme. Audio-in-image watermarking is the process of covertly embedding and extracting audio watermarks in a cover image. Using audio watermarks can open up possibilities for different downstream applications. To implement audio-in-image watermarking that adapts to the demands of increasingly diverse situations, a neural network architecture is designed to learn the watermarking process automatically in an unsupervised manner. In addition, a similarity network is developed to recognize the audio watermarks under distortions, thereby providing robustness to the proposed method. Experimental results show the high fidelity and robustness of the proposed blind audio-in-image watermarking scheme.
RDPlot is an open source GUI application for plotting Rate-Distortion (RD)-curves and calculating Bjontegaard Delta (BD) statistics [1]. It supports parsing the output of commonly used reference software packages, par...
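The BD statistics that RDPlot computes follow Bjontegaard's method: fit a low-order polynomial to each RD curve in the log-rate domain and integrate the gap between the fits over the overlapping quality range. A minimal sketch of the BD-rate calculation (the function name and cubic fit follow the common formulation; RDPlot's exact interpolation options may differ):

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bitrate difference (in %) of the test curve vs. the anchor."""
    # Bjontegaard's method works in the log-rate domain
    lr_a = np.log(rate_anchor)
    lr_t = np.log(rate_test)
    # Fit cubic polynomials: log-rate as a function of PSNR
    p_a = np.polyfit(psnr_anchor, lr_a, 3)
    p_t = np.polyfit(psnr_test, lr_t, 3)
    # Integrate both fits over the overlapping PSNR interval
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    ia, it = np.polyint(p_a), np.polyint(p_t)
    avg_diff = ((np.polyval(it, hi) - np.polyval(it, lo))
                - (np.polyval(ia, hi) - np.polyval(ia, lo))) / (hi - lo)
    # Convert the mean log-rate difference back to a percentage
    return (np.exp(avg_diff) - 1) * 100
```

For example, a test codec that needs 10% more bits than the anchor at every PSNR yields a BD-rate of +10%.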
ISBN (Print): 9781728180687
With the rapid development of three-dimensional (3D) technology, effective stereoscopic image quality assessment (SIQA) methods are in great demand. Stereoscopic images contain depth information, which makes building a reliable SIQA model that fits the human visual system much more challenging. In this paper, a no-reference SIQA method is proposed that better simulates binocular fusion and binocular rivalry. The proposed method applies a convolutional neural network to build a dual-channel model that realizes a long-term process of feature extraction, fusion, and processing. Moreover, both high- and low-frequency information is used effectively. Experimental results demonstrate that the proposed model outperforms state-of-the-art no-reference SIQA methods and has promising generalization ability.
ISBN (Print): 9781728180687
This paper proposes a new neural network for enhancing underexposed images. Instead of a decomposition method based on Retinex theory, we introduce smooth dilated convolution to estimate the global illumination of the input image, and implement an end-to-end learning network model. Based on this model, we formulate a multi-term loss function that combines content, color, texture, and smoothness losses. Our extensive experiments demonstrate that this method is superior to other methods in underexposed image enhancement. It recovers more color details and can be applied robustly to various underexposed images.
ISBN (Print): 9781728180687
With the development of deep learning, many image denoising methods have been proposed that process images at a fixed scale or at multiple scales, usually implemented by convolution or deconvolution. However, excessive scaling may lose image detail, and the deeper the convolutional network, the more easily its gradients vanish. The Diamond Denoising Network (DmDN) proposed in this paper addresses these problems: it operates mainly at a fixed scale while incorporating multi-scale feature information through the Diamond-Shaped (DS) module. Experimental results show that DmDN is effective in image denoising.
ISBN (Print): 9781665475921
This paper presents VASR, a concise end-to-end, visual-analysis-motivated super-resolution model for image reconstruction. Compatible with the existing machine vision feature coding framework, the features extracted by the machine vision task model are amplified by super-resolution to reconstruct the original image for human vision. The experimental results show that, without additional bitstreams, VASR can reconstruct images well from the extracted machine features, achieving good results on the COCO, Openimages, TVD, and DIV2K datasets.
ISBN (Print): 9781728185514
Learning-based compression systems have shown great potential for multi-task inference from their latent-space representation of the input image. In such systems, the decoder is supposed to be able to perform various analyses of the input image, such as object detection or segmentation, besides decoding the image. At the same time, privacy concerns around visual analytics have grown in response to the increasing capabilities of such systems to reveal private information. In this paper, we propose a method to make latent-space inference more privacy-friendly using mutual information-based criteria. In particular, we show how organizing and compressing the latent representation of the image according to task-specific mutual information can make the model maintain high analytics accuracy while becoming less able to reconstruct the input image and thereby reveal private information.
ISBN (Print): 9781728185514
In the age of digital content creation and distribution, steganography, that is, hiding secret data within other data, is needed in many applications, such as secret communication between two parties, piracy protection, etc. In image steganography, secret data is generally embedded within the image in an additional step after a mandatory image enhancement process. In this paper, we propose embedding the data during the image enhancement process itself. This saves the additional work required to separately encode the data inside the cover image. We use the alpha-trimmed mean filter for image enhancement, and embed two bits of the bitstream into the 2 LSBs of a pixel via an XOR of its 6 MSBs, while extraction reverses this process. Our quantitative and qualitative results are better than a methodology presented in a very recent paper.
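The abstract's one-line description of the embedding rule is ambiguous; the sketch below shows one plausible reading, in which the 6 MSBs are XOR-folded into two key bits that mask the two data bits stored in the 2 LSBs. All function names and the exact folding pattern are assumptions for illustration, not the paper's construction:

```python
def keybits(pixel):
    """Fold the 6 MSBs into 2 key bits by XOR (assumed construction)."""
    msbs = pixel >> 2
    k1 = ((msbs >> 5) ^ (msbs >> 3) ^ (msbs >> 1)) & 1  # odd bit positions
    k0 = ((msbs >> 4) ^ (msbs >> 2) ^ msbs) & 1         # even bit positions
    return k1, k0

def embed(pixel, b1, b0):
    """Hide two data bits in the 2 LSBs, XOR-masked by the MSB key bits."""
    k1, k0 = keybits(pixel)
    return (pixel & ~0b11) | ((b1 ^ k1) << 1) | (b0 ^ k0)

def extract(pixel):
    """Reverse process: XOR the stored LSBs with the same key bits."""
    k1, k0 = keybits(pixel)
    return ((pixel >> 1) & 1) ^ k1, (pixel & 1) ^ k0
```

Because embedding alters only the 2 LSBs, the MSB-derived key bits are identical at extraction time, so the data bits are recovered exactly.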
ISBN (Print): 9781728185514
Learning-based image compression has reached the performance of classical methods such as BPG. One common approach is to use an autoencoder network to map the pixel information to a latent space and then approximate the symbol probabilities in that space with a context model. During inference, the learned context model provides the symbol probabilities used by the entropy encoder to produce the bitstream. Currently, the most effective context models are autoregressive, but autoregression incurs very high decoding complexity because the data must be processed serially. In this work, we propose a method to parallelize the autoregressive process used for image compression. In our experiments, we achieve a decoding speed over 8 times faster than the standard autoregressive context model, with almost no reduction in compression performance.
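The abstract does not detail the specific parallelization; one standard way to parallelize a raster-scan autoregressive context is wavefront scheduling, sketched below as a generic illustration (the function name and the left/top-neighbour context are assumptions, not the paper's method):

```python
def wavefront_schedule(h, w):
    """Group positions of an h x w latent grid by anti-diagonal.

    With a raster-scan autoregressive context (each symbol conditioned
    on its left and top neighbours), every position on one anti-diagonal
    already has a fully decoded context, so the whole group can be
    decoded in parallel.  Serial decoding takes h*w steps; this
    schedule takes only h + w - 1.
    """
    groups = [[] for _ in range(h + w - 1)]
    for y in range(h):
        for x in range(w):
            groups[y + x].append((y, x))  # diagonal index = y + x
    return groups
```

For a 64x64 latent grid this cuts the step count from 4096 to 127, which gives a sense of where an 8x decoding speedup can come from.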