With the development of deep learning, many methods on image denoising have been proposed processingimages on a fixed scale or multi-scale which is usually implemented by convolution or deconvolution. However, excess...
详细信息
ISBN:
(纸本)9781728180687
With the development of deep learning, many methods on image denoising have been proposed processingimages on a fixed scale or multi-scale which is usually implemented by convolution or deconvolution. However, excessive scaling may lose image detail information, and the deeper the convolutional network the easier to lose network gradient. Diamond Denoising Network (DmDN) is proposed in this paper, which mainly based on a fixed scale and meanwhile considering the multi-scale feature information by using the Diamond-Shaped (DS) module to deal with the problems above. Experimental results show that DmDN is effective in image denoising.
The ever higher quality and wide diffusion of fake images have spawn a quest for reliable forensic tools. Many GAN image detectors have been proposed, recently. In real world scenarios, however, most of them show limi...
详细信息
ISBN:
(纸本)9781728185514
The ever higher quality and wide diffusion of fake images have spawn a quest for reliable forensic tools. Many GAN image detectors have been proposed, recently. In real world scenarios, however, most of them show limited robustness and generalization ability. Moreover, they often rely on side information not available at test time, that is, they are not universal. We investigate these problems and propose a new GAN image detector based on a limited sub-sampling architecture and a suitable contrastive learning paradigm. Experiments carried out in challenging conditions prove the proposed method to be a first step towards universal GAN image detection, ensuring also good robustness to common image impairments, and good generalization to unseen architectures.
This paper proposes a new neural network for enhancing underexposed images. Instead of the decomposition method based on Retinex theory, we introduce smooth dilated convolution to estimate global illumination of the i...
详细信息
ISBN:
(纸本)9781728180687
This paper proposes a new neural network for enhancing underexposed images. Instead of the decomposition method based on Retinex theory, we introduce smooth dilated convolution to estimate global illumination of the input image, and implement an end-to-end learning network model. Based on this model, we formulate a multi-term loss function that combines content, color, texture and smoothness losses. Our extensive experiments demonstrate that this method is superior to other methods in underexposed image enhancement. It can cover more color details and be applied to various underexposed images robustly.
An image anomaly localization method based on the successive subspace learning (SSL) framework, called AnomalyHop, is proposed in this work. AnomalyHop consists of three modules: 1) feature extraction via successive s...
详细信息
ISBN:
(纸本)9781728185514
An image anomaly localization method based on the successive subspace learning (SSL) framework, called AnomalyHop, is proposed in this work. AnomalyHop consists of three modules: 1) feature extraction via successive subspace learning (SSL), 2) normality feature distributions modeling via Gaussian models, and 3) anomaly map generation and fusion. Comparing with state-of-the-art image anomaly localization methods based on deep neural networks (DNNs), AnomalyHop is mathematically transparent, easy to train, and fast in its inference speed. Besides, its area under the ROC curve (ROC-AUC) performance on the MVTec AD dataset is 95.9%, which is among the best of several benchmarking methods.
Diagnosing wider disease in retina is more significant in Ophthalmology. Using imageprocessing techniques to identify the retinal diseases has proven more precise than other technology implementations, rapidly. Certa...
详细信息
With the rapid development of 3D technologies, effective no-reference stereoscopic image quality assessment (NR-SIQA) methods are in great demand. In this paper, we propose a parallel multi-scale feature extraction co...
详细信息
ISBN:
(纸本)9781665475921
With the rapid development of 3D technologies, effective no-reference stereoscopic image quality assessment (NR-SIQA) methods are in great demand. In this paper, we propose a parallel multi-scale feature extraction convolution neural network (CNN) model combined with novel binocular feature interaction consistent with human visual system (HVS). In order to simulate the characteristics of HVS sensing multi-scale information at the same time, parallel multi-scale feature extraction module (PMSFM) followed by compensation information is proposed. And modified convolutional block attention module (MCBAM) with less computational complexity is designed to generate visual attention maps for the multi-scale features extracted by the PMSFM. In addition, we employ cross-stacked strategy for multi-level binocular fusion maps and binocular disparity maps to simulate the hierarchical perception characteristics of HVS. Experimental results show that our method is superior to the state-of-the-art metrics and achieves an excellent performance.
Advances in media compression indicate significant potential to drive future media coding standards, e.g., Joint Photographic Experts Group's learning-based image coding technologies (JPEG AI) and Joint Video Expe...
详细信息
ISBN:
(纸本)9781728185514
Advances in media compression indicate significant potential to drive future media coding standards, e.g., Joint Photographic Experts Group's learning-based image coding technologies (JPEG AI) and Joint Video Experts Team's (JVET) deep neural networks (DNN) based video coding. These codecs in fact represent a new type of media format. As a dire consequence, traditional media security and forensic techniques will no longer be of use. This paper proposes an initial study on the effectiveness of traditional watermarking on two state-of-the-art learning based image coding. Results indicate that traditional watermarking methods are no longer effective. We also examine the forensic trails of various DNN architectures in the learning based codecs by proposing a residual noise based source identification algorithm that achieved 79% accuracy.
Single image desnowing is an important and challenge task for lots of computer vision applications, such as visual tracking and video surveillance. Although existing deep learning-based methods have achieved promising...
详细信息
ISBN:
(纸本)9781665475921
Single image desnowing is an important and challenge task for lots of computer vision applications, such as visual tracking and video surveillance. Although existing deep learning-based methods have achieved promising results, most of them rely on the local deep features and neglect global relationship information between the local regions. Therefore, inevitably leading to over-smooth or detail loss results. To solve this issue, we design a UNet-based end-to-end architecture for image desnowing. Specially, to better characterize global information and preserve image detail, we combine Window-based Self-Attention (WSA) transformer block with Residue Spatial Attention (RSA) to build basic unit of our network. Besides, to protect the structure of the image effectively, we also introduce a Residue Channel (RC) loss to guide high-quality image restoration. Extensive experimental results on both synthetic and real-world datasets demonstrate that the proposed model achieves new state-of-the-art results.
In this study, the alignment of video-text and image-text datasets is studied. Firstly, similarities are calculated over the texts in the two data sets. A retrieval setup with visual similarities is then applied to th...
详细信息
ISBN:
(纸本)9798350343557
In this study, the alignment of video-text and image-text datasets is studied. Firstly, similarities are calculated over the texts in the two data sets. A retrieval setup with visual similarities is then applied to the subset which is created via calculated text similarities. A BERT-based embedding vector method is applied to the raw and pure texts. As a visual feature, object-based and CLIP-based methods are used to define video frames. According to the results, alignment with CLIP features achieves the best results in the subset created by filtering using raw text.
Recently, the pre-processed video transcoding has attracted wide attention and has been increasingly used in practical applications for improving the perceptual experience and saving transmission resources. However, v...
详细信息
ISBN:
(纸本)9781728185514
Recently, the pre-processed video transcoding has attracted wide attention and has been increasingly used in practical applications for improving the perceptual experience and saving transmission resources. However, very few works have been conducted to evaluate the performance of pre-processing methods. In this paper, we select the source (SRC) videos and various pre-processing approaches to construct the first Pre-processed and Transcoded Video Database (PTVD). Then, we conduct the subjective experiment, showing that compared with the video sent to the codec directly at the same bitrate, the appropriate pre-processing methods indeed improve the perceptual quality. Finally, existing image/video quality metrics are evaluated on our database. The results indicate that the performance of the existing image/video quality assessment (IQA/VQA) approaches remain to be improved. We will make our database publicly available soon.
暂无评论