ISBN: (print) 9781728163956
Traditionally, the vision community has devised algorithms to estimate the distance between an original image and versions of it that have been subjected to perturbations. Inspiration was usually taken from the human visual perceptual system and how it processes different perturbations, in order to replicate the extent to which it determines our ability to judge image quality. While recent works have presented deep neural networks trained to predict human perceptual quality, very few borrow any intuitions from the human visual system. To address this, we present PerceptNet, a convolutional neural network whose architecture has been chosen to reflect the structure and various stages of the human visual system. We evaluate PerceptNet on various traditional perception datasets and note strong performance on a number of them compared with traditional image quality metrics. We also show that including a nonlinearity inspired by the human visual system in classical deep neural network architectures can increase their ability to judge perceptual similarity. Performance is comparable to that of similar deep learning methods, although our network has several orders of magnitude fewer parameters.
In underwater photography, the absorption and scattering of light can cause low contrast, blurred images, and color cast. We present an effective method for improving the quality of underwater images that have been degraded by medium scattering and absorption. It combines color compensation with multi-scale image fusion. Color compensation consists of independently correcting the values of the R channel and the G-B channel of the input image and then adjusting the white balance of the corrected image. After color compensation, the color of the degraded image is restored effectively, but the blurring of image edges and details caused by scattering is not remedied. This problem is addressed by multi-scale image fusion. In our method, we use Gaussian pyramid and Laplacian pyramid images for fusion, and both pyramids are constructed from the image after illumination adjustment. The experimental results show that our method improves dark-area sensitivity, global contrast, and edge sharpness. Moreover, in terms of both visual effects and underwater image assessment metrics, our method outperforms other methods.
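The pyramid-based fusion step can be illustrated with a minimal sketch. It is not the authors' implementation: it works on 1-D signals instead of images to stay short, and the blur kernel, the interpolation, the weight maps, and all function names are assumptions chosen for illustration. Each input is decomposed into a Laplacian pyramid, each weight map into a Gaussian pyramid, the two are multiplied level by level, and the sum is collapsed back into one signal.

```python
def downsample(signal):
    """Blur with a [1, 2, 1] / 4 kernel (edges replicated), then keep every other sample."""
    n = len(signal)
    blurred = [
        (signal[max(i - 1, 0)] + 2 * signal[i] + signal[min(i + 1, n - 1)]) / 4
        for i in range(n)
    ]
    return blurred[::2]


def upsample(signal, size):
    """Linearly interpolate a coarse signal back to `size` samples."""
    out = []
    for i in range(size):
        pos = i / 2
        lo = min(int(pos), len(signal) - 1)
        hi = min(lo + 1, len(signal) - 1)
        frac = pos - int(pos)
        out.append((1 - frac) * signal[lo] + frac * signal[hi])
    return out


def gaussian_pyramid(signal, levels):
    """Successively blurred and downsampled copies, finest level first."""
    pyramid = [list(signal)]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def laplacian_pyramid(signal, levels):
    """Band-pass residuals per level, plus the coarsest Gaussian level at the end."""
    gauss = gaussian_pyramid(signal, levels)
    bands = [
        [g - u for g, u in zip(gauss[k], upsample(gauss[k + 1], len(gauss[k])))]
        for k in range(levels - 1)
    ]
    bands.append(gauss[-1])
    return bands


def reconstruct(bands):
    """Collapse a Laplacian pyramid from coarse to fine."""
    current = bands[-1]
    for band in reversed(bands[:-1]):
        current = [b + u for b, u in zip(band, upsample(current, len(band)))]
    return current


def fuse(inputs, weights, levels=3):
    """Blend inputs level by level; weights are assumed positive in total at every sample."""
    totals = [sum(column) for column in zip(*weights)]
    normalized = [[w / t for w, t in zip(row, totals)] for row in weights]
    fused = None
    for signal, weight in zip(inputs, normalized):
        lap = laplacian_pyramid(signal, levels)
        gw = gaussian_pyramid(weight, levels)
        contrib = [[w * l for w, l in zip(gw[k], lap[k])] for k in range(levels)]
        if fused is None:
            fused = contrib
        else:
            fused = [[f + c for f, c in zip(fl, cl)] for fl, cl in zip(fused, contrib)]
    return reconstruct(fused)
```

Blending per level, instead of blending the inputs directly, smooths the weight maps more at coarse scales, which is the usual reason for preferring pyramid fusion when the weights change sharply.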
ISBN: (digital) 9798350383652
ISBN: (print) 9798350383669
Cervical cancer, caused by infection with the human papillomavirus (HPV) and ranked as the fourth most prevalent cancer globally, has profound and sometimes fatal implications for women. With approximately 500,000 new cases reported annually, the disease significantly impacts women's overall well-being. Researchers continue to seek a more cost-effective solution than regular HPV testing to manage and control this widespread health concern. Our research introduces a framework for identifying and classifying stages of cervical cancer by processing images of the tumours. To optimize the performance of our model, we developed an approach for selecting the best image data points by calculating a score, defined as the product of focus, sharpness, and resolution. Additionally, our study integrates three diverse feature extraction techniques (Local Binary Pattern (LBP), ResNet, and VGGNet), a combination not commonly observed in similar projects. We also added deep convolutional architectures and machine learning techniques, including Random Forest, SVM, Decision Tree, XGBoost, a basic Neural Network Classifier, and a Multi-layered Perceptron (MLP). This hybrid approach allowed a direct comparison that identified the most effective strategy among them. Our approach yielded around 75% accuracy and an AUC score of 81%, which is higher than existing studies report on this dataset.
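The image-selection score can be sketched as follows. Only the definition of the score as the product of focus, sharpness, and resolution comes from the abstract; how each factor is measured is not stated there, so the three fields below are treated as already computed numbers, and the class name, function name, and cut-off `k` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ImageQuality:
    """Per-image quality factors; the measurement of each factor is assumed, not specified."""
    name: str
    focus: float
    sharpness: float
    resolution: float

    @property
    def score(self) -> float:
        # Score as defined in the abstract: product of the three factors.
        return self.focus * self.sharpness * self.resolution


def select_best(images, k):
    """Return the k images with the highest score, best first (k is a hypothetical cut-off)."""
    return sorted(images, key=lambda img: img.score, reverse=True)[:k]
```

Because the score is a product, an image that is very poor on any single factor ranks low even if the other two factors are high.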
Recent years have seen significant progress in learning-based approaches for 3D reconstruction. While different kinds of representations have been tried, signed distance fields have started to gain interest as they are easier...
ISBN: (print) 9781728188089
Coded aperture snapshot spectral imaging (CASSI) captures a full-frame spectral image as a single compressive image, so the underlying hyperspectral image (HSI) must be reconstructed from the snapshot in post-processing, which is a challenging inverse problem due to its ill-posed nature. Existing methods for HSI reconstruction from a snapshot usually employ optimization to solve the formulated image degradation model, regularized with empirically designed priors, and still cannot achieve sufficient reconstruction accuracy for real HS image analysis systems. Motivated by recent advances in deep learning for various inverse problems, deep-learning-based HSI reconstruction has attracted a lot of attention and can boost reconstruction performance. This study proposes a novel deep convolutional neural network (DCNN) based framework for effectively learning the spatial structure and spectral attributes of the underlying HSI with reciprocal spatial and spectral modules. Further, to adaptively leverage the useful learned features for better HSI reconstruction, we integrate residual attention modules into our DCNN by exploring both spatial and spectral attention maps. Experimental results on two benchmark HSI datasets show that our method outperforms state-of-the-art methods in both quantitative values and visual effects.
ISBN: (print) 9781728173221
Versatile Video Coding (VVC) is a new international video coding standard. One of the functionalities that VVC supports is so-called Gradual Decoding Refresh (GDR), which is mainly intended for (ultra) low-delay applications. As the latest video coding standard, VVC employs many new and advanced coding tools. Among them is History-based Motion Vector Prediction (HMVP), which, however, can cause leaks in GDR applications. This paper analyzes the leak problem associated with HMVP for GDR and offers suggestions on how to use HMVP in GDR applications.
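For readers unfamiliar with HMVP, the history table it relies on can be sketched as a first-in, first-out list with a redundancy check. This is a minimal illustration of the general mechanism only: the abstract does not describe the leak itself or the proposed usage rules, so neither is modelled here. The class name, the tuple layout of a candidate, and the default table size of 5 are assumptions.

```python
from collections import deque


class HmvpTable:
    """History table of recent motion candidates (illustrative sketch, not a codec API)."""

    def __init__(self, max_size=5):  # table size is an assumption for this sketch
        self.max_size = max_size
        self.entries = deque()  # oldest candidate on the left, newest on the right

    def reset(self):
        """Empty the table, as is done when a new CTU row starts."""
        self.entries.clear()

    def add(self, candidate):
        """Insert the motion of a just-coded block, pruning duplicates first."""
        if candidate in self.entries:
            # Redundancy check: drop the identical entry so it is re-added as newest.
            self.entries.remove(candidate)
        elif len(self.entries) == self.max_size:
            # Table full and no duplicate: discard the oldest candidate.
            self.entries.popleft()
        self.entries.append(candidate)

    def candidates(self):
        """Return candidates newest first, the order in which they are examined."""
        return list(reversed(self.entries))


# Hypothetical candidates written as (mv_x, mv_y, reference_index).
table = HmvpTable(max_size=3)
for motion in [(1, 0, 0), (2, 0, 0), (3, 1, 0), (4, 1, 1)]:
    table.add(motion)
```

Because the table carries motion information forward from blocks coded earlier, which need not be spatial neighbours of the current block, its contents depend on coding history; that dependence is why its interaction with GDR needs the analysis the paper describes.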
Thanks to city brand strategies, more and more countries and cities engage in self-marketing. In this regard, branding urban space is a key subdivision to achieve a place's publicity. Therefore, from the perspec...
ISBN: (digital) 9798331528348
ISBN: (print) 9798331528355
Skin cancer is recognized as one of the most lethal forms of cancer, often resulting in death if not detected early. Its aggressive tendency to spread to other parts of the body makes treatment in later stages particularly difficult, emphasizing the crucial need for early detection. As the incidence of skin cancer continues to rise, along with its associated high mortality rates and treatment expenses, there is an urgent demand for more effective early detection methods. Leveraging deep learning algorithms through a model-driven architecture has emerged as a promising solution for improving skin cancer diagnosis. In this work, we present the development and implementation of a deep learning model designed to accurately classify images of skin lesions. The proposed approach uses the Xception model, an advanced convolutional neural network (CNN) architecture. Through training, the Xception model identifies patterns and features in skin lesion images that indicate whether the condition is benign or malignant. The proposed model achieved an accuracy of 86% in the classification task.
ISBN: (print) 9781728173221
The ever higher quality and wide diffusion of fake images have spawned a quest for reliable forensic tools. Many GAN image detectors have been proposed recently. In real-world scenarios, however, most of them show limited robustness and generalization ability. Moreover, they often rely on side information that is not available at test time; that is, they are not universal. We investigate these problems and propose a new GAN image detector based on a limited sub-sampling architecture and a suitable contrastive learning paradigm. Experiments carried out in challenging conditions show the proposed method to be a first step towards universal GAN image detection, with good robustness to common image impairments and good generalization to unseen architectures.
The modern digital space is saturated with a huge amount of data in the form of images and videos every day. All information contained is important to users, organizations and other consumers. It should be noted that ...