ISBN:
(Print) 9798350349405; 9798350349399
Detecting building change in bitemporal remote sensing (RS) imagery requires a model to highlight the changes in buildings and ignore the irrelevant changes of other objects and sensing conditions. Buildings have comparatively less diverse textures than other objects and appear as repetitive visual patterns on RS images. In this paper, we propose Gabor Feature Network (GFN) to extract the distinctive repetitive texture features of buildings. Furthermore, we also design Feature Fusion Module (FFM) to fuse the extracted multiscale features from GFN with the features from a Transformer-based encoder, passing the texture features on to different parts of the model. Using GFN and FFM, we design a Transformer-based model, called GabFormer, for building change detection. Experimental results on the LEVIR-CD and WHU-CD datasets indicate that GabFormer outperforms other SOTA models and, in particular, shows a significant improvement in generalization capability. Our code is available at https://***/Ayana-Inria/GabFormer.
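The abstract does not detail how GFN builds its Gabor features; as background, a minimal pure-Python sketch of a standard Gabor filter bank (the kind of oriented texture filters the network is named after) looks like this — `size`, `lam`, and `sigma` are illustrative parameters, not the paper's:

```python
import math

def gabor_kernel(size, theta, lam, sigma, gamma=0.5):
    """Real part of a Gabor kernel: a Gaussian envelope modulated by a
    cosine carrier oriented at angle `theta` with wavelength `lam`."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)    # rotate coords
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / lam)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def filter_bank(size=7, orientations=4, lam=4.0, sigma=2.0):
    """One kernel per orientation, evenly spaced over [0, pi)."""
    return [gabor_kernel(size, k * math.pi / orientations, lam, sigma)
            for k in range(orientations)]

bank = filter_bank()
```

Convolving an image with such a bank responds strongly to the repetitive oriented textures the abstract attributes to buildings; how GFN learns or fixes these parameters is specified only in the paper itself.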
ISBN:
(Print) 9781665475921
Proliferative Diabetic Retinopathy (PDR) is a serious retinal disease threatening diabetic patients. Intense retinal neovascularization in the retinal image is the most important clinical symptom of PDR, leading to visual distortion if not controlled. Accurate and timely detection of neovascularization from retinal images allows patients to receive adequate treatment and avoid further vision loss. In this work, we propose an automatic retinal neovascularization segmentation model based on an improved Pyramid Scene Parsing Network (PSP-Net). To improve the accuracy of the model, we introduce the proposed channel attention module into the network. The network is evaluated on color fundus images from clinical practice. Evaluation results show that the network is superior to FCN, SegNet, U-Net and PSP-Net in accuracy and sensitivity. The model achieves accuracy, sensitivity, specificity, precision and Jaccard similarity scores of 0.9832, 0.9265, 0.9897, 0.9116 and 0.8501, respectively. Extensive experimental results demonstrate that the network model improves segmentation accuracy, relieves the workload of doctors, and merits further clinical adoption.
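The abstract does not specify the internals of its channel attention module; for orientation, a generic squeeze-and-excitation-style sketch in plain Python (global-pool each channel, gate it with a sigmoid, rescale) illustrates the general mechanism — the function name and shapes are hypothetical:

```python
import math

def channel_attention(feature_maps):
    """Generic channel-attention sketch: pool each channel (a 2-D H x W
    list) to a scalar, squash it with a sigmoid, and rescale the channel
    by that gate. Not the paper's exact module."""
    weights = []
    for ch in feature_maps:
        pooled = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        weights.append(1.0 / (1.0 + math.exp(-pooled)))     # sigmoid gate
    gated = [[[v * w for v in row] for row in ch]
             for ch, w in zip(feature_maps, weights)]
    return gated, weights

# One 2x2 channel of constant activation 2.0
gated, weights = channel_attention([[[2.0, 2.0], [2.0, 2.0]]])
```

Channels whose pooled response is high get a gate near 1 and pass through almost unchanged; low-response channels are attenuated, which is the effect such modules are meant to add to a segmentation backbone.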
ISBN:
(Print) 9798350349405; 9798350349399
Rate-distortion optimization through neural networks has accomplished competitive results in compression efficiency and image quality. This learning-based approach seeks to minimize the compromise between compression rate and reconstructed image quality by automatically extracting and retaining crucial information while discarding less critical details. A successful technique consists in introducing a deep hyperprior that operates within a 2-level nested latent variable model, enhancing compression by capturing complex data dependencies. This paper extends this concept by designing a generalized L-level nested generative model with a Markov chain structure. We demonstrate that, as L increases, a trainable prior becomes detrimental, and we explore a common dimensionality across the distinct latent variables to boost compression performance. As this structured framework can represent autoregressive coders, we outperform the hyperprior model and achieve state-of-the-art performance while substantially reducing the computational cost. Our experimental evaluation is performed on wind turbine scenarios to study its application to visual inspections.
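The "compromise between compression rate and reconstructed image quality" the abstract refers to is the standard rate-distortion objective R + λ·D; in an L-level nested model each latent level contributes its own rate term. A minimal sketch with hypothetical per-level rates:

```python
def rate_distortion_loss(rates_bpp, distortion_mse, lam):
    """Standard learned-compression objective: total rate summed over
    the nested latent levels plus lambda-weighted distortion.
    All numeric values below are illustrative, not the paper's."""
    total_rate = sum(rates_bpp)          # one rate term per latent level
    return total_rate + lam * distortion_mse

# e.g. a 3-level nested model: per-level rates in bits-per-pixel,
# MSE distortion, and a trade-off weight lambda
loss = rate_distortion_loss([0.40, 0.08, 0.02], distortion_mse=30.0, lam=0.01)
```

Sweeping `lam` traces out the rate-distortion curve: a larger `lam` penalizes distortion more heavily and yields higher-rate, higher-quality operating points.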
ISBN:
(Print) 9798350343557
Breast cancer is one of the most common types of cancer among women, and early diagnosis is essential for effective treatment. In recent years, deep learning techniques have shown promising results in solving image classification problems. In this study, labeled image patches obtained from whole slide images in the BACH dataset were used for breast cancer classification, and the performances of pre-trained deep learning models on these patches were compared. Vision Transformer (ViT) achieves better performance thanks to its transformer structure and attention mechanism, which extract local features by focusing on the important regions of the image. Moreover, XceptionBiLSTM, a model extending the Xception backbone with Bidirectional Long Short-Term Memory (BiLSTM) layers, significantly improved classification performance by learning the spatial relationships between image patches. Furthermore, data augmentation techniques were applied to the dataset, which contains a limited number of image patches, increasing the models' generalization capacity and preventing overfitting. The proposed architecture, which combines the predictions of the individual Xception, ViT, and XceptionBiLSTM models through ensemble learning, achieved 90% accuracy on the hidden test set, and 95% accuracy and a 94.9% F1-score on the validation set. These results demonstrate the significant potential of the proposed ensemble learning-based architecture for breast cancer classification.
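The abstract combines the three models' predictions by ensemble learning without stating the combination rule; one common choice, shown here purely as an illustration, is soft voting — averaging the class-probability vectors and taking the arg-max:

```python
def soft_vote(prob_lists):
    """Soft-voting ensemble sketch: average per-class probabilities from
    several models and return (predicted class, averaged vector).
    The paper's exact combination rule is not given in the abstract."""
    n = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__), avg

# Hypothetical 4-class probability vectors from three models
cls, avg = soft_vote([[0.7, 0.1, 0.1, 0.1],
                      [0.2, 0.5, 0.2, 0.1],
                      [0.6, 0.2, 0.1, 0.1]])
```

Soft voting tends to beat majority (hard) voting when the member models are well calibrated, because confident models contribute proportionally more to the average.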
A novel image subspace reconstruction has been proposed. The suggested inverse problem regularization algorithm approximates a class of differentials of function through multi-resolution density measure over level set...
ISBN:
(Print) 9783031585340; 9783031585357
This article introduces a novel multi-modal image fusion approach based on the Convolutional Block Attention Module and dense networks to enhance human perceptual quality and information content in the fused images. As pre-processing, the proposed model preserves the edges of the infrared images and enhances the contrast of the visible images. The use of the Convolutional Block Attention Module then yields more refined features from the source images. The visual results demonstrate that the fused images produced by the proposed method are visually superior to those generated by most standard fusion techniques. To substantiate these findings, quantitative analysis is conducted using various metrics. The proposed method exhibits the best Naturalness Image Quality Evaluator and Chen-Varshney metric values, which are human-perception-based measures. Moreover, the fused images exhibit the highest Standard Deviation value, signifying enhanced contrast. These results confirm that the proposed multi-modal image fusion technique outperforms standard methods both qualitatively and quantitatively, yielding fused images with improved human perceptual quality.
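For context on what such attention-weighted fusion improves upon, the simplest fusion baseline is a pixel-wise weighted average of the two modalities; the sketch below shows that baseline only — the paper's CBAM-derived weighting is not reproduced here:

```python
def weighted_fuse(ir, vis, w=0.5):
    """Naive fusion baseline: pixel-wise weighted average of an infrared
    and a visible image of the same size, values in [0, 1]. Attention-based
    methods effectively replace the constant `w` with a learned per-pixel map."""
    return [[w * a + (1.0 - w) * b for a, b in zip(row_ir, row_vis)]
            for row_ir, row_vis in zip(ir, vis)]

fused = weighted_fuse([[1.0, 0.0]], [[0.0, 1.0]])
```

A constant `w` blurs complementary detail from both sources, which is exactly the failure mode per-pixel attention weighting is designed to avoid.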
Perceptual image hashing is pivotal in various image processing applications, including image authentication, content-based image retrieval, tampered image detection, and copyright protection. This paper proposes a no...
Nowadays, learned image compression has outperformed traditional coding methods like VVC. However, learned image compression methods are often optimized for specific rates and lack support for variable rate compressio...
ISBN:
(Print) 9798350344868; 9798350344851
Image harmonization is a crucial technique in image composition that aims to seamlessly match the background by adjusting the foreground of composite images. Current methods adopt either global-level or pixel-level feature matching. Global-level feature matching ignores the proximity prior, treating foreground and background as separate entities; pixel-level feature matching, on the other hand, loses contextual information. It is therefore necessary to use the information from semantic maps that describe different objects to guide harmonization. In this paper, we propose Semantic-guided Region-aware Instance Normalization (SRIN), which utilizes the semantic segmentation maps output by a pre-trained Segment Anything Model (SAM) to guide the visual consistency learning of foreground and background features. Extensive experiments demonstrate the superiority of our method for image harmonization over state-of-the-art methods.
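The core idea behind normalization-based harmonization is to shift the foreground's feature statistics toward the background's. A simplified, single-channel stand-in (not the paper's SRIN, which operates on learned features under semantic-map guidance) can be sketched as:

```python
import math

def region_match_normalize(image, mask):
    """Toy region-aware normalization: shift/scale foreground pixels
    (mask == 1) so their mean and std match the background's (mask == 0).
    `image` is a flat list of intensities; purely illustrative."""
    fg = [v for v, m in zip(image, mask) if m == 1]
    bg = [v for v, m in zip(image, mask) if m == 0]

    def stats(xs):
        mu = sum(xs) / len(xs)
        var = sum((x - mu) ** 2 for x in xs) / len(xs)
        return mu, math.sqrt(var) + 1e-8        # epsilon avoids divide-by-zero

    mu_f, sd_f = stats(fg)
    mu_b, sd_b = stats(bg)
    # Whiten foreground with its own stats, re-color with background stats
    return [(v - mu_f) / sd_f * sd_b + mu_b if m == 1 else v
            for v, m in zip(image, mask)]

out = region_match_normalize([0.0, 1.0, 10.0, 12.0], [0, 0, 1, 1])
```

After the call, the bright foreground pixels (10, 12) land in the background's intensity range while background pixels are untouched; SRIN's contribution is to drive this matching per semantic region rather than globally.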
ISBN:
(Print) 9798350344868; 9798350344851
Infrared and visible image fusion is an important multimodal image processing task that aims to enhance computer vision performance by effectively fusing infrared and visible images. Although many deep learning-based methods for infrared and visible image fusion have emerged in recent years, most of them ignore the important role of semantic information in image fusion. Therefore, this paper proposes a semantic-prior-guided infrared and visible image fusion network called SPGFusion. It uses a generative adversarial network framework based on semantic priors to guide the fusion process, combining a semantic feature-aware module with a semantic generative adversarial loss. Experimental results demonstrate that SPGFusion yields more visually appealing fusion results and outperforms state-of-the-art image fusion algorithms in both visual quality and quantitative evaluation. The source code is available at https://***/tianzhiya/SPGFusion.