ISBN: (print) 9798350349405; 9798350349399
In this study, we introduce an intelligent Test Time Augmentation (TTA) algorithm designed to enhance the robustness and accuracy of image classification models against viewpoint variations. Unlike traditional TTA methods that apply augmentations indiscriminately, our approach selects optimal augmentations based on predictive uncertainty metrics. This selection is achieved via a two-stage process: the first stage identifies the optimal augmentation for each class by evaluating uncertainty levels, while the second stage applies an uncertainty threshold to determine when TTA would be advantageous. This ensures that augmentations contribute to classification more effectively than a uniform application across the dataset. Experiments across several datasets and neural network architectures support our approach, yielding an average accuracy improvement of 1.73% over methods that use single-view images. This research underscores the potential of adaptive, uncertainty-aware TTA in improving the robustness of image classification in the presence of viewpoint variations, paving the way for further exploration into intelligent augmentation strategies. The code is available at: https://***/olivesgatech/Intelligent-Multi-View-TTA
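The two-stage idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the entropy-based uncertainty measure, the averaging fusion rule, and the names `predict`, `class_to_augment`, and `threshold` are all assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def gated_tta(predict, image, class_to_augment, threshold):
    """Apply TTA only when the single-view prediction is uncertain.

    predict: hypothetical callable mapping an image to class probabilities.
    class_to_augment: hypothetical per-class table of augmentation callables,
        standing in for the output of the first stage.
    threshold: entropy level above which TTA is applied (second stage).
    """
    base = predict(image)
    if entropy(base) <= threshold:
        return base  # confident enough: keep the single-view prediction
    predicted = max(range(len(base)), key=base.__getitem__)
    augmented = predict(class_to_augment[predicted](image))
    # Average the original and augmented views (an assumed fusion rule).
    return [(a + b) / 2.0 for a, b in zip(base, augmented)]
```

With a low threshold almost every image is augmented, which approaches uniform TTA; with a high threshold the function falls back to single-view prediction.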
Deep clustering algorithms using deep neural networks are crucial across various fields. Contrastive learning, aided by diverse data augmentations, has proven effective in improving clustering performance. However, existing clustering methods based on contrastive learning primarily focus on similarities among augmented views of the same instance, often overlooking the rich semantic information inherent in the image itself. Therefore, this paper proposes a deep clustering method called Edge-Constraint based Multi-Scale Contrastive Learning for Image Deep Clustering. Building on the traditional two-channel paradigm, an edge channel is constructed to capture contour information, and an edge constraint loss is generated for complementary contrastive learning by quantifying the structural similarity between edge image signals and augmented image signals from other channels. Additionally, a multi-scale feature enhancement module is proposed to improve the robustness of edge information extraction in a multi-scale environment. The experimental results show that the proposed method outperforms current state-of-the-art clustering approaches on the CIFAR-10, CIFAR-100, STL-10, ImageNet-10, and Tiny-ImageNet benchmarks.
Pest infection reduces crop production and annual income. Over the past decade, many traditional methods have aimed for optimal accuracy in categorizing infected tomato plants, and each technique has its advantages and, more notably, its drawbacks. As an improvement, this paper introduces unsupervised detection and categorization of diseased and healthy tomato plants using neural network techniques. The image dataset combines online and naturally acquired samples of healthy and diseased tomato crops. The algorithm executes three steps: (i) data pre-processing using the Non-Subsampled Contourlet transform to acquire energy-detail components, (ii) modified K-means processing to extract color clusters, which are in turn used for tomato-leaf detection, and (iii) fusion of modified convolutional neural network features with an SVM for automated categorization. The work was tested on the Kaggle PlantVillage and Mendeley datasets, comprising 20,283 images in one healthy and 10 disease classes. In the subjective performance metric evaluation, the model achieved an accuracy of 99.15% and an average precision of 95.6%. The technique produces automatic and accurate classification results that improve on state-of-the-art approaches.
Saving and transmitting high-resolution (HR) images is often required in real life, especially in social media applications. Recently developed image rescaling techniques provide a storage- and transmission-efficient way to deal with this problem by jointly learning the downscaling and upscaling mappings with the aid of an invertible neural network (INN). In the original pipeline, the high-frequency information is generally discarded from the downscaled low-resolution (LR) image and is randomly sampled during upscaling, so that the storage and transmission cost is minimal. However, the quality of the reconstructed HR image is limited because the high-frequency information is discarded, and some researchers have therefore tried to improve upscaling performance by spending slightly more storage to partially save the high-frequency features. In this work, following this research line, we propose a new strategy to improve image rescaling performance by using the additional storage more efficiently. Specifically, instead of saving partial high-frequency features, we propose to quantize those features with a learned codebook and save the corresponding index matrix. Such a vector quantization strategy can recover as much of the high-frequency features as possible and thus leads to better image rescaling performance. Moreover, the additional storage cost is the same as, or even lower than, that of existing methods. Experiments on a series of benchmark datasets demonstrate the effectiveness of the proposed method against current state-of-the-art ones.
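The codebook-and-index idea described above can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's method: the codebook here is a fixed list rather than a learned one, nearest-neighbor assignment uses squared Euclidean distance, and the function names are hypothetical.

```python
def nearest_index(vector, codebook):
    """Index of the codebook entry closest to `vector` (squared Euclidean)."""
    def sq_dist(entry):
        return sum((a - b) ** 2 for a, b in zip(vector, entry))
    return min(range(len(codebook)), key=lambda k: sq_dist(codebook[k]))


def quantize(feature_map, codebook):
    """Replace each feature vector in a 2-D grid by a codebook index.

    The result is the index matrix that would be stored in place of the
    high-frequency features themselves.
    """
    return [[nearest_index(vec, codebook) for vec in row] for row in feature_map]


def dequantize(index_matrix, codebook):
    """Recover approximate features by looking up each stored index."""
    return [[codebook[k] for k in row] for row in index_matrix]
```

Storing one small integer per spatial position instead of a full feature vector is what keeps the additional storage cost low; the reconstruction quality then depends on how well the codebook covers the feature distribution.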
Speckle removal is a crucial preliminary step for synthetic aperture radar (SAR) image processing. In recent years, the application of deep neural networks to SAR image despeckling has yielded commendable outcomes. However, prevailing deep learning methods for SAR image despeckling rely on convolutional neural network (CNN) architectures, which inherently capture only local information within their receptive fields. Consequently, the despeckling performance can be further improved through advanced network structure design. More recently, transformer-based methods have shown impressive performance in natural image denoising, relying on the long-range dependency modeling capability of the self-attention mechanism. In this letter, we introduce a practical despeckling network that incorporates the global modeling capability of Swin transformers (SwinTs) and the local modeling ability of residual CNNs. Specifically, the proposed despeckling network is based on the widely used U-Net architecture, wherein a Swin Conv (SC) block is adopted to replace the convolutional layer in the baseline U-Net. The SC block mainly comprises a residual convolutional (RConv) block and a SwinT block, which are used to extract local features and long-range dependencies from images, respectively. Furthermore, to deal with spatially correlated real SAR speckle, a pixel-shuffle downsampling (PD) post-processing strategy is adopted, which can significantly improve the practicality of the proposed method without an additional real dataset for fine-tuning. Experimental results demonstrate that the proposed method achieves state-of-the-art performance on both synthetic and real SAR images, and outperforms CNN-based and transformer-based methods ($L=1$) by an average peak signal-to-noise ratio (PSNR) of 0.94 and 1.58 dB, respectively.
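As a rough illustration of pixel-shuffle downsampling, the sketch below splits a 2-D image into strided sub-images and reassembles them. The function names and the dictionary layout are hypothetical, and how the despeckling network is applied between the two steps is not specified in the abstract, so that step is left out.

```python
def pd_split(image, stride):
    """Split a 2-D image into stride*stride sub-images.

    Sub-image (dy, dx) keeps every stride-th pixel starting at row dy and
    column dx, so neighboring pixels of the input land in different
    sub-images and spatially correlated noise becomes less correlated.
    """
    return {
        (dy, dx): [row[dx::stride] for row in image[dy::stride]]
        for dy in range(stride)
        for dx in range(stride)
    }


def pd_merge(subimages, stride, height, width):
    """Inverse of pd_split: put every pixel back at its original position."""
    out = [[None] * width for _ in range(height)]
    for (dy, dx), sub in subimages.items():
        for i, row in enumerate(sub):
            for j, value in enumerate(row):
                out[dy + i * stride][dx + j * stride] = value
    return out
```

In a full pipeline one would presumably process each sub-image independently before calling `pd_merge`; here the round trip simply reproduces the input.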
Folk art is an important manifestation of the culture and heritage of a region. However, because these artworks are not properly cared for, they degrade quickly and are permanently lost over time. Restoration and proper preservation of such artworks are therefore essential. Neural networks have emerged as powerful tools in image processing, outperforming classical methods, especially in pattern recognition. The Convolutional Neural Network (CNN) is among the most widely used deep neural networks and performs well on machine learning problems. In this paper, a comparative performance analysis of the CNN and its combination with BBHE (Brightness preserving Bi-Histogram Equalization), a fuzzy technique, and PSO (Particle Swarm Optimization) is presented. Performance has been measured on the basis of CPP (Contrast per Pixel), AMBE (Absolute Mean Brightness Error), NMSE (Normalized Mean Square Error), IEF (Image Enhancement Factor), PSNR (Peak Signal-to-Noise Ratio), r (Pearson Correlation Coefficient), and SSIM (Structural Similarity Index).
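Three of the metrics listed above have short closed forms and can be sketched directly. The definitions below are the commonly used ones and are an assumption here, since the abstract does not state which variants the authors used: AMBE as the absolute difference of mean intensities, NMSE as squared error normalized by the energy of the original image, and PSNR with an assumed peak value of 255. Images are represented as flat lists of pixel values.

```python
import math


def ambe(original, enhanced):
    """Absolute Mean Brightness Error: |mean(original) - mean(enhanced)|."""
    return abs(sum(original) / len(original) - sum(enhanced) / len(enhanced))


def nmse(original, enhanced):
    """Normalized Mean Square Error, normalized by the original's energy."""
    error = sum((o - e) ** 2 for o, e in zip(original, enhanced))
    energy = sum(o ** 2 for o in original)
    return error / energy


def psnr(original, enhanced, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB; infinite for identical images."""
    mse = sum((o - e) ** 2 for o, e in zip(original, enhanced)) / len(original)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)
```

Lower AMBE and NMSE and higher PSNR indicate an enhanced image that stays closer to the original in brightness and pixel values.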
ISBN: (print) 9798350367331; 9798350367348
Blind Image Quality Assessment (BIQA) is an essential task that estimates the perceptual quality of images without a reference. While many BIQA methods employ deep neural networks (DNNs) and incorporate saliency detectors to enhance performance, their large model sizes limit deployment on resource-constrained devices. To address this challenge, we introduce a novel, non-deep-learning BIQA method with a lightweight saliency detection module, called Green Saliency-guided Blind Image Quality Assessment (GSBIQA). It is characterized by its minimal model size, reduced computational demands, and robust performance. Experimental results show that the performance of GSBIQA is comparable with that of state-of-the-art DL-based methods, with significantly lower resource requirements.
Ultrasound image segmentation research is of great significance, especially in medical diagnosis and clinical treatment. However, speckle noise caused by reflections from different tissue types, irregular organ or lesion shapes, and artifacts from uneven signal reflections pose significant challenges to the segmentation task. Convolutional neural networks (CNNs) have been proven effective for ultrasound image segmentation tasks. Inspired by multi-path feature fusion and traditional enhancement techniques, this paper proposes a Dual-Branch Ultrasound Image Segmentation Network (DBUNet) based on the U-Net architecture. The network consists of four main components: an enhanced branch, an original branch, a feature aggregation module, and a decoder block. The enhanced branch combines signal processing techniques such as filtering and histogram equalization with attention-based denoising. The original branch extracts comprehensive inter-image information to supplement the details lost by the enhancement operation. A Deep Feature Aggregation Module (DFAM) is designed to efficiently fuse deep features from different branches. In the DFAM, a channel reconstitution module is first used to refine channel-level features. Then, cross-fusion is applied to facilitate cross-feature information exchange and generate fusion attention maps that guide feature fusion. In addition, a Shallow Feature Optimization Module (SFOM) is proposed to retain important information and suppress unimportant information using a separation-reconstruction strategy to achieve spatial redundancy optimization. An attention mechanism is introduced to achieve feature denoising. The network was compared with state-of-the-art segmentation methods using five quantitative evaluation indicators on the publicly available ultrasound breast cancer dataset BUSI and the thyroid ultrasound dataset DDTI. Experimental results show that the proposed method outperforms state-o
This paper presents a fixed-time state observer-based robust adaptive neural fault-tolerant control (RANFTC) for attitude and altitude tracking and control of quadrotor unmanned aerial vehicles (UAVs), considering multiple actuator faults, parametric uncertainty, and unknown external disturbances simultaneously. A novel fixed-time state error estimation based on a sliding mode observer is designed, which is independent of initial conditions. A proportional-integral-derivative (PID) based sliding mode control (SMC) is proposed to handle actuator faults and unknown disturbances in combination with the fixed-time observer within the fault-tolerant control (FTC) design scheme. A radial basis function neural network (RBFNN) is employed with the controller to approximate the uncertain parameters of the system. Furthermore, two new adaptive laws are designed to estimate the sudden actuator fault and the unknown upper bound of disturbances independently. Implementing these estimation schemes avoids overestimation, enhances the robustness of the presented controller, and substantially eliminates the control chattering problem. By applying the Lyapunov stability concept, the suggested control strategy guarantees that the states of the quadrotor UAV converge to the origin in a finite time. Finally, simulation studies are conducted to demonstrate the tracking performance and highlight the effectiveness of the proposed FTC design compared to existing FTC methods.
ISBN: (print) 9789464593617; 9798331519773
Hyperspectral unmixing is an essential tool for analyzing hyperspectral data, especially in remote sensing. Many approaches have been developed for this problem, ranging from model-based to deep learning-based and (hybrid) unrolled methods. However, the development of supervised deep learning-based unmixing methods is hindered by the lack of available labeled training datasets. In this paper, to enable the supervised training of neural networks for hyperspectral unmixing, we propose a methodology to construct a synthetic training database directly from the hyperspectral image to be unmixed. We use this data generation approach to train an unrolled unmixing method, LPALM. The trained LPALM is assessed on two real hyperspectral datasets and shows the best performance compared to other classical, unrolled, and autoencoder-based unmixing methods. The code of this work will be available at https://***/rhadjeres/***.