Computer vision gives us the ability to detect and analyze human actions from images and videos. Artificial Intelligence combined with computer vision can be used to perform certain tasks based on the information dete...
Jingdezhen ceramics have a long history and are world-famous, and are therefore often imitated. Because current ceramic anti-counterfeiting traceability technology is not precise enough, a new treat...
ISBN: (print) 9781510673991; 9781510673984
Effective preprocessing of image data plays a pivotal role in enhancing discriminative modeling in downstream machine learning tasks. This study investigates the significance of adequately mapping image data into a new feature space during the preprocessing phase, emphasizing its importance for building more robust and accurate models. While traditional methods such as signal/image processing transforms have previously been explored for this purpose, this study introduces a novel approach leveraging deep learning techniques. Specifically, convolutional and pooling layers are employed to process the image data, offering a more sophisticated and adaptive method for feature extraction and representation. With deep learning architectures, the preprocessing phase becomes more flexible and better able to capture intricate patterns and structures within the data. In empirical evaluation, our approach demonstrates significant improvements in discriminative modeling across various traditional machine learning approaches. This highlights the effectiveness and versatility of deep learning-based preprocessing in enhancing the performance of downstream tasks and its potential to advance the field of image data processing and analysis.
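As a rough illustration of the idea in this abstract, the sketch below maps an image into a new feature space with one convolution, a ReLU, and a max-pooling step, and returns a flat vector that a traditional classifier could consume. It is not the authors' architecture: the fixed edge kernels, the ReLU, the pool size, and all function names are assumptions made for the example.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Valid (no padding) 2-D cross-correlation, the operation used by CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap: Matrix) -> Matrix:
    """Element-wise rectifier (an assumed nonlinearity, not stated in the abstract)."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool2d(fmap: Matrix, size: int = 2) -> Matrix:
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(fmap) // size
    out_w = len(fmap[0]) // size
    return [
        [
            max(fmap[i * size + a][j * size + b]
                for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def extract_features(image: Matrix, kernels: List[Matrix], pool_size: int = 2) -> List[float]:
    """Apply conv -> ReLU -> max-pool per kernel and flatten into one feature vector."""
    features: List[float] = []
    for kernel in kernels:
        pooled = max_pool2d(relu(conv2d_valid(image, kernel)), pool_size)
        features.extend(v for row in pooled for v in row)
    return features


# Hypothetical fixed kernels; in a trained network these weights would be learned.
VERTICAL_EDGE: Matrix = [[-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]]
HORIZONTAL_EDGE: Matrix = [[-1.0, -1.0, -1.0], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]

# Toy 5x5 image with a vertical edge between columns 1 and 2.
toy_image: Matrix = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
toy_features = extract_features(toy_image, [VERTICAL_EDGE, HORIZONTAL_EDGE])
```

On the toy image, only the vertical-edge kernel responds, so the resulting feature vector separates the two edge orientations; such a vector could then be passed to any traditional classifier.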
ISBN: (print) 9798350349405; 9798350349399
In this study, we introduce an intelligent Test Time Augmentation (TTA) algorithm designed to enhance the robustness and accuracy of image classification models against viewpoint variations. Unlike traditional TTA methods that apply augmentations indiscriminately, our approach selects optimal augmentations based on predictive uncertainty metrics. This selection is achieved via a two-stage process: the first stage identifies the optimal augmentation for each class by evaluating uncertainty levels, while the second stage applies an uncertainty threshold to determine when TTA would be advantageous. As a result, augmentations contribute to classification more effectively than under uniform application across the dataset. Experiments across several datasets and neural network architectures validate our approach, yielding an average accuracy improvement of 1.73% over methods that use single-view images. This research underscores the potential of adaptive, uncertainty-aware TTA in improving the robustness of image classification in the presence of viewpoint variations, paving the way for further exploration of intelligent augmentation strategies. The code is available at: https://***/olivesgatech/Intelligent-Multi-View-TTA
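The two-stage process described in this abstract might be sketched as follows. This is a minimal illustration, not the authors' released code: using Shannon entropy as the uncertainty metric, choosing the per-class augmentation by lowest mean entropy, averaging the original and augmented predictions, and every function and variable name below are assumptions.

```python
import math
from typing import Callable, Dict, List, Sequence

Model = Callable[[float], List[float]]      # input -> class probabilities
Augmentation = Callable[[float], float]     # input -> augmented input


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats (assumed uncertainty metric)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def argmax(probs: Sequence[float]) -> int:
    return max(range(len(probs)), key=probs.__getitem__)


def select_augmentation_per_class(
    model: Model,
    augmentations: Dict[str, Augmentation],
    samples: Sequence[float],
    labels: Sequence[int],
) -> Dict[int, str]:
    """Stage 1 (assumed form): per class, keep the augmentation with the lowest mean entropy."""
    scores: Dict[int, Dict[str, List[float]]] = {}
    for x, y in zip(samples, labels):
        for name, aug in augmentations.items():
            scores.setdefault(y, {}).setdefault(name, []).append(entropy(model(aug(x))))
    return {
        y: min(per_aug, key=lambda n: sum(per_aug[n]) / len(per_aug[n]))
        for y, per_aug in scores.items()
    }


def predict_with_tta(
    model: Model,
    x: float,
    best_aug: Dict[int, str],
    augmentations: Dict[str, Augmentation],
    threshold: float,
) -> int:
    """Stage 2 (assumed form): apply TTA only when single-view entropy exceeds the threshold."""
    base = model(x)
    predicted = argmax(base)
    if entropy(base) <= threshold or predicted not in best_aug:
        return predicted
    augmented = model(augmentations[best_aug[predicted]](x))
    averaged = [(a + b) / 2 for a, b in zip(base, augmented)]
    return argmax(averaged)


# Toy stand-ins: a two-class "model" on scalar inputs and two hypothetical augmentations.
def toy_model(x: float) -> List[float]:
    p = min(max(x, 0.0), 1.0)
    return [1.0 - p, p]


toy_augs: Dict[str, Augmentation] = {
    "shift_up": lambda x: x + 0.3,
    "shift_down": lambda x: x - 0.3,
}

best = select_augmentation_per_class(toy_model, toy_augs, [0.4, 0.6], [0, 1])
uncertain_pred = predict_with_tta(toy_model, 0.45, best, toy_augs, threshold=0.5)
confident_pred = predict_with_tta(toy_model, 0.95, best, toy_augs, threshold=0.5)
```

In the toy run, the input 0.45 is uncertain enough to trigger the class-specific augmentation, while 0.95 is classified from the single view alone. A real setup would replace `toy_model` with a trained classifier and the scalar shifts with viewpoint transforms.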
To ensure robust, high-capacity, and secure communication, we propose a conditional diffusion model for coverless image steganography, called CDIS, which not only generates realistic stego images but also successfully...
This research explores the effectiveness of fine-tuning the Stable Diffusion model to generate ceramic images. We constructed a ceramic image-text pair dataset to fine-tune the model, to assess the effectiveness of th...
ISBN: (print) 9783031581731; 9783031581748
India is the world's top banana producer, contributing approximately 25% of total banana production. However, exporting bananas can be a challenge because of their shelf life. To propose the best possible shelf-life extension methodology, it is important to classify bananas by variety and ripening stage to ensure sustainable growth and nutritional value. There are still not enough datasets covering different banana varieties and their respective ripening stages. Research publications from the last five years were reviewed using electronic databases such as Scopus, Google Scholar, and ResearchGate, along with publicly accessible dataset repository sites. The dataset contains images of different banana varieties at their respective ripening stages. The varieties considered include Robusta (Musa AA), Dwarf Cavendish (Musa acuminata), Nanjangud bananas, and Red bananas (Musa acuminata). The dataset contains over 41,900 processed images. In this paper, the authors provide researchers with an opportunity to develop and investigate machine learning and deep learning algorithms for predicting and extending the shelf life of banana fruits.
In aerosol printing, the conductive patterns may suffer from defects that affect their electrical performance due to equipment failure, ink viscosity, and other factors. OpenCV, an open-source computer vision library ...
A fuzzy image is one degraded by blur, noise, or distortion introduced during acquisition, transmission, or processing. Restoration of fuzzy images is an important problem in the field o...
ISBN: (print) 9798400717499
Accurate image-based prediction is critical for diagnosing breast cancer. Current diagnostic practices depend on interpreting a variety of data types, such as pathology reports, MRI, and ultrasound images, making accurate interpretation of these images by physicians essential. This study aims to enhance diagnostic accuracy by developing a high-performance tool based on a deep learning model, the Vision Transformer (ViT_B_16). Five distinct breast cancer image datasets were selected for model training, optimizing the ViT_B_16 Transformer model's parameters. The results showed that the Transformer model achieved an accuracy of 0.99 and a loss of 0.01. Additionally, on mixed-type datasets, the Transformer model showed significant potential for breast cancer diagnosis, outperforming the FasterRCNN model.