ISBN (print): 9798350350494; 9798350350500
Weakly-Supervised Semantic Segmentation (WSSS) with image-level labels commonly uses Class Activation Maps (CAM) to generate pseudo-labels. However, Convolutional Neural Networks (CNNs), with their limited local receptive fields, often struggle to identify entire object regions. Recently, the Vision Transformer (ViT) architecture has been employed instead of CNNs to capture long-range feature dependencies through the self-attention mechanism. Despite its advantages, ViT tends to overlook local feature details, leading to low-quality attention maps with unclear object details. This paper introduces a novel method that enhances the local details in attention maps by leveraging local patches. These local patches are selected from regions that are more likely to contain the desired objects. By using these local patches during the training and generation stages, the model yields more detailed attention maps. Extensive experiments on the PASCAL VOC 2012 benchmark dataset demonstrate the efficacy of the proposed approach. The results show a significant improvement (+2.6% mIoU) with minimal computational overhead, underscoring the potential of the proposed method for WSSS.
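The abstract does not state how "regions that are more likely to contain the desired objects" are scored, so the following is only an illustrative sketch of one simple possibility: rank non-overlapping patches by their mean activation and keep the top k. The function name `top_k_patches`, the non-overlapping grid, and the mean-activation criterion are all assumptions made for illustration, not the paper's method.

```python
import numpy as np

def top_k_patches(activation, patch_size, k):
    """Return the (top, left) corners of the k patches with the highest mean activation.

    Hypothetical selection rule: the source paper does not specify its criterion.
    """
    height, width = activation.shape
    scored = []
    # Walk a non-overlapping grid of square patches over the activation map.
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            window = activation[top:top + patch_size, left:left + patch_size]
            scored.append((float(window.mean()), top, left))
    # Highest mean activation first.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(top, left) for _, top, left in scored[:k]]
```

A call such as `top_k_patches(cam, patch_size=4, k=2)` returns patch corners that could then be cropped from the input image and fed to the model as additional local views.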
ISBN (print): 9798350350494; 9798350350500
Breast cancer is one of the most dangerous diseases among women. Of the methods used to diagnose it, imaging and computer-aided systems are among the more common. In these systems, one of the most important steps is preprocessing: removing unnecessary areas of the images and extracting the chest area. In this paper, we present a method that consists of preprocessing, feature extraction, and a machine learning classifier. In the preprocessing step, we propose a method to extract the region of interest in both angles of mammography images. The proposed method applies gamma-correction thresholding to the images and obtains two binary images based on the proposed threshold, computed using the Otsu method. Results show that the proposed method successfully removes the chest muscle with 98% accuracy. Next, in the feature extraction phase, we use three different methods to extract features. Finally, we use an Extra Trees classifier to classify mammography images as normal or abnormal. By incorporating the block-based feature extraction method, we achieve 98% classification accuracy. Overall, our approach demonstrates the effectiveness of preprocessing and feature extraction for diagnosing breast cancer from mammography images.
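As a rough illustration of the two building blocks named in this abstract, the sketch below applies gamma correction to an 8-bit grayscale image and then binarizes it with an Otsu threshold. How the paper combines them to obtain its two binary images is not described here, so the use of two gamma values in `binary_masks` (and the values 0.5 and 2.0) is purely an assumption.

```python
import numpy as np

def gamma_correct(image, gamma):
    """Apply power-law (gamma) correction to an 8-bit grayscale image."""
    normalized = image.astype(np.float64) / 255.0
    return np.round(255.0 * normalized ** gamma).astype(np.uint8)

def otsu_threshold(image):
    """Return the gray level t that maximizes the between-class variance."""
    hist = np.bincount(image.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # probability of the class <= t
    mu = np.cumsum(prob * np.arange(256))    # cumulative mean up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0                # one class is empty at these levels
    return int(np.argmax(sigma_b))

def binary_masks(image, gammas=(0.5, 2.0)):
    """Assumed pipeline: one Otsu-binarized mask per (hypothetical) gamma value."""
    masks = []
    for gamma in gammas:
        corrected = gamma_correct(image, gamma)
        masks.append(corrected > otsu_threshold(corrected))
    return masks
```

A gamma below 1 brightens dark regions and a gamma above 1 darkens them, so the two masks can differ on mid-intensity tissue; this is one plausible reason to compute more than one binary image, though the abstract does not say so.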
ISBN (print): 9798400718212
The detection of appearance defects in cigarettes is a key quality control step in tobacco production. Traditional detection methods rely on manual visual inspection, which suffers from strong subjectivity, low efficiency, and susceptibility to errors. This paper proposes a cigarette appearance defect detection method based on Artificial Intelligence (AI), aiming to improve detection efficiency and accuracy. The method uses AI-based image processing and analysis, combined with convolutional neural network algorithms, to achieve automated detection and classification of cigarette appearance defects. The experimental results indicate that this method has higher accuracy and detection efficiency.
To enhance the real-time performance and efficiency of radio signal processing, this paper takes OpenWifi as an example to explore the application of FPGA technology in communication systems. By implementing key signa...
ISBN (print): 9798350386660; 9798350386677
This paper proposes an algorithm that combines image encryption and steganography, providing double protection for images that need to be kept secret. First, a new Three-Dimensional (3-D) fractional-order memristive Hindmarsh-Rose (HR) neuron model is constructed. It has highly complex and unpredictable dynamic behavior and can produce complex state transition phenomena such as periodicity, period-doubling bifurcation, chaos, and hyperchaos. Then, the finite-field bidirectional diffusion algorithm is used to perform nonrepetitive scrambling and bidirectional diffusion of the image. Finally, the Least Significant Bit (LSB) algorithm is used to hide the encrypted image in a carrier image that is transmitted normally. Experimental results show that the algorithm has a large key space, is robust, has high computational time efficiency, and can resist various attacks.
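The final LSB step can be sketched in a few lines. This is a generic one-bit-per-pixel LSB scheme, not the paper's exact embedding layout: the payload here is an arbitrary byte string standing in for the encrypted image, and the function names `lsb_embed` and `lsb_extract` are illustrative.

```python
import numpy as np

def lsb_embed(carrier, payload):
    """Hide payload bytes in the least significant bits of a uint8 carrier image."""
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    flat = carrier.flatten()  # flatten() returns a copy, so the carrier is untouched
    if bits.size > flat.size:
        raise ValueError("payload too large for carrier")
    # Clear each target pixel's lowest bit, then write one payload bit into it.
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits
    return flat.reshape(carrier.shape)

def lsb_extract(stego, num_bytes):
    """Recover num_bytes of payload from the lowest bit of the first pixels."""
    bits = stego.flatten()[:num_bytes * 8] & 1
    return np.packbits(bits).tobytes()
```

Because only the lowest bit of each pixel changes, every pixel of the stego image differs from the carrier by at most 1, which is why the carrier still looks like a normally transmitted image.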
In this paper, a virtual view image post-processing method using background information is proposed. Firstly, a depth-based method is used to deal with overlapping between two images. Secondly, the artifacts of the le...
We propose a pixel-level vibration imaging method for high frame rate (HFR)-video-based localization of flying objects with large movement. When the ratio of the translation speed of a target to its vibration frequenc...
ISBN (print): 9798350372977; 9798350372984
This study introduces a machine vision system integrated into cyber-physical systems (CPS) for enhanced industrial control. The system employs a specialized script for real-time image processing and edge detection, with a focus on precision and speed. Results showcase the system's rapid processing capabilities and high-accuracy feature detection, facilitated by machine learning algorithms that enable adaptability and iterative improvement. The system distinguishes itself by not only providing rapid and accurate feature recognition but also by outputting precise coordinates, crucial for micron-level manufacturing precision. An intuitive human-machine interface ensures seamless operation within industrial workflows. This integration significantly improves automated quality control and operational efficiency, demonstrating the system's potential to advance smart manufacturing in line with Industry 4.0 standards.
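The abstract does not name the edge detector used by its script. As a minimal sketch of "edge detection that outputs coordinates", the example below substitutes a plain Sobel gradient with a fixed threshold; the function name `sobel_edge_coordinates` and the `threshold` parameter are assumptions, and pixel indices stand in for whatever calibrated coordinates the real system reports.

```python
import numpy as np

def sobel_edge_coordinates(image, threshold):
    """Return an (N, 2) array of (row, col) pixels whose Sobel magnitude exceeds threshold."""
    img = image.astype(np.float64)
    padded = np.pad(img, 1, mode="edge")  # replicate borders so output matches input size
    # Horizontal gradient: right column minus left column, weighted 1-2-1 vertically.
    gx = (padded[:-2, 2:] + 2 * padded[1:-1, 2:] + padded[2:, 2:]
          - padded[:-2, :-2] - 2 * padded[1:-1, :-2] - padded[2:, :-2])
    # Vertical gradient: bottom row minus top row, weighted 1-2-1 horizontally.
    gy = (padded[2:, :-2] + 2 * padded[2:, 1:-1] + padded[2:, 2:]
          - padded[:-2, :-2] - 2 * padded[:-2, 1:-1] - padded[:-2, 2:])
    magnitude = np.hypot(gx, gy)
    rows, cols = np.nonzero(magnitude > threshold)
    return np.column_stack((rows, cols))
```

Converting these pixel indices to physical units would require a camera calibration step, which is outside the scope of this sketch.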
ISBN (print): 9798350386660; 9798350386677
Steganography is a technology that conceals secret information in a carrier in an imperceptible way. Traditional embedding-based steganography methods are easily detected by steganalysis because of the inevitable modifications they make. Steganography-without-Embedding (SwE) methods benefit from the fact that they do not modify the carrier and can therefore resist traditional steganalysis, which distinguishes stego images from cover images. However, in practice, SwE still faces the challenge of fake image *** this paper, we propose an SwE method based on adversarial examples to improve steganographic security against both CS-steganalysis (distinguishing cover from stego images) and RF-steganalysis (distinguishing real from fake images). Two cases of RF-steganalysis, white-box and black-box, are considered. To improve the learning ability of the generator and the information recovery ability, we introduce an attention module in both the generator and the extractor. Experimental results show that the proposed method achieves a considerable extraction rate and can effectively resist both CS-steganalysis and RF-steganalysis.
ISBN (print): 9798350350494; 9798350350500
Emotion AI is a research domain that aims to understand human emotions from visual or textual data. However, existing methods often ignore the influence of cultural diversity on emotional interpretation. In this paper, we propose a multi-modal deep learning model that integrates cultural awareness into emotion recognition. Our model uses images as the primary data source and comments from individuals across different regions as the secondary data source. Our results show that the model achieves robust performance across various scenarios. Our contribution is a novel fusion approach that bridges cultural gaps and fosters a more nuanced understanding of emotions. To the best of our knowledge, few works use this approach for Emotion AI, combining different types of data sources and models. We evaluate our model on the ArtELingo dataset, which contains image-comment pairs with Chinese, Arabic, and English annotations. In the evaluation phase, the model that merges image and text features reaches 80% recognition accuracy.