ISBN: (print) 9783031664304; 9783031664311
Recently, due to the rapid development of generative AI technologies, the use of AI-generated images has increased significantly, making the distinction between real and fake images crucial. Generated images can serve useful purposes such as producing training data and fast image generation, but the potential for misuse, such as deepfakes or the spread of false information, remains. This study explores a novel model that uses the Swin-Transformer architecture to distinguish real images from fake images generated by CNNs (convolutional neural networks) and GANs (generative adversarial networks). The Swin-Transformer, a successor to the Vision Transformer (ViT), applies the Transformer structure, which has shown outstanding performance in natural language processing, to images and demonstrates excellent pixel-level segmentation performance. Distinguishing real from fake images requires detailed pixel-level analysis, in which the Swin-Transformer exhibits higher accuracy. Improving the performance of distinguishing between real and fake images is expected to curb indiscriminate image generation, with further effects such as preventing the indiscriminate use of AI images through program-based discrimination or legal sanctions.
The work explores the use of artificial intelligence and quantum computing for UAV video processing. A method of image preprocessing, feature detection and subsequent application of convolutional networks is proposed....
This paper considers lossless image compression and presents a learned compression system that achieves state-of-the-art lossless compression performance with only 59K parameters, one or two orders of magnitude fewer than other learned systems recently proposed in the literature. The system is based on a learned pixel-by-pixel lossless image compression method, in which each pixel's probability distribution parameters are obtained by processing the pixel's causal neighborhood (i.e., previously encoded/decoded pixels) with a simple neural network comprising 59K parameters. This causality forces the decoder to operate sequentially, i.e., the neural network has to be evaluated for each pixel in turn, which increases decoding time significantly on common GPU software and hardware. To reduce the decoding time, parallel decoding algorithms are proposed and implemented. The resulting lossless image compression system is compared to traditional and learned systems in the literature in terms of compression performance, encoding and decoding times, and computational complexity.
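The notion of a causal neighborhood in the abstract above can be illustrated with a minimal sketch. It assumes raster-scan order, a square window of hypothetical size `radius`, and zero padding outside the image; none of these details are given in the abstract, and the 59K-parameter network itself is not reproduced. The helper `ideal_code_length_bits` only shows the standard relation between predicted probabilities and the achievable code length.

```python
import math


def causal_neighborhood(img, r, c, radius=1):
    """Collect pixels that precede (r, c) in raster-scan order.

    Assumption: a square window of the given radius, zero-padded
    outside the image. The paper's actual context shape may differ.
    """
    h, w = len(img), len(img[0])
    ctx = []
    for dr in range(-radius, 1):
        for dc in range(-radius, radius + 1):
            if dr == 0 and dc >= 0:
                # The current pixel and those to its right are not decoded yet.
                break
            rr, cc = r + dr, c + dc
            inside = 0 <= rr < h and 0 <= cc < w
            ctx.append(img[rr][cc] if inside else 0)
    return ctx


def ideal_code_length_bits(probabilities):
    """Ideal total code length, in bits, for symbols coded with the
    given predicted probabilities (one probability per coded pixel)."""
    return -sum(math.log2(p) for p in probabilities)
```

Because the context of each pixel contains the pixel immediately to its left, a decoder following this scheme cannot start on a pixel until the previous one is reconstructed, which is the sequential bottleneck the abstract describes.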
The surveillance and management of cargo fleets is a crucial objective of intelligent transportation systems. Load, especially overload, has a destructive effect on roads and bridges, and monitoring it can extend the life of the road surface and its structure. For low-end hardware with limited CPU power and no GPU support, this paper presents a rapid method to detect whether heavy vehicles are loaded; it then proposes a fast method for classifying load types that distinguishes soil and construction waste from other miscellaneous loads on heavy vehicles. The cargo types are classified using image processing and texture image classification. The method extracts features for statistical analysis of texture images based on gray-level co-occurrence matrices and local binary patterns. Classification is carried out by support vector machine, k-nearest neighbors, k-means, artificial neural network and random forest classifiers. A large number of positive and negative patterns have been used to train these classifiers. We compare the performance of the proposed extracted features and classifiers. The simulation results demonstrate that soil and construction waste can be effectively distinguished from other miscellaneous loads in a real-time implementation.
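The two texture descriptors named in the abstract above can be sketched as follows. This is a minimal illustration in plain Python, assuming a grayscale image already quantized to a small number of levels, a single horizontal pixel offset for the co-occurrence matrix, and the basic 8-neighbor local binary pattern; the offsets, quantization, and statistics actually used in the paper are not stated in the abstract.

```python
def glcm(img, levels, dr=0, dc=1):
    """Gray-level co-occurrence matrix for one pixel offset (dr, dc).

    Assumption: img holds integers in [0, levels); counts are not
    symmetrized or normalized.
    """
    h, w = len(img), len(img[0])
    m = [[0] * levels for _ in range(levels)]
    for r in range(h):
        for c in range(w):
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w:
                m[img[r][c]][img[rr][cc]] += 1
    return m


def glcm_contrast(m):
    """Contrast statistic: mean squared gray-level difference of pixel pairs."""
    total = sum(map(sum, m))
    n = len(m)
    return sum((i - j) ** 2 * m[i][j] for i in range(n) for j in range(n)) / total


def lbp_code(img, r, c):
    """Basic 8-neighbor local binary pattern code of an interior pixel.

    Neighbors are visited clockwise from the top-left; a bit is set
    when the neighbor is at least as bright as the center.
    """
    center = img[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code
```

Statistics such as the contrast value and a histogram of LBP codes over the image would then form the feature vector passed to a classifier. Both descriptors use only integer comparisons and counting, which is consistent with the low-end, CPU-only setting the abstract targets.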
Generative Adversarial Networks (GANs) have emerged as a powerful framework for generating reliable and transformative synthetic data in areas such as image generation, image and text synthesis, and data augmentation. Th...
ISBN: (print) 9798350377255; 9798350377262
With the continuous advancement of computer vision and deep learning technologies, object detection has emerged as a crucial research direction in the contemporary field of artificial intelligence. Among the many kinds of detection targets, small objects pose a technical challenge because of their small pixel proportion, scarce information content, and vulnerability to external interference and occlusion. This paper conducts an in-depth exploration of the key technologies for small object detection and proposes a convolutional neural network model for small object detection based on receptive fields. Through an improved Inception module and a novel receptive field enhancement structure, the convolutional neural network covers the small object areas in the image more effectively during feature extraction, thereby enhancing its ability to recognize small objects. In addition, a feature fusion module is established. The core mechanism of this module is a multi-scale approach that transforms and integrates the feature points extracted from each convolutional layer. By fusing the feature data of various network layers, the model extracts image details more thoroughly, thereby enhancing the detection accuracy for small objects. Compared with other similar methods on the VOC 2007 test set, the proposed model increases the mAP value by 0.5% to 18.4% while maintaining a faster detection speed. On the MS COCO dataset, compared with other algorithms, the proposed algorithm achieves an improvement of 1.6% to 13.2% in mAP value, which verifies the effectiveness and practical value of the algorithm.
ISBN: (print) 9798350350920
Hyperspectral image (HSI) classification is valuable in remote sensing due to its rich spectral and spatial information. In the last decade, deep learning methods, especially convolutional neural networks (CNNs), have revolutionized HSI classification by extracting abstract semantic features and maintaining the spatial structure during feature extraction. However, the efficacy of these techniques can be constrained by the limited availability of labeled samples in HSI data. To address the issue of small-sample HSI classification, a Lightweight Multiscale Feature Fusion Network (L-MFFN) is introduced. The Multiscale Feature Extraction Module (MFEM) and the enhanced Spectral-Spatial Attention Module (SSAM) are designed and combined in L-MFFN, optimizing the use of deep and shallow features. This integration improves the extraction and fusion of multiscale spectral-spatial features, enhancing classification performance. The proposed model demonstrates state-of-the-art performance across two HSI datasets and stands out in situations with limited labeled samples, highlighting its capability to effectively tackle the challenge of small-sample HSI classification.
This work offers a comprehensive investigation of sentiment analysis in social media communication through the integration of deep learning techniques with a natural language processing (NLP) methodology. The goal of ...
ISBN: (print) 9798350359329; 9798350359312
In augmented reality (AR) applications, it is challenging to generate virtual object shadows while maintaining the precision and consistency of virtual and real areas. To this end, we propose a learnable weighted recurrent generative adversarial network (LRGAN) for end-to-end shadow generation. Without any additional computational overhead, LRGAN only needs to analyze the background context to create a bridge between the target shadows and the background. Our model incorporates multiple progressive steps to recurrently compute precise reference masks, based on which a fine-grained shadow generation module generates the shadows. A learnable weighted fusion module, which normalizes pixel values to deal with pixel overflow, fuses the generated shadows with the original image. In addition, we combine module-wise training with training of the whole model. Experimental results show that the proposed LRGAN not only improves the plausibility of shadow location and shape but also achieves color harmony in the shadow areas. In the absence of other prior knowledge or post-processing, it outperforms state-of-the-art end-to-end methods.
Plant diseases pose significant challenges to agricultural productivity, impacting both the quality and quantity of crop yields. Early detection and effective management of these diseases are essential for mitigating ...