ISBN (print): 9798350349405; 9798350349399
With the rapid advancement of generative models, detectors for AI-generated images have become an increasingly necessary technology in computer vision and have attracted significant research attention. Such a detector aims to determine whether an image was captured by an imaging system (e.g., a digital camera) or generated by AI techniques. Despite the promising performance of recent fake-image detection methods, they are typically trained on millions of redundant images with similar characteristics, which makes training inefficient. Furthermore, the performance of existing detectors often deteriorates when the training dataset is imbalanced. To address these challenges, we propose a novel AI-generated image detector based on dynamic aggregation and information compression with the Wasserstein distance. Experimental results show that our proposed method significantly outperforms state-of-the-art models that generalize across different generative models, with an increase of +1.86% in average accuracy and +0.14% in average precision, while substantially reducing the training time. On imbalanced datasets, our proposed method yields a +14.46% accuracy improvement, demonstrating its robustness to class imbalance.
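The abstract does not say how the Wasserstein distance enters the detector, so the following is only a minimal sketch of the distance itself, not of the authors' method. It assumes two one-dimensional samples of equal size with uniform weights, in which case the Wasserstein-1 distance reduces to the mean absolute difference between the sorted values. The function name `wasserstein_1d` and the sample values are hypothetical.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1D empirical samples.

    Assumption: both samples carry uniform weights and have the same length.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # In 1D with equal weights, the optimal transport plan pairs the
    # i-th smallest value of xs with the i-th smallest value of ys.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


# Hypothetical feature values for two groups of images.
group_a = [0.1, 0.4, 0.2]
group_b = [0.6, 0.9, 0.7]
print(wasserstein_1d(group_a, group_b))
```

A small distance means the two samples are distributed similarly, which is one way redundancy between groups of training images could be quantified.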
ISBN (print): 9798350351491; 9798350351484
Tiny Machine Learning is evolving rapidly in the context of edge computing and intelligent Internet of Things (IoT) devices. This paper investigates the potential of model compression techniques for enhancing energy efficiency in IoT edge devices. As an application, hand gesture recognition based on Electrical Impedance Tomography data was used. To achieve efficient and accurate machine learning on battery-powered devices, three model compression techniques (global pruning, Knowledge Distillation (KD), and quantization) were investigated and applied to a 1D convolutional neural network. Global pruning slightly improved test accuracy and increased the model's sparsity, while the model size remained at 284.45 kB. With KD, the model size decreased significantly to 75.042 kB, with negligible impact on accuracy. In contrast, 8-bit quantization reduced the model size to 78.434 kB, but at the cost of a substantial 13.75% decrease in accuracy. Each model compression technique contributes to reducing model size or increasing sparsity, which is particularly beneficial for deploying deep learning models on resource-constrained IoT devices.
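One plausible reading of why global pruning raises sparsity without shrinking the file is that pruned weights are set to zero but still stored densely. The sketch below illustrates global magnitude pruning under that assumption. It is not the paper's implementation; the function names, the flat-list weight layout, and the example values are hypothetical.

```python
def global_magnitude_prune(layers, fraction):
    """Zero the smallest-magnitude `fraction` of weights across all layers.

    `layers` is a list of flat weight lists. One threshold is shared by
    every layer, which is what makes the pruning "global".
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    magnitudes = sorted(abs(w) for layer in layers for w in layer)
    k = int(len(magnitudes) * fraction)
    if k == 0:
        return [list(layer) for layer in layers]
    threshold = magnitudes[k - 1]
    # Ties at the threshold are pruned too, so slightly more than
    # `fraction` of the weights may end up zeroed.
    return [[0.0 if abs(w) <= threshold else w for w in layer]
            for layer in layers]


def sparsity(layers):
    """Fraction of weights that are exactly zero."""
    total = sum(len(layer) for layer in layers)
    zeros = sum(1 for layer in layers for w in layer if w == 0.0)
    return zeros / total


weights = [[0.5, -0.01, 0.3], [0.02, -0.9, 0.04, 0.7, 0.1]]
pruned = global_magnitude_prune(weights, 0.5)
print(pruned)
print(sparsity(pruned))
# The number of stored values is unchanged, so a dense file keeps its size.
print(sum(len(layer) for layer in pruned))
```

Turning the added sparsity into a smaller footprint would require a sparse storage format or a compression step, neither of which the abstract describes.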
ISBN (print): 9798350350920
In the label manufacturing industry, accurately identifying label positioning defects is of great significance for ensuring product quality. However, traditional defect detection techniques have faced significant challenges due to the diversity of label materials and the complexity of imaging conditions. In recent years, semantic segmentation technology has made breakthrough progress in the field of image processing, providing innovative solutions to the aforementioned issues. This study proposes a semantic segmentation framework that integrates ResNet50 and Global Context (GCNet) modules, aiming to accurately identify label positioning defects. Through the training of a deep learning model, this framework achieves precise delineation of the label area and effectively determines positioning defects. The experiment was conducted using a dataset composed of 1653 manually annotated images, and the results revealed that the proposed algorithm has significant advantages in terms of segmentation accuracy and algorithm robustness. The algorithm developed in this study demonstrates high efficiency in detecting label positioning defects, providing an innovative technical approach for automated quality control systems.
ISBN (print): 9781728190549
In recent years, with the rise of deep learning, combining time-frequency analysis with deep learning to recognize radar signals has become a hot research topic. For the application of deep learning in radar signal recognition, however, the discovery of adversarial examples poses a tremendous security risk. Experiments indicate that radar signal recognition models based on time-frequency images are less vulnerable to adversarial attack methods that operate in the time domain. Therefore, we propose a cross-modal attack (CMA). First, we establish a surrogate model architecture locally, comprising three parts: time-frequency analysis, data quantization, and a classifier. Second, we train this architecture as a whole and generate adversarial examples using the trained surrogate model parameters and adversarial attack methods. Finally, we carry out the CMA on the radar signal recognition model based on the time-frequency image by adding adversarial perturbations to the original signal. According to experimental results, the CMA reduces the model's recognition accuracy by more than 30% when the perturbation strength is 0.1 and the signal-to-noise ratio is 0 dB, demonstrating good attack performance.
ISBN (print): 9798350369458; 9798350369441
One of the most important applications of UAVs is person detection for security or rescue tasks. The goal of this paper is to develop, test, and compare the performance of two new neural networks based on the transformer architecture: Detection Transformer and Vision Transformer. Two datasets were used: a custom dataset for testing and COCO for training. The results are promising given the difficulty of person detection at a distance.
Deep-learning-enhanced channel estimation is presented for OFDM communication in the Internet of Vehicles (IoV). Using image enhancement and denoising, the channel responses at unknown positions are predic...
This paper proposes a radio frequency signal identification method based on a deep neural network. First, the method abstracts the radio frequency signal into a plane diagram and converts the radio frequency signal id...
ISBN (print): 9798350363999; 9798350364002
In today's interconnected world, ensuring robust security and privacy for IoT-enabled systems is essential because of the vulnerabilities introduced by the proliferation of IoT devices. Practical strategies, including data hiding for secure transmission, are crucial for protecting sensitive information and critical infrastructure. Addressing challenges such as device authentication and data access control is vital. As digital images have become the most commonly used media in IoT applications, many practitioners use them to secure confidential data transmission through steganography. To address data privacy issues in IoT applications, this paper introduces a novel steganography algorithm designed to minimize image distortion while optimizing the use of embeddable pixels in the cover image. The proposed method converts secret data into a simplified format and sorts the binary representations of secret bits in ascending order, reducing the size of the secret data and significantly decreasing the payload. Experimental results demonstrate that this method outperforms existing techniques, achieving a Peak Signal-to-Noise Ratio (PSNR) of 67.42 dB and an average Structural Similarity Index Measure (SSIM) of 0.99, ensuring a high-quality stego image. This advancement is believed to significantly enhance the security and privacy of IoT data transmissions.
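To put the reported PSNR in context, the sketch below computes PSNR between a cover image and a stego image using the standard definition, 10·log10(MAX² / MSE). It assumes 8-bit grayscale images given as nested lists. The function name `psnr` and the tiny example images are hypothetical and are not taken from the paper.

```python
import math


def psnr(cover, stego, max_value=255):
    """Peak signal-to-noise ratio in dB between two same-size images.

    Assumption: pixels are grayscale intensities in [0, max_value].
    """
    flat_cover = [p for row in cover for p in row]
    flat_stego = [p for row in stego for p in row]
    if len(flat_cover) != len(flat_stego) or not flat_cover:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((a - b) ** 2
              for a, b in zip(flat_cover, flat_stego)) / len(flat_cover)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 cover image with one pixel changed by one level.
cover = [[10, 20], [30, 40]]
stego = [[11, 20], [30, 40]]
print(round(psnr(cover, stego), 2))
```

Under the same 8-bit assumption, a PSNR of 67.42 dB corresponds to a mean squared error of about 0.012 per pixel.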
ISBN (print): 9798350386288; 9798350386271
We introduce generative adversarial networks (GANs) into free-space laser communication, performing laser-spot image restoration with a computer-vision method. After training, the average Pearson correlation coefficient, average peak signal-to-noise ratio, and mean squared error of the restored laser-spot images reach 0.964, 27.713 dB, and 0.003, respectively. According to the numerical results, the GAN method can effectively resist the interference of atmospheric turbulence and accurately restore the amplitude distribution of the laser captured from the free-space channel.
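The Pearson correlation coefficient reported above can be computed between the flattened pixel intensities of a restored image and its reference. The following is a minimal sketch of the standard formula. The function name `pearson` and the sample intensities are hypothetical, and the abstract does not state how the authors flattened or normalized their images.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical flattened intensities: reference spot vs. restored spot.
reference = [0.0, 0.2, 0.9, 0.3, 0.1]
restored = [0.1, 0.2, 0.8, 0.4, 0.1]
print(pearson(reference, restored))
```

A value near 1 indicates that the restored amplitude distribution follows the reference closely, up to an affine change in brightness.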
ISBN (print): 9781728198354
Conventional video coding methods have been developed based on the human visual system (HVS). However, in recent years, video has come to occupy a huge portion of internet traffic, and the amount of video data for machine consumption has increased rapidly due to the progress of neural networks. This paper proposes a novel machine-attention-based video coding method for machines. Inspired by saliency-driven research, we first extract attention regions, which strongly affect machine vision performance, from the object detection network. Subsequently, a maximum a posteriori (MAP)-based bit allocation method is applied to assign more bits to the attention regions. Our proposed method helps to maintain high machine vision performance while reducing the bitrate. Experimental results show that our proposed method achieves up to 34.89% Bjøntegaard delta (BD)-rate reduction for the video dataset and up to 44.70% BD-rate reduction for the image dataset compared to state-of-the-art video coding technology.