This paper considers lossless image compression and presents a learned compression system that achieves state-of-the-art lossless compression performance with only 59K parameters, one or two orders of magnitude fewer than other learned systems recently proposed in the literature. The system is based on a learned pixel-by-pixel lossless image compression method, in which each pixel's probability distribution parameters are obtained by processing the pixel's causal neighborhood (i.e., previously encoded/decoded pixels) with a simple neural network comprising 59K parameters. This causality forces the decoder to operate sequentially: the neural network has to be evaluated for each pixel in turn, which increases decoding time significantly with common GPU software and hardware. To reduce the decoding time, parallel decoding algorithms are proposed and implemented. The resulting lossless image compression system is compared to traditional and learned systems in the literature in terms of compression performance, encoding and decoding times, and computational complexity.
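The sequential dependency described above can be illustrated with a minimal sketch. It assumes a square causal window of a hypothetical `radius` in raster-scan order. The wavefront grouping is one well-known way to find pixels whose causal neighborhoods are already fully decoded. It is not necessarily the parallel decoding algorithm the paper proposes, and the function names are illustrative only.

```python
def causal_neighborhood(height, width, row, col, radius):
    """Coordinates of pixels decoded before (row, col) in raster order,
    restricted to a window of the given radius (assumed window shape)."""
    coords = []
    for r in range(row - radius, row + 1):
        for c in range(col - radius, col + radius + 1):
            if r == row and c >= col:
                break  # the current pixel and those to its right are not decoded yet
            if 0 <= r < height and 0 <= c < width:
                coords.append((r, c))
    return coords


def wavefront_schedule(height, width, radius):
    """Group pixels into steps so that every pixel's causal neighborhood
    lies in strictly earlier steps; pixels within a step are independent."""
    steps = {}
    for r in range(height):
        for c in range(width):
            t = c + (radius + 1) * r  # wavefront index (illustrative choice)
            steps.setdefault(t, []).append((r, c))
    return [steps[t] for t in sorted(steps)]


if __name__ == "__main__":
    schedule = wavefront_schedule(4, 5, radius=1)
    print(len(schedule), "parallel steps for", 4 * 5, "pixels")
```

With this grouping, the number of network evaluations that must happen one after another equals the number of steps rather than the number of pixels, at the cost of a small per-step batch.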
ISBN: (Print) 9783031664304; 9783031664311
Recently, due to the rapid development of generative AI technologies, the use of AI-generated images has increased significantly, making the distinction between real and fake images crucial. Generated images may be used in various ways, such as for training data and fast image generation, but the potential for misuse, such as deepfakes or the spread of false information, remains. This study explores a novel model based on the Swin-Transformer architecture to distinguish real images from fake images generated by CNNs (convolutional neural networks) and GANs (generative adversarial networks). The Swin-Transformer, a successor to the Vision Transformer (ViT), applies the Transformer structure, which has shown outstanding performance in natural language processing, to images and demonstrates excellent pixel-level segmentation performance. Distinguishing real from fake images requires detailed pixel-level analysis, in which the Swin-Transformer exhibits higher accuracy. Improving the performance of distinguishing between real and fake images is expected to limit indiscriminate image generation and to bring further benefits, such as preventing the indiscriminate use of AI images through program-based discrimination and legal sanctions.
Generative Adversarial Networks (GANs) have emerged as a powerful framework for generating reliable and transformative synthetic data in areas such as image generation, image and text synthesis, and data augmentation. Th...
The surveillance and management of cargo fleets is a crucial objective of intelligent transportation systems. Load, especially overload, has a destructive effect on roads and bridges, and monitoring it can extend the life of the road surface and its structure. For low-end hardware with limited CPU power and no GPU support, this paper presents a rapid method to detect whether heavy vehicles are loaded; it then proposes a fast method for classifying load types to distinguish soil and construction waste from other miscellaneous loads on heavy vehicles. The paper classifies cargo types using image processing and texture image classification. The method extracts features for statistical analysis of texture images based on gray-level co-occurrence matrices and local binary patterns. Classification is carried out by support vector machine, k-nearest neighbors, k-means, artificial neural network, and random forest classifiers. A large number of positive and negative patterns were used to train these classifiers. We compare the performance of the extracted features and the classifiers. The simulation results demonstrate that soil and construction waste can be distinguished effectively from other miscellaneous loads in a real-time implementation.
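As a rough illustration of the two texture descriptors named above, the following sketch computes a basic 8-neighbor local binary pattern code and a gray-level co-occurrence matrix with one derived statistic (contrast). The neighbor ordering, the offset, and the choice of contrast are assumptions made for illustration; the abstract does not state which variants or statistics the authors use.

```python
def lbp_code(patch):
    """8-bit local binary pattern of the center pixel of a 3x3 patch."""
    center = patch[1][1]
    # Neighbors visited clockwise from the top-left corner (one common convention).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def glcm(image, levels, dr=0, dc=1):
    """Count co-occurrences of gray levels for pixel pairs offset by (dr, dc)."""
    matrix = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for r in range(height):
        for c in range(width):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < height and 0 <= c2 < width:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def glcm_contrast(matrix):
    """Contrast: squared gray-level difference weighted by co-occurrence frequency."""
    total = sum(sum(row) for row in matrix)
    n = len(matrix)
    return sum((i - j) ** 2 * matrix[i][j] for i in range(n) for j in range(n)) / total


if __name__ == "__main__":
    print(lbp_code([[5, 9, 1], [4, 6, 7], [2, 8, 3]]))
    print(glcm_contrast(glcm([[0, 0, 1], [1, 1, 0]], levels=2)))
```

A histogram of LBP codes over an image region, together with a few GLCM statistics, would form the kind of feature vector that classifiers such as those listed above take as input.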
ISBN: (Print) 9798350350920
Hyperspectral image (HSI) classification is valuable in remote sensing because HSI data carry rich spectral and spatial information. In the last decade, deep learning methods, especially convolutional neural networks (CNNs), have revolutionized HSI classification by extracting abstract semantic features while maintaining the spatial structure during feature extraction. However, the efficacy of these techniques can be constrained by the limited availability of labeled samples in HSI data. To address small-sample HSI classification, a Lightweight Multiscale Feature Fusion Network (L-MFFN) is introduced. The Multiscale Feature Extraction Module (MFEM) and the enhanced Spectral-Spatial Attention Module (SSAM) are designed and combined in L-MFFN, optimizing the use of deep and shallow features. This integration improves the extraction and fusion of multiscale spectral-spatial features, enhancing classification performance. The proposed model demonstrates state-of-the-art performance on two HSI datasets and stands out when labeled samples are limited, highlighting its ability to tackle small-sample HSI classification.
This work offers a comprehensive investigation of sentiment analysis in social media communication through the integration of deep learning techniques with a natural language processing (NLP) methodology. The goal of ...
ISBN: (Print) 9798350386363
The proceedings contain 19 papers. The topics discussed include: bacterial colony counter using different image processing algorithms; detection of facial expressions based on three feature points using image processing with artificial neural networks; YOLO-based helmet detection system for safety compliance in oil and gas industry; virtual sample generation using conditional adversarial network with latent spaces as noise inputs; IoT integrated conveyor centralized system; weighted subgraph knowledge distillation for graph model compression; and verifying the effectiveness of using virtual characters for the promotion of a university department.
Plant diseases pose significant challenges to agricultural productivity, impacting both the quality and quantity of crop yields. Early detection and effective management of these diseases are essential for mitigating ...
ISBN: (Print) 9798350359329; 9798350359312
In augmented reality (AR) applications, it is challenging to generate virtual object shadows while maintaining the precision and consistency of virtual and real areas. To this end, we propose a learnable weighted recurrent generative adversarial network (LRGAN) for end-to-end shadow generation. Without any additional computational overhead, LRGAN only needs to analyze the background context to create a bridge between the target shadows and the background. Our model uses multiple progressive steps to recurrently compute precise reference masks, from which a fine-grained shadow generation module generates the shadows. A learnable weighted fusion module, which can normalize pixel values to deal with pixel overflow, fuses the generated shadows with the original image. In addition, we combine module-wise training with training of the whole model. Experimental results show that the proposed LRGAN not only improves the plausibility of shadow location and shape but also achieves color harmony in the shadow areas. Without other prior knowledge or post-processing, it outperforms state-of-the-art end-to-end methods.
The growing amount of data collected by Earth Observation (EO) satellites requires new processing procedures able to manage huge quantities of information. Artificial intelligence (AI) and deep learning (DL) can provide advanced information thanks to their ability to extract valuable information from complex data. With specific hardware platforms, these algorithms can also be used in space, opening the possibility of new procedures for intelligent data processing. The European Space Agency phi-Sat-2 mission was designed to demonstrate the benefits of using AI in space by running AI-based applications on board a CubeSat. We present here the convolutional autoencoder-based algorithm developed for on-board lossy image compression on the phi-Sat-2 mission and provide a first benchmark addressing a real space mission and a new AI-based end-to-end image compression architecture. Image compression is a crucial application that saves transmission bandwidth and storage: images acquired by the sensor can be compressed on board and sent to the ground, where they are reconstructed. DL algorithms have already been applied successfully to image compression; however, performance degradation may occur in a representative on-board environment. Therefore, besides analyzing the results for the local hardware environment, this article investigates the performance variation in the on-board setting. A further innovation is the introduction of an applicative metric for evaluating the compression, which assesses the applicability of the reconstructed images to other tasks. This metric complements the more traditional ones based on the similarity between the original and reconstructed images.
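For context on the traditional similarity-based metrics mentioned above, the sketch below computes peak signal-to-noise ratio (PSNR), one widely used measure of this kind. The abstract does not say which similarity metrics the authors report, so PSNR and the 8-bit `max_value` default are assumptions for illustration.

```python
import math


def psnr(original, reconstructed, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no reconstruction error
    return 10 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    print(psnr([10, 20, 30, 40], [12, 18, 30, 40]))
```

A metric like this only measures pixel-level fidelity, which is why an application-level metric of the kind the abstract describes can be a useful complement.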