Privacy violations are common in our technology-driven world, where almost everyone interacts with internet-connected electronic devices. To address this, cryptography has emerged, concealing data during transmission ...
ISBN (print): 9798350349405; 9798350349399
We propose content-aware supervision (CAS) techniques for diffusion-based restoration of an extremely compressed background for video coding for machines (VCM). First, we develop a CAS block that exploits prior information in an input image to reconstruct the noisy image, which is used as the input for the pretrained diffusion model. Then, we construct a refinement block that guides the pretrained diffusion model at each diffusion step by incorporating a degradation model and correction gradient estimation. Experimental results demonstrate that the proposed algorithm outperforms state-of-the-art algorithms.
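The abstract does not specify the degradation model or how the correction gradient is estimated. As a rough illustration of the general idea only, the sketch below applies one generic data-consistency gradient step to a 1-D signal. The pair-averaging operator `degrade`, its adjoint `degrade_adjoint`, and the `step` parameter are all hypothetical choices made for this example and are not taken from the paper.

```python
def degrade(x):
    """Hypothetical degradation model: average each pair of neighboring samples."""
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x), 2)]


def degrade_adjoint(residual):
    """Adjoint of degrade: spread each residual value over its two source samples."""
    out = []
    for v in residual:
        out.extend([v / 2.0, v / 2.0])
    return out


def correction_step(x, y, step=1.0):
    """One gradient step on 0.5 * ||degrade(x) - y||^2 (illustrative, not the paper's block)."""
    residual = [a - b for a, b in zip(degrade(x), y)]
    grad = degrade_adjoint(residual)
    return [xi - step * gi for xi, gi in zip(x, grad)]


if __name__ == "__main__":
    estimate = [0.0, 0.0, 0.0, 0.0]   # current estimate at some diffusion step
    observed = [1.0, 2.0]             # degraded observation
    print(correction_step(estimate, observed, step=2.0))
```

In a guided sampler, a step of this kind would typically be interleaved with the denoising updates of the pretrained model, pulling each intermediate estimate toward agreement with the degraded observation.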
This paper presents a new, compact, single-device array antenna for Ka-band internet applications. The proposed model operates without routing, streamlining deployment and usage. The antenna's features are enginee...
ISBN (print): 9798350344868; 9798350344851
In the field of image editing, Null-text Inversion (NTI) enables fine-grained editing while preserving the structure of the original image by optimizing null embeddings during the DDIM sampling process. However, the NTI process is time-consuming, taking more than two minutes per image. To address this, we introduce a method that maintains the principles of NTI while accelerating the image editing process. We propose the WaveOpt-Estimator, which determines the text optimization endpoint based on frequency characteristics. By using wavelet transform analysis to identify the image's frequency characteristics, we can limit text optimization to specific timesteps during the DDIM sampling process. Following the Negative-Prompt Inversion (NPI) concept, a target prompt representing the original image serves as the initial text value for optimization. This approach maintains performance comparable to NTI while reducing the average editing time by over 80%. Our method presents a promising approach for efficient, high-quality image editing based on diffusion models.
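The abstract does not say which wavelet is used or how frequency characteristics map to an optimization endpoint. As a minimal sketch of the wavelet-analysis step alone, the code below runs a one-level 2-D Haar transform on a grayscale image stored as a list of lists and reports the share of energy in the detail sub-bands. The function names and the choice of the Haar wavelet are assumptions for illustration.

```python
def haar_dwt2(image):
    """One-level orthonormal 2-D Haar transform; image needs even height and width."""
    approx, d_cols, d_rows, d_diag = [], [], [], []
    for r in range(0, len(image), 2):
        row_a, row_c, row_r, row_d = [], [], [], []
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            row_a.append((a + b + d + e) / 2.0)  # low-pass approximation
            row_c.append((a - b + d - e) / 2.0)  # variation across columns
            row_r.append((a + b - d - e) / 2.0)  # variation across rows
            row_d.append((a - b - d + e) / 2.0)  # diagonal variation
        approx.append(row_a)
        d_cols.append(row_c)
        d_rows.append(row_r)
        d_diag.append(row_d)
    return approx, d_cols, d_rows, d_diag


def energy(band):
    """Sum of squared coefficients in one sub-band."""
    return sum(v * v for row in band for v in row)


def high_frequency_ratio(image):
    """Fraction of total energy that falls in the three detail sub-bands."""
    approx, d_cols, d_rows, d_diag = haar_dwt2(image)
    detail = energy(d_cols) + energy(d_rows) + energy(d_diag)
    total = energy(approx) + detail
    return detail / total if total > 0 else 0.0


if __name__ == "__main__":
    flat = [[5.0] * 4 for _ in range(4)]
    checker = [[float((r + c) % 2 == 0) for c in range(4)] for r in range(4)]
    print(high_frequency_ratio(flat), high_frequency_ratio(checker))
```

A ratio of this kind is one plausible scalar summary of how much fine detail an image contains; how the WaveOpt-Estimator turns such a measurement into a timestep cutoff is not described in the abstract and is not modeled here.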
ISBN (print): 9798350303582; 9798350303599
This paper studies a lightweight image semantic communication system deployed on Internet of Things (IoT) devices. In the considered system model, devices must use semantic communication techniques to support user behavior recognition in ultimate video service with high data transmission efficiency. However, deploying semantic codecs on IoT devices is computationally expensive because of the complex calculations involved in training and running deep learning (DL) based codecs. To make semantic communication systems affordable for IoT devices, we propose an attention-based, UNet-enabled lightweight image semantic communication (LSSC) system, which achieves low computational complexity and a small model size. In particular, we first let the LSSC system train the codec at the edge server to reduce the training computation load on IoT devices. Then, we introduce the convolutional block attention module (CBAM) to extract the image semantic features and decrease the number of downsampling layers, thus reducing the floating-point operations (FLOPs). Finally, we experimentally adjust the structure of the codec and identify the optimal number of downsampling layers. Simulation results show that, compared to the baseline, the proposed LSSC system reduces the semantic codec FLOPs by 14% and the model size by 55%, at the cost of a 3% drop in accuracy. Moreover, the proposed scheme achieves higher transmission accuracy than the traditional communication scheme in the low signal-to-noise ratio (SNR) region.
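To make the link between downsampling depth and FLOPs concrete, the sketch below counts FLOPs for a toy convolutional encoder. The layer layout (one 3x3 stem convolution, then one 3x3 convolution per downsampling stage with halved resolution and doubled channels) is a hypothetical stand-in, not the LSSC codec, and the figures it produces are unrelated to the 14% reduction reported above.

```python
def conv_flops(height, width, c_in, c_out, kernel=3):
    """FLOPs of one convolution, counting each multiply-add as two operations."""
    return 2 * kernel * kernel * c_in * c_out * height * width


def encoder_flops(height, width, c_in, base_channels, num_down):
    """FLOPs of a toy encoder with num_down downsampling stages (hypothetical layout)."""
    total = conv_flops(height, width, c_in, base_channels)  # stem at full resolution
    channels = base_channels
    for _ in range(num_down):
        height, width = height // 2, width // 2             # stride-2 downsampling
        total += conv_flops(height, width, channels, 2 * channels)
        channels *= 2
    return total


if __name__ == "__main__":
    for depth in (4, 3, 2):
        print(depth, encoder_flops(32, 32, 3, 16, depth))
```

In this toy layout every removed stage removes a convolution, so the count falls monotonically with depth; a real codec would trade that saving against the loss of receptive field, which is presumably why the authors tune the depth experimentally.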
In the wake of the accelerated advancement of Internet of Things (IoT) technology, a significant volume of multimedia information is emerging as the dominant component of IoT applications. However, this information als...
ISBN (print): 9798350363999; 9798350364002
Semantic communication in wireless image transmission leverages the meaning embedded in the image data, aiming to compress, transmit, and reconstruct images based on their semantic content rather than purely pixel data. This paradigm shift allows more efficient use of bandwidth and computational resources by focusing on the key features and contextual information needed to preserve and accurately convey the essential content of the image. In this study, we present a novel Stable Diffusion-based semantic communication (SDSC) framework that demonstrates high performance, characterized by an elevated bandwidth compression ratio (BCR) and robust noise tolerance achieved by a diffusion mechanism that integrates supplementary prompts. Our approach uses pre-trained modules of a variational autoencoder (VAE) and a modified U-shaped network (UNet) to enable robust semantic encoding, decoding, and effective channel denoising. This scheme significantly enhances the system's ability to preserve data integrity and meaning in noisy environments. By introducing additional context-aware prompts during transmission, we improve the accuracy of received information and mitigate the adverse effects of interference and noise. Extensive simulations show that our framework outperforms previous models, demonstrating superior communication fidelity and efficiency under various challenging conditions.
Hyperspectral super-resolution involves combining low-resolution hyperspectral images with high-resolution multispectral images to produce a high-resolution hyperspectral image. Recently, although many methods for hyp...
Acoustic signal processing holds significant promise for real-time fish feeding intensity estimation in aquaculture. Unlike traditional methods reliant on visual cues or sensor data, acoustic analysis provides valuabl...
In order to solve the problem of low precision and low stability of deep learning network for industrial equipment fault recognition under strong background noise, a method of equipment fault image recognition based o...