ISBN (print): 9784907626488
Channel state information (CSI) acquisition is one of the major challenges in enabling high-efficiency transmission in massive MIMO systems. Limited feedback schemes, which quantize CSI into a limited number of bits and feed these bits back to the base station (BS), are widely used in real-life mobile networks for BSs to obtain downlink CSI. Because of the constraint on feedback overhead, practical limited feedback schemes usually have coarse feedback granularity and introduce severe nonlinear quantization error. It is hard for legacy signal processing technologies to reconstruct CSI accurately enough for high-efficiency downlink transmission. In contrast, deep neural networks trained with big data have a strong ability to fit any nonlinear function and to learn the complex signal model behind the data, which makes them a useful tool for problems that are nonlinear or hard to describe mathematically. Motivated by these observations, we consider a deep learning based approach for BS receivers to reconstruct CSI from the limited feedback bits sent by user equipment (UE). CSI reconstruction is similar to image super-resolution, which reconstructs a high-resolution image from a low-resolution one based on structure information learned by the neural network itself. Performance evaluations demonstrate that the proposed method outperforms the legacy algorithms, improving spectrum efficiency by about 25%. It can also reconstruct the CSI with a lower-density reference signal (RS) than that in the current specification, e.g., the RS density can be 1/32 of the current one. Therefore, the reference signal and feedback overhead can be reduced with our method. These results show the potential gain of applying deep learning to physical layer signal processing at the receiver side only, which has little air-interface impact and does not require much effort on enhancing network specifications. Therefore, it is feasible to be deploy
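To make the quantization error in limited feedback concrete, the following is a minimal sketch, assuming a uniform phase-only quantizer for unit-magnitude channel coefficients. This quantizer, and the names `quantize_phase` and `mean_phase_error`, are illustrative assumptions; they are not the codebook of any mobile-network specification and not the method of the abstract above. The sketch only shows how the reconstruction error grows as fewer feedback bits are allowed.

```python
import cmath
import math


def quantize_phase(h, bits):
    """Quantize the phase of complex h to one of 2**bits uniform levels.

    Returns (index, h_hat): the feedback index and the unit-magnitude
    coefficient the receiver would rebuild from that index alone.
    """
    levels = 2 ** bits
    step = 2 * math.pi / levels
    phase = cmath.phase(h) % (2 * math.pi)   # map phase into [0, 2*pi)
    index = round(phase / step) % levels     # nearest level, wrapping at 2*pi
    return index, cmath.rect(1.0, index * step)


def mean_phase_error(channels, bits):
    """Average absolute phase error (radians) after quantization."""
    total = 0.0
    for h in channels:
        _, h_hat = quantize_phase(h, bits)
        # Phase of h * conj(h_hat) is the wrapped angular difference.
        total += abs(cmath.phase((h / abs(h)) * h_hat.conjugate()))
    return total / len(channels)


# Hypothetical test channels: 1000 unit-magnitude coefficients spread over the circle.
channels = [cmath.rect(1.0, 2 * math.pi * k / 1000) for k in range(1000)]

for bits in (1, 2, 4):
    print(bits, "bits -> mean phase error", mean_phase_error(channels, bits))
```

With fewer bits the quantization step widens and the mean phase error rises. This is the kind of coarse, nonlinear distortion that a learned reconstruction would have to undo.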
Unmanned aerial vehicles (UAVs) typically fly at low altitudes to capture high-resolution images covering smaller areas. Since short flights and high-resolution cameras also lead to the generation of massive gigaby...
ISBN (digital): 9798350349948
ISBN (print): 9798350349955
Federated Edge Artificial Intelligence (Edge AI) deploys AI applications on Internet-of-Things (IoT) devices, addressing data privacy concerns in the real world. To achieve effective Federated Learning (FL), three challenges must be addressed: i) limited computing power on devices, ii) non-uniform impacts on devices, and iii) adaptability to changing network conditions. This study introduces a new algorithm called Adaptive Offloading Point (AOP), designed to accelerate local training on constrained devices. It decomposes deep neural network (DNN) layer blocks, enabling training on both the client and server sides. The novelty of the proposed method lies in using reinforcement learning-based Gaussian mixture model (GMM) clustering to dynamically determine DNN layer offloading, addressing nullability, computation uniformity, training time, and network bandwidth variation issues. Experimental results on real devices, using vision transformer (ViT) models with an identification image dataset, show that training with AOP is significantly faster than with previous baseline methods.
ISBN (digital): 9798350372052
ISBN (print): 9798350372069
This paper proposes a new encoder-decoder steganography scheme based on dense residual connections to address the low quality of the stego-images and message images generated by image steganography schemes that rely on encoder-decoder networks. Unlike existing end-to-end image steganography networks, this scheme does not require image preprocessing and uses dense residual connections to carry features from shallow layers to each layer of the deep network structure. Consequently, the detailed information of the feature map is effectively preserved. Furthermore, channel and spatial attention modules are employed to filter features, thus enhancing the attention of the encoder-decoder towards complex texture areas in the image. Experimental results on the LFW, PASCAL-VOC12, and ImageNet datasets demonstrate that the proposed method effectively enhances image quality and algorithm security. The stego-image and carrier images achieve an average peak signal-to-noise ratio (PSNR) of up to 36.2 dB and a structural similarity (SSIM) of 0.98, respectively.
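For readers unfamiliar with the PSNR figure quoted above, here is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE), applied to two flat lists of pixel values. The function name `psnr` and the 8-bit default peak of 255 are assumptions made for illustration; the abstract does not state how its evaluation was implemented.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: no distortion
    return 10 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel cover and stego images differing by one grey level per pixel.
cover = [10, 20, 30, 40]
stego = [11, 21, 31, 41]
print(psnr(cover, stego))
```

A higher PSNR means the stego-image is numerically closer to the cover image. Since PSNR is a pure pixel-error measure, it is usually reported alongside a structural measure such as SSIM.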
Time-domain approaches have shown the potential to improve the performance of speaker verification, but the predominant approaches still use hand-crafted features such as mel filterbank energies. Although these features are based on speech perception models and have exhibited impressive performance, the fixed frame size does not allow good temporal and spectral resolution at the same time, and information is lost when taking the magnitude spectrum and during frequency rescaling. In this paper, we propose to incorporate multi-resolution time-domain information into the ECAPA-TDNN speaker verification system. We construct a multi-resolution encoder to extract multiple features at different temporal resolutions, and let the extracted features drive the adapter modules. Experimental results showed that the proposed method outperformed other recently proposed approaches on the VoxCeleb dataset when the input length was 2 seconds or shorter. The proposed approach also showed superior performance on the Google Speech Commands dataset v2.
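The trade-off imposed by a fixed frame size can be illustrated with a small sketch that cuts the same waveform into frames of several lengths. This is only an illustration under stated assumptions: the 16 kHz sample rate, the 50% hop, the chosen frame lengths, and the names `frame_signal` and `multi_resolution_frames` are hypothetical. It is not the multi-resolution encoder of the abstract above, whose design is not described here.

```python
def frame_signal(samples, frame_len, hop):
    """Split a sequence into overlapping frames of frame_len samples."""
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, hop)]


def multi_resolution_frames(samples, frame_lens):
    """Frame the same signal at several window lengths (50% overlap each)."""
    return {n: frame_signal(samples, n, n // 2) for n in frame_lens}


SAMPLE_RATE = 16000                      # assumed sample rate in Hz
signal = [0.0] * (2 * SAMPLE_RATE)       # placeholder 2-second waveform

frames = multi_resolution_frames(signal, [100, 400, 1600])
for n, fr in frames.items():
    time_res_ms = 1000 * n / SAMPLE_RATE     # duration covered by one frame
    freq_res_hz = SAMPLE_RATE / n            # spacing of DFT bins for that frame
    print(n, len(fr), time_res_ms, freq_res_hz)
```

Short frames localize events in time but give widely spaced frequency bins, and long frames do the opposite. Extracting features at several frame lengths is one way to avoid committing to a single point on that trade-off.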
The upload and download of image data accounts for a huge flow of traffic across the internet, which means sensitive information is more vulnerable to theft and attack. Therefore, there is an urgent need for an i...
In this paper, we propose a novel signal detection method in the framework of matrix information geometry and apply it to target detection in complex clutter. The sample data is first assumed to be constructed as a hi...
The most important work of retinal image processing is blood vessel segmentation and optic disc location. The fovea in the retina image, as well as the bright spots and blood oozing caused by the disease will hinder t...
With the wide application of virtual reality technology, computer graphics and multimedia technology in various fields, people are paying more and more attention to the research on modeling and rendering methods of hi...
ISBN (digital): 9798350368741
ISBN (print): 9798350368758
High-quality annotation for fine-grained visual categorization requires extensive professional knowledge and is time-consuming and laborious. Therefore, learning fine-grained visual representations from a large number of unlabeled images through self-supervised learning has recently become a popular alternative. However, existing self-supervised learning methods are not effective in fine-grained visual categorization, since many features that help optimize self-supervised learning objectives are unsuited to characterizing the subtle differences in fine-grained visual recognition. To deal with this issue, we propose a mutual learning network to enhance the model's attention towards discriminative semantic features. The key idea is to consider semantic consistency between different augmented views of the same image and to capture discriminative semantic information. For semantic consistency, our research demonstrates that a cross-view attention module between different augmented views can guide our model to capture similar semantic features. Based on this, we further build a GradCAM-guided multi-dimension loss that uses GradCAM to steer our model, along different dimensions, towards discriminative semantic information that is beneficial to fine-grained visual recognition. Experiments on the CUB-200-2011, Stanford Cars and Aircrafts datasets demonstrate that the mutual learning network outperforms previous self-supervised learning methods in linear probing and image retrieval.