ISBN (print): 9781665435413
In this paper, a semantic communication framework for image transmission is investigated. In the framework, a server transmits image data to a set of users using semantic communication techniques, which enable the server to transmit only the semantic information that accurately captures the meaning of an image. To evaluate the performance of the studied semantic communication system, we propose a multimodal metric called image-to-graph semantic similarity (ISS). The significance of this new metric is that it measures how closely the meaning of the semantic information correlates with that of the original image. To meet the ISS requirement of each user, the server must jointly determine the semantic information to be transmitted and the resource blocks (RBs) used for its transmission. We formulate this as an optimization problem whose goal is to minimize the average transmission latency while meeting the ISS requirement. To solve this problem, we propose a model-based actor-critic deep reinforcement learning (DRL) algorithm. Unlike traditional actor-critic DRL, the proposed algorithm uses a novel value function to improve action exploration, thus increasing the probability of finding an optimal solution. Simulation results show that the proposed method reduces the transmission delay by 16.4% and improves the convergence speed by up to 50% compared to traditional actor-critic DRL.
Existing automatic sleep staging algorithms rely on accurately labeled data. However, due to the subjectivity of sleep experts, accurate labels must be obtained through joint labeling by multiple experts, which results in high time and labor costs. In this work, we treat labels mislabeled by a single expert as noisy labels and first propose SE-ASS, an automatic sleep staging learning framework based on single-expert annotated data. Since multiple models tend to produce inconsistent predictions for instances with incorrect labels during training, we use two networks with the same structure but different initializations and regularize them with a prediction consistency loss to prevent overfitting to noisy labels. Furthermore, we use a contrastive loss between models to enhance the exploration of feature representations without relying on potentially noisy labels. Our results on two publicly available datasets show that SE-ASS can effectively improve the performance of automatic sleep staging models trained on single-expert annotated datasets.
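As a rough illustration of the prediction consistency idea described above, the sketch below penalizes disagreement between the class distributions of two networks. The abstract does not state the exact form of the loss, so the symmetric KL divergence used here, along with the function names `softmax`, `kl_divergence`, and `consistency_loss`, is an assumption made for illustration only.

```python
import math


def softmax(logits):
    """Convert raw scores to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(logits_a, logits_b):
    """Symmetric KL between the two networks' predictions for one instance.

    Assumption: the paper's actual consistency loss may differ; this is one
    common way to penalize disagreement between two models.
    """
    p = softmax(logits_a)
    q = softmax(logits_b)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
```

The loss is zero when both networks output the same distribution and grows as their predictions diverge, which is the behavior the abstract relies on to discourage fitting noisy labels.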
An emerging fluid antenna system (FAS) brings a new dimension, i.e., the antenna positions, to deal with the deep fading, but simultaneously introduces challenges related to the transmit design. This paper proposes an...
Multi-view clustering based on non-negative matrix factorization (NMFMvC) is a well-known method for handling high-dimensional multi-view data. To satisfy the non-negativity constraint of the matrix, NMFMvC is usually...
In this paper, we consider the design of an energy-efficient collaborative federated learning (CFL) methodology in which devices exchange their local FL parameters with a subset of their neighbors without relying on a parameter server. In the considered model, mobile devices implement the designed CFL to train their local FL models using their own datasets over a realistic wireless network. Due to limited wireless resources and user movement, each device may not be able to transmit its FL parameters to all neighboring devices. Therefore, each device must select a subset of devices with which to share its FL parameters and optimize its transmit power. This problem is formulated as an optimization problem whose goal is to minimize CFL training energy consumption while satisfying the delay and CFL training loss requirements. To solve this problem, a two-stage solution is proposed. In the first stage, a graph neural network (GNN) based algorithm enables each device to individually determine the subset of devices to which it transmits FL parameters, using its neighboring devices' location and connection information. Compared to standard algorithms that must iteratively optimize device connections and transmit power, the proposed GNN-based method obtains the optimal device connections directly. Given the optimal device connections, in the second stage each device can directly obtain its optimal transmit power. Simulation results show that the proposed algorithm can decrease energy consumption by up to 46% compared to a baseline in which each device connects directly to its first and second nearest neighbors.
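The comparison baseline mentioned above, in which each device connects to its first and second nearest neighbors, can be sketched as follows. This shows only the baseline, not the proposed GNN method. The 2D coordinates, the function name `nearest_neighbor_links`, and the use of Euclidean distance are assumptions for illustration.

```python
import math


def nearest_neighbor_links(positions, k=2):
    """Map each device to its k nearest other devices.

    positions: dict of device id -> (x, y) coordinates (hypothetical layout).
    Returns a dict of device id -> list of neighbor ids, nearest first.
    """
    links = {}
    for dev, (x, y) in positions.items():
        # Distance from this device to every other device.
        distances = [
            (math.hypot(x - ox, y - oy), other)
            for other, (ox, oy) in positions.items()
            if other != dev
        ]
        distances.sort()
        links[dev] = [other for _, other in distances[:k]]
    return links


# Example with four devices placed on a line.
devices = {"a": (0, 0), "b": (1, 0), "c": (3, 0), "d": (10, 0)}
links = nearest_neighbor_links(devices)
```

Note that this rule ignores transmit power and channel conditions entirely, which is consistent with it serving as the simpler reference point in the reported energy comparison.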
Object pose estimation is a fundamental computer vision problem with broad applications in augmented reality and robotics. Over the past decade, deep learning models, due to their superior accuracy and robustness, hav...
This study proposed a lightweight and secure audio steganography system for hiding text messages during transmission over the Internet to address excessive computational cost and insufficient levels of securit...
The Internet of Vehicles (IoV) industry has developed rapidly in recent years. However, the information security of IoV needs more attention. Cross-layer secure transmission technology can improve the security of IoV communication, but existing cross-layer schemes have some shortcomings. To this end, we propose a packet encoding scheme based on encrypted Raptor codes that improves the secure capacity of IoV communication by utilizing fountain codes and physical-layer low-density parity-check (LDPC) codes. Specifically, we choose Raptor codes, which combine LDPC codes and fountain codes, for secure encoding. With a sparser degree distribution, Raptor codes make decoding faster and more accurate at the legitimate receiver. During transmission, the transmitter encrypts and sends the coding control information corresponding to the packets received by the legitimate receiver, rather than sending the generator matrix directly. We find that this encryption improves confidentiality. Simulation results show that the proposed scheme achieves higher security than the comparison schemes.
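To make the role of the coding control information concrete, the following is a minimal sketch of plain LT-style fountain encoding with a peeling decoder. It is not the scheme from the abstract: it omits the LDPC precode that makes a code a Raptor code, it omits encryption, and it draws degrees uniformly instead of from a designed degree distribution. The names `encode_symbol` and `peel_decode` and the parameter `max_degree` are hypothetical. In this sketch, the list of source indices attached to each encoded symbol plays the part of the control information that a receiver needs in order to decode.

```python
import random


def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_symbol(packets, rng, max_degree=3):
    """XOR a random subset of source packets into one encoded symbol.

    Returns (indices, payload); the indices are the control information.
    Assumption: uniform degree choice, not a Raptor degree distribution.
    """
    degree = rng.randint(1, min(max_degree, len(packets)))
    indices = sorted(rng.sample(range(len(packets)), degree))
    payload = bytes(len(packets[0]))  # all-zero starting value
    for i in indices:
        payload = xor_bytes(payload, packets[i])
    return indices, payload


def peel_decode(symbols, num_packets):
    """Recover source packets by repeatedly resolving degree-1 symbols.

    symbols: list of (indices, payload) pairs.
    Returns a list with recovered packets, or None where recovery failed.
    """
    recovered = [None] * num_packets
    pending = [(set(idx), payload) for idx, payload in symbols]
    progress = True
    while progress:
        progress = False
        still_pending = []
        for idx, payload in pending:
            # Strip out every source packet that is already known.
            for i in list(idx):
                if recovered[i] is not None:
                    payload = xor_bytes(payload, recovered[i])
                    idx.discard(i)
            if len(idx) == 1:
                recovered[idx.pop()] = payload
                progress = True
            elif len(idx) > 1:
                still_pending.append((idx, payload))
        pending = still_pending
    return recovered
```

Without the index lists, a receiver holding only the payloads cannot tell which source packets were combined, which suggests why protecting that control information can contribute to confidentiality.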
ISBN (digital): 9798331513054
ISBN (print): 9798331513061
Multimodal medical image fusion technology provides more comprehensive and accurate image support for clinical diagnosis and treatment by integrating complementary information from different imaging modalities. To address the shortcomings of existing methods in detail feature extraction and inter-modal information fusion, this paper proposes a multimodal medical image fusion method combined with an adaptive attention mechanism. First, we design the Grouped Receptive Field Attentional Convolution (GRFAConv) to strengthen detail feature extraction. Using a multi-head receptive field adaptive weighting strategy built on grouped convolution, GRFAConv adaptively adjusts the range and weight of the convolution kernel's receptive field according to the differing demands of local and global image features, which improves detail retention. Second, to improve information fusion between modalities, we introduce an improved CBAM attention module into the feature fusion process. The module adaptively selects and enhances features in key regions through channel attention and spatial attention mechanisms, which greatly improves the clarity of fused image details and the accuracy of information expression in key regions. Experimental results on several medical image datasets show that the proposed algorithm generates relatively high-quality fused images: it enriches the detailed features of the image and achieves significant advantages in several evaluation metrics.
High-resolution point clouds (HRPCD) anomaly detection (AD) plays a critical role in precision machining and high-end equipment manufacturing. Despite considerable 3D-AD methods that have been proposed recently, they ...