The Industrial Internet of Things (IIoT) leverages Federated Learning (FL) for distributed model training while preserving data privacy, and meta-computing enhances FL by optimizing and integrating distributed computi...
Graph neural networks (GNNs) have gained significant attention and have been applied in various domain tasks. Currently, numerous pooling approaches have been proposed to aggregate node features and obtain node embedd...
A high-dimensional and incomplete (HDI) matrix can describe the complex interactions among numerous nodes in various big-data-related applications. A stochastic gradient descent (SGD)-based latent factor analysis (LFA...
Studies have shown that learning personal stories could help provide individualized eldercare services. However, personal stories are often disordered because of the scattered collection, including informal interviews...
Deep reinforcement learning (DRL) has been widely used in many important tasks of communication networks. In order to improve the perception ability of DRL on the network, some studies have combined graph neural netwo...
ISBN: (digital) 9798350368741
ISBN: (print) 9798350368758
In the domain of Multimodal Relation Extraction (MRE), we present the Watcher-Mediated Attention Joint Learning Model (WMAJL), a novel approach addressing the challenges of modality alignment noise, cross-modal fusion disparity, preservation of textual relative position information, and the distinctiveness of classification labels. WMAJL employs an integrative framework leveraging contrastive learning and variational autoencoder constraints to mitigate modality alignment noise by prioritizing relevant semantic data and effectively reducing extraneous noise that does not contribute to the task. The model’s innovative architecture includes a mediator watcher, which facilitates enhanced cross-modal fusion by enabling nuanced information exchange between textual and visual modalities while preserving the unique characteristics of each modality. Additionally, the design of auxiliary tasks, such as Named Entity Recognition (NER), and output supervision constructs loss functions that preserve relative position information, ensuring a precise depiction of entity relationships throughout the multilayer encoding processes. A key differentiator of WMAJL is its label-centric self-information loss technique, inspired by InfoNCE, which trains the model to cluster similar relation labels in semantically coherent areas, thereby optimizing classification label uniqueness by discerning subtle differences among relation types. The synergistic application of these strategies has led to a significant enhancement of WMAJL’s performance, as evidenced by its state-of-the-art F1 score of 84.93% on the MNRE dataset. This achievement surpasses existing benchmarks and sets a new standard for multimodal knowledge extraction, underscoring WMAJL’s potential to revolutionize the MRE landscape.
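The abstract does not spell out the label-centric self-information loss. As a rough point of reference only, the standard InfoNCE objective it names as inspiration can be sketched as below. The function names, the use of cosine similarity, and the temperature value are illustrative assumptions, not WMAJL's implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor (hypothetical helper).

    loss = -log( exp(s(a, p)/t) / sum_k exp(s(a, k)/t) ),
    where k ranges over the positive and all negatives.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]


# Toy 2-D embeddings: the anchor is close to `same_label`, far from `other`.
anchor = [1.0, 0.0]
same_label = [1.0, 0.0]
other = [0.0, 1.0]

loss_aligned = info_nce(anchor, same_label, [other])
loss_misaligned = info_nce(anchor, other, [same_label])
```

In this toy setup the loss is small when the positive is the embedding that matches the anchor and large when the roles are swapped, which is the clustering pressure the abstract describes for similar relation labels.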
ISBN: (digital) 9798350368741
ISBN: (print) 9798350368758
Hyperspectral Image (HSI) cross-scene classification is a challenging task in remote sensing, particularly when real-time processing of Target Domain (TD) HSI is required, and data cannot be reused for training. While deep learning methods have shown promising results, the generalization ability of HSI representations remains limited, mainly due to class label imbalance. This paper introduces a dual-stage learning framework based on transfer learning to enhance classification accuracy in the TD. The framework includes a self-supervised learning stage and a supervised fine-tuning stage. The self-supervised stage focuses on learning robust representations by leveraging inherent structures within HSI data, while the fine-tuning stage uses training labels to extract semantic information. A masked diffusion model predicts masked tokens from unmasked ones, capturing both high-level structures and fine details in HSI data. An efficient spatiospectral Transformer, which removes self-attention from the decoder, is proposed to enhance the self-supervised process. This design allows mask tokens to obtain information from visible tokens without interacting with each other, reducing sequence length and computational costs. By decoding each mask token conditionally independently, only a subset of masked tokens is processed. Extensive experiments on two public HSI datasets demonstrate that the proposed method outperforms state-of-the-art techniques.
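The property claimed for the decoder (mask tokens read from visible tokens but not from each other, so any subset can be decoded on its own) can be illustrated with a minimal cross-attention sketch. This is an assumption-laden toy, not the paper's Transformer: it has a single head, no learned projections, and uses the visible tokens directly as both keys and values; all names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(mask_queries, visible_tokens):
    """Each mask query attends only to the visible tokens.

    There is no attention among the mask queries themselves, so each
    output row depends on its own query and the visible tokens alone.
    """
    dim = len(visible_tokens[0])
    outputs = []
    for q in mask_queries:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            for k in visible_tokens
        ]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, visible_tokens))
             for j in range(dim)]
        )
    return outputs


visible = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mask_queries = [[0.5, 0.2], [0.1, 0.9], [0.3, 0.3]]

decoded_all = cross_attend(mask_queries, visible)
# Decoding only the first mask token gives the same result for that token.
decoded_subset = cross_attend(mask_queries[:1], visible)
```

Because the result for a mask token is unchanged when the other mask tokens are dropped, the decoder's sequence length can shrink to whichever subset of masked positions is being predicted.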
ISBN: (digital) 9798350337488
ISBN: (print) 9798350337495
Neoadjuvant chemoradiotherapy (nCRT) is the standard treatment for locally advanced rectal cancer (LARC). With the development of artificial intelligence, an increasing number of studies have begun to explore its application in cancer treatment prediction. However, prior methods exhibit considerable variability even under slight modifications to the input data, which could undermine the reliability of their results. In this paper, we propose RP-Net, a novel multi-modal fusion-based framework that combines feature information from magnetic resonance imaging (MRI) and whole slide images (WSI), establishing a mapping to the therapeutic effectiveness of nCRT for LARC. We investigate the relationship between the tumour region and its peripheral tissues, and demonstrate the validity of the proposed framework in experiments involving 11 different combinations of modalities. The experimental results show that it achieves higher prediction accuracy than the four intra-category single-modal combinations and outperforms the two intra-category multi-modal combinations. Compared with the other four inter-category multi-modal combinations, the fused features improve accuracy by 2% to 6%.
Mixed polarity Reed-Muller (MPRM) circuit area optimization has become a research hotspot in the field of integrated circuit design. It is a combinatorial optimization, aiming at finding the MPRM expression with the l...
Given the huge toll caused by natural disasters, it is critically important to develop an effective disaster management and emergency response technique. In this article, we investigate relationships between typhoon-r...