This paper explores a novel multi-modal alternating learning paradigm pursuing a reconciliation between the exploitation of uni-modal features and the exploration of cross-modal interactions. This is motivated by the fact that current paradigms of multi-modal learning tend to explore multi-modal features simultaneously. The resulting gradient prohibits further exploitation of the features in the weak modality, leading to modality competition, where the dominant modality overpowers the learning process. To address this issue, we study the modality-alternating learning paradigm to achieve reconcilement. Specifically, we propose a new method called ReconBoost to update a fixed modality each time. Herein, the learning objective is dynamically adjusted with a reconcilement regularization against competition with the historical models. By choosing a KL-based reconcilement, we show that the proposed method resembles Friedman's Gradient-Boosting (GB) algorithm, where the updated learner can correct errors made by others and help enhance the overall performance. The major difference with the classic GB is that we only preserve the newest model for each modality to avoid overfitting caused by ensembling strong learners. Furthermore, we propose a memory consolidation scheme and a global rectification scheme to make this strategy more effective. Experiments over six multi-modal benchmarks speak to the efficacy of the method. We release the code at https://***/huacong/ReconBoost.
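The alternating idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' ReconBoost implementation: it assumes one linear softmax learner per modality, replaces the KL-based reconcilement with plain cross-entropy on the summed logits, and omits memory consolidation and global rectification. The names `alternating_fit` and `predict` are hypothetical. What it does show is the boosting-like structure: in each round a single modality is updated while the other modalities' logits are frozen, so the updated learner fits the errors the others leave behind, and only the newest model per modality is kept.

```python
import numpy as np


def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def alternating_fit(views, y, n_classes, rounds=20, steps=50, lr=0.5):
    """Alternately update one linear learner per modality.

    views: list of (n, d_m) arrays, one per modality (assumed layout).
    y: (n,) integer class labels.
    Returns one weight matrix per modality (only the newest is kept).
    """
    n = y.shape[0]
    onehot = np.eye(n_classes)[y]
    weights = [np.zeros((v.shape[1], n_classes)) for v in views]
    for r in range(rounds):
        m = r % len(views)  # the single modality updated in this round
        # Logits contributed by every other modality stay fixed this round.
        frozen = sum(
            v @ w for k, (v, w) in enumerate(zip(views, weights)) if k != m
        )
        for _ in range(steps):
            probs = softmax(frozen + views[m] @ weights[m])
            # Cross-entropy gradient w.r.t. the logits is (probs - onehot),
            # i.e. the residual left by the frozen modalities.
            grad = views[m].T @ (probs - onehot) / n
            weights[m] -= lr * grad
    return weights


def predict(views, weights):
    # Final prediction sums the logits of the newest learner per modality.
    logits = sum(v @ w for v, w in zip(views, weights))
    return logits.argmax(axis=1)
```

Calling `alternating_fit([view_a, view_b], y, n_classes=2)` on two feature arrays with shared labels returns one weight matrix per modality, which `predict` then combines.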
Authors: Lisi Wei, Libo Zhao, Xiaoli Zhang
Affiliations: College of Computer Science and Technology, Jilin University, China; College of Artificial Intelligence and Big Data, Hulunbuir University, China, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, China; College of Computer Science and Technology, Jilin University, China, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, China
Due to the limitations of imaging sensors, obtaining a medical image that simultaneously captures both functional metabolic data and structural tissue details remains a significant challenge in clinical diagnosis. To address this, Multimodal Medical Image Fusion (MMIF) has emerged as an effective technique for integrating complementary information from multimodal source images, such as CT, PET, and SPECT, which is critical for providing a comprehensive understanding of both anatomical and functional aspects of the human body. One of the key challenges in MMIF is how to exchange and aggregate this multimodal information. This paper rethinks MMIF from the perspective of harmonizing modality gaps and proposes a novel Modality-Aware Interaction Network (MAINet), which leverages cross-modal feature interaction and progressively fuses multiple features in graph space. Specifically, we introduce two key modules: the Cascade Modality Interaction (CMI) module and the Dual-Graph Learning (DGL) module. The CMI module, integrated within a triple-branch multi-scale encoder, facilitates complementary multimodal feature learning and provides beneficial feedback to enhance discriminative feature learning across modalities. In the decoding process, the DGL module aggregates hierarchical features in two distinct graph spaces, enabling global feature interactions. Moreover, the DGL module incorporates a bottom-up guidance mechanism, in which deeper semantic features guide the learning of shallower detail features, improving the visual fidelity of the fused results by enhancing both scale diversity and modality awareness. Experimental results on medical image datasets demonstrate the superiority of the proposed method over existing fusion approaches in both subjective and objective evaluations. We also validate the performance of the proposed method in applications such as infrared-visible image fusion and medical image segmentation.
Machine learning in the context of noise is a challenging but practical setting for many real-world applications. Most previous approaches in this area focus on the pairwise relation (causal or correlationa...
Deep learning (DL) architectures for super-resolution (SR) normally contain a tremendous number of parameters, which has been regarded as a crucial advantage for obtaining satisfying performance. However, with the widespread us...
Graph Pattern Matching (GPM) is the identification of subgraphs within a larger graph that either exactly match or closely resemble a predefined pattern graph. Research on GPM in large-scale graph data has largely centered on social network analysis or on improving the precision and efficiency of matching algorithms for fast subgraph retrieval, and there is a noticeable absence of studies that examine GPM in medical domains. To address this gap and explore the potential of GPM in clinical contexts, particularly in helping patients select optimal tumor treatment plans, this paper introduces the concept of probabilistic graph pattern matching tailored to the Tumor Knowledge Graph (TKG). We propose a multi-constraint graph pattern matching algorithm, hereinafter referred to as TKG-McGPM, customized for the Tumor Knowledge Graph. Through experimental verification, we establish that TKG-McGPM can facilitate more efficient and informed decision-making in tumor treatment planning.
We propose a mechanism to generate a single intense circularly polarized attosecond x-ray pulse from the interaction of a circularly polarized relativistic few-cycle laser pulse with an ultrathin foil at normal incidence. Analytical modeling and particle-in-cell simulation demonstrate that a huge charge-separation field can be produced when all the electrons are displaced from the target by the incident laser, resulting in a high-quality relativistic electron mirror that propagates against the tail of the laser pulse. The latter is efficiently reflected as well as compressed into an attosecond pulse that is also circularly polarized.
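For orientation, the compression described in this abstract can be related to the textbook double Doppler shift for an ideal mirror. This is a standard idealization, not the paper's analytical model: it assumes a perfectly reflecting mirror moving at constant velocity $\beta c$ directly against the incoming pulse, with $\omega_0$ and $\tau_0$ denoting the incident frequency and duration (symbols introduced here for illustration).

```latex
\[
\omega_r = \frac{1+\beta}{1-\beta}\,\omega_0 \approx 4\gamma^2\,\omega_0,
\qquad
\tau_r = \frac{1-\beta}{1+\beta}\,\tau_0 \approx \frac{\tau_0}{4\gamma^2},
\qquad
\gamma = \frac{1}{\sqrt{1-\beta^2}} \gg 1 .
\]
```

Under this idealization the reflected frequency rises and the duration shrinks by the same factor, which is why a relativistic mirror can turn an optical few-cycle pulse into a much shorter, higher-frequency one.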
The multilingual focused crawler system combines web content extraction with path configuration, drawing on the advantages of both to collect network information in multiple languages automatically. First, the system selects foreign-language keywords according to the language of the pages being crawled and the Chinese keywords, and uses an initial link to obtain webpage information. Then, it uses path configuration information or a web content extraction algorithm based on line-block distribution to get the webpage content, and applies rules or configuration information to acquire new links, publication times, and titles. Next, keywords are used to filter out irrelevant information. Finally, the results are presented as a list. When using the focused crawler system, users can choose whether to configure webpage path information according to their requirements, and the collected network resources can also be searched or filtered.
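The extraction and filtering steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the system's actual code: the function names, the block size, and the minimum line length are all hypothetical, and the line-block heuristic here (find the densest run of lines after stripping tags, then grow it over adjacent long lines) is only one plausible reading of "line-block distribution". No network access is involved; the functions operate on an HTML string.

```python
import re


def extract_links(html):
    # Collect href targets in document order (regex sketch, not a full parser).
    return re.findall(r"""href=["']([^"']+)["']""", html)


def extract_main_text(html, block_size=3, min_line_len=30):
    # Drop script/style bodies, then strip the remaining tags.
    html = re.sub(r"(?is)<(script|style)[^>]*>.*?</\1>", "", html)
    text = re.sub(r"<[^>]+>", "", html)
    lines = [ln.strip() for ln in text.split("\n")]
    if len(lines) < block_size:
        return "\n".join(ln for ln in lines if ln)
    # Block length = characters in block_size consecutive lines.
    blocks = [
        sum(len(ln) for ln in lines[i:i + block_size])
        for i in range(len(lines) - block_size + 1)
    ]
    peak = max(range(len(blocks)), key=blocks.__getitem__)
    if blocks[peak] == 0:
        return ""
    # Grow the densest block over neighbouring lines that are long enough.
    start, end = peak, peak + block_size - 1
    while start > 0 and len(lines[start - 1]) >= min_line_len:
        start -= 1
    while end < len(lines) - 1 and len(lines[end + 1]) >= min_line_len:
        end += 1
    return "\n".join(ln for ln in lines[start:end + 1] if ln)


def is_relevant(text, keywords):
    # Keep a page only if it mentions at least one keyword (case-insensitive).
    lowered = text.lower()
    return any(k.lower() in lowered for k in keywords)
```

A crawler loop would call `extract_main_text` on each fetched page, discard pages for which `is_relevant` is false, and queue the results of `extract_links` for later fetching. Where path configuration is available for a site, it would take the place of the heuristic extractor.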
We report the weakly interacting massive particle (WIMP) dark matter search results using the first physics-run data of the PandaX-II 500 kg liquid xenon dual-phase time-projection chamber, operating at the China JinPing underground laboratory. No dark matter candidate is identified above background. In combination with the data set from the commissioning run, with a total exposure of 3.3×10⁴ kg day, the most stringent limit on the spin-independent interaction between ordinary matter and WIMP dark matter is set for dark matter masses between 5 and 1000 GeV/c². The best upper limit on the scattering cross section is found to be 2.5×10⁻⁴⁶ cm² for a WIMP mass of 40 GeV/c² at 90% confidence level.