Traditional supervised learning methods achieve remarkable performance in high-resolution remote sensing image retrieval, but are limited by the dependence on large-scale annotated images. Contrastive learning can lev...
详细信息
Fine-grained 3D shape classification poses challenges in effectively capturing and integrating discriminative features residing in subtle local regions. Previous methods typically extract features independently from i...
Fine-grained 3D shape classification poses challenges in effectively capturing and integrating discriminative features residing in subtle local regions. Previous methods typically extract features independently from individual views of 3D shapes, with a focus on various strategies for fusing these extracted view features. However, this approach neglects interview correlations and potential redundancies among different views. In this study, we introduce $$\hbox {C}^2$$ DFL, which consists of two primary modules: cross-view discriminative feature extraction (CV-DFE) and cross-layer discriminative feature fusion (CL-DFF). CV-DFE integrates discriminative features by merging inputs from multiple views, mitigating limitations associated with isolated feature extraction. CL-DFF dynamically selects key tokens using a transformer model to interactively fuse discriminative features from various levels. Extensive experiments conducted on three categories of the FG3D dataset demonstrate the exceptional efficacy of $$\hbox {C}^2$$ DFL in capturing and integrating discriminative features of 3D shapes. The proposed method achieves state-of-the-art accuracy in fine-grained 3D shape classification (FGSC).
Fine-grained 3D shape classification (FGSC) remains challenging due to the difficulty of adaptively capturing global structure differences and subtle inter-class distinctions. This paper directly extends Vision Transf...
详细信息
ISBN:
(数字)9798350368741
ISBN:
(纸本)9798350368758
Fine-grained 3D shape classification (FGSC) remains challenging due to the difficulty of adaptively capturing global structure differences and subtle inter-class distinctions. This paper directly extends Vision Transformer (ViT) to FGSC, proposing a pure Transformer network FG3DFormer that fully leverages ViT’s global correlation and local attention abilities. FG3Dformer comprises the Hierarchical Feature Extraction (HFE) and the Hierarchical Feature Refinement (HFR), interconnected through the Adaptive View Region Selection (AVRS). Firstly, the HFE comprehensively evaluates the significance of intra-view patches and views driven by inter-view and intraview attention. Then, the AVRS adaptively selects crucial patch Tokens from different views to serve as sources of subtle local features. Finally, the HFR refines the 3D shape descriptor, capturing more discriminative global and subtle local features by leveraging both the view and selected crucial patch Tokens. Extensive experiments on FG3D and ModelNet40 demonstrate the superiority of FG3Dformer in FGSC and meta-category 3D shape classification tasks.
Fine-grained 3D shape classification poses challenges in effectively capturing and integrating discriminative features residing in subtle local regions. Previous methods typically extract features independently from i...
详细信息
Fine-grained 3D shape classification (FGSC) has garnered significant attention recently and has made notable advancements. However, due to high inter-class similarity and intra-class diversity, it is still a challenge...
详细信息
Recently, vision transformer based multimodal learning methods have been proposed to improve the robustness of face anti-spoofing (FAS) systems. However, multimodal face data collected from the real world is often imp...
详细信息
Purpose: Accurate airway anatomical labeling is crucial for clinicians to identify and navigate complex bronchial structures during bronchoscopy. Automatic airway labeling is challenging due to significant anatomical ...
详细信息
Protein-Protein Interaction (PPI) provides important insights into the metabolic mechanisms of different biological processes. Although PPIs in some organisms have been investigated systematically, PPIs in the ocean a...
详细信息
ISBN:
(纸本)9798400712203
Protein-Protein Interaction (PPI) provides important insights into the metabolic mechanisms of different biological processes. Although PPIs in some organisms have been investigated systematically, PPIs in the ocean archaea remain largely unexplored. But such species have special investigation value since their adaptation to extreme living conditions may generate unique PPIs. In this paper, we aim to characterize and predict PPIs in ocean archaea to advance understanding of their metabolic networks. First, we collect all ocean archaea PPIs with high confidence from STRING database and analyze the PPI network features, including centrality and enrichment analysis. The functional enrichment results of the largest connecting subgraph in the PPI network show most PPIs in our constructed dataset is related to the translation and transcription processes. Then, we generate an equal number of negative PPI pairs, whose members have either different subcellular locations or GO terms. We also use the generated dataset to test the performance of three pretraining methods and their ensemble methods in the binary PPI prediction task. Our results suggest the ensemble methods could be applied to further improve models’ performance. Fine-tuned models trained on the ocean archaea dataset are expected to predict the other ocean archaea PPIs that are not included in the STRING database and get more understanding about the ocean archaea PPI universe.
We describe NLSExplorer, an interpretable approach for nuclear localization signal (NLS) prediction. By utilizing the extracted information on nuclear-specific sites from the protein language model to assist in NLS de...
详细信息
Lightweight image super-resolution aims to reconstruct high-resolution images from low-resolution images using low computational costs. However, existing methods result in the loss of middle-layer features due to acti...
详细信息
暂无评论