Search Results - Inner Mongolia University Library

Automatic Weight Allocation: optimizing remote sensing image retrieval from contrastive learning perspective

Multimedia Systems, 2025, Vol. 31, No. 3, pp. 1-19

Authors: Wang, Sijia; Ge, Yun; Liu, Qiyang; Zeng, Yan. Affiliations: School of Software, Nanchang Hangkong University, Nanchang, Jiangxi 330000, China; Jiangxi Province Key Laboratory of Image Processing and Pattern Recognition, Nanchang, Jiangxi 330063, China

Traditional supervised learning methods achieve remarkable performance in high-resolution remote sensing image retrieval, but are limited by their dependence on large-scale annotated images. Contrastive learning can leverage unlabeled images to learn powerful visual features, demonstrating its potential in many unsupervised tasks. Moreover, hash algorithms show significant potential in image retrieval thanks to their advantages in efficiency and storage. Therefore, we propose the Contrastive Hashing Framework based on Automatic Weight Allocation. The framework employs a two-stage training strategy. In the feature learning stage, we propose the Automatic Weighted Contrastive Loss (AWCLoss). It incorporates Gaussian weighting and a dynamic adjustment strategy into the loss function, enabling it to focus on the distinctiveness and importance of samples. Gaussian weighting assigns different weight values based on the similarity of sample pairs, enhancing the learning of critical sample pairs. Meanwhile, the dynamic adjustment strategy sets a threshold to identify hard negative samples and then adjusts their weight values so that the model is less disturbed by them. In the hashing learning stage, a hashing layer is added to the end of the network, which converts high-dimensional representations into hash codes. A quantization loss is introduced to learn the hash codes so that the semantic similarity structure of the data is preserved in Hamming space. Additionally, the AWCLoss is used to enhance the discriminative power of the hash codes. Extensive experiments on three remote sensing datasets (UCM, AID and NWPU-RESISC45) demonstrate the significant superiority of our approach in remote sensing image retrieval. Our source code is available at https://***/WANGSJ77/AWCH. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2025.
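The abstract does not give the formulas, so the following is only a minimal sketch of the ideas it names: a Gaussian weight on pair similarity, a damped weight for negatives above a threshold, a quantization loss, and Hamming-distance comparison of binary codes. The function names and the parameters `mu`, `sigma`, `threshold`, and `damp` are hypothetical, and the quantization loss shown is one common form, not necessarily the one used by the authors.

```python
import math


def gaussian_weight(sim, mu=0.5, sigma=0.25):
    """Gaussian weight for a pair with similarity `sim` (mu, sigma are assumed values)."""
    return math.exp(-((sim - mu) ** 2) / (2 * sigma ** 2))


def negative_weight(sim, threshold=0.8, damp=0.5):
    """Weight for a negative pair; pairs above `threshold` are treated as
    hard negatives and their weight is scaled down by `damp` (assumed rule)."""
    w = gaussian_weight(sim)
    return w * damp if sim > threshold else w


def quantization_loss(h):
    """Mean squared gap between each continuous code entry and the nearest of {-1, +1}."""
    return sum((abs(x) - 1.0) ** 2 for x in h) / len(h)


def binarize(h):
    """Turn a continuous code into a {-1, +1} hash code by taking the sign."""
    return [1 if x >= 0 else -1 for x in h]


def hamming(a, b):
    """Number of positions where two equal-length hash codes differ."""
    return sum(x != y for x, y in zip(a, b))


query = binarize([0.9, -0.8, 0.7, -0.95])
item = binarize([0.85, 0.6, 0.75, -0.9])
distance = hamming(query, item)
```

In a retrieval setting, database items would be ranked by `hamming(query, item)`, smallest first.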

Keywords: Contrastive Learning

C2DFL: cross-view cross-layer discriminative feature learning for fine-grained 3D shape classification

Neural Computing and Applications, 2025, pp. 1-22

Authors: Jiang, Jinzhe; Bai, Jing; Ma, Xiangyu. Affiliations: The School of Computer Science and Engineering, North Minzu University, Yinchuan, China; The Key Laboratory of Images Processing and Pattern Recognition Laboratory, North Minzu University, Yinchuan, China

Fine-grained 3D shape classification poses challenges in effectively capturing and integrating discriminative features residing in subtle local regions. Previous methods typically extract features independently from individual views of 3D shapes, with a focus on various strategies for fusing these extracted view features. However, this approach neglects inter-view correlations and potential redundancies among different views. In this study, we introduce C2DFL, which consists of two primary modules: cross-view discriminative feature extraction (CV-DFE) and cross-layer discriminative feature fusion (CL-DFF). CV-DFE integrates discriminative features by merging inputs from multiple views, mitigating limitations associated with isolated feature extraction. CL-DFF dynamically selects key tokens using a transformer model to interactively fuse discriminative features from various levels. Extensive experiments conducted on three categories of the FG3D dataset demonstrate the exceptional efficacy of C2DFL in capturing and integrating discriminative features of 3D shapes. The proposed method achieves state-of-the-art accuracy in fine-grained 3D shape classification (FGSC).

FG3DFormer: Fine-Grained 3D Shape Classification Based on Vision Transformer

International Conference on Acoustics, Speech, and Signal processing (ICASSP)

Authors: Xiangyu Ma; Jing Bai; Jinzhe Jiang; Bin Peng. Affiliations: The School of Computer Science and Engineering, North Minzu University; The Key Laboratory of Images Processing and Pattern Recognition Laboratory, Yinchuan, China

ISBN (digital): 9798350368741

ISBN (print): 9798350368758

Fine-grained 3D shape classification (FGSC) remains challenging due to the difficulty of adaptively capturing global structure differences and subtle inter-class distinctions. This paper directly extends Vision Transformer (ViT) to FGSC, proposing a pure Transformer network, FG3DFormer, that fully leverages ViT’s global correlation and local attention abilities. FG3DFormer comprises Hierarchical Feature Extraction (HFE) and Hierarchical Feature Refinement (HFR), interconnected through Adaptive View Region Selection (AVRS). First, the HFE comprehensively evaluates the significance of intra-view patches and views, driven by inter-view and intra-view attention. Then, the AVRS adaptively selects crucial patch tokens from different views to serve as sources of subtle local features. Finally, the HFR refines the 3D shape descriptor, capturing more discriminative global and subtle local features by leveraging both the view tokens and the selected crucial patch tokens. Extensive experiments on FG3D and ModelNet40 demonstrate the superiority of FG3DFormer in FGSC and meta-category 3D shape classification tasks.
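The abstract does not specify how AVRS scores or picks tokens. As a rough illustration of the general idea of keeping only the most important patch tokens, here is a minimal top-k selection sketch. The function name, the use of a single importance score per token, and the toy values are all assumptions.

```python
def select_top_k_tokens(tokens, scores, k):
    """Keep the k tokens with the highest importance scores, preserving
    their original order. `scores` stands in for attention-derived weights."""
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return [tokens[i] for i in sorted(top)]


# Toy example: four patch tokens from one view with hypothetical scores.
patch_tokens = ["p0", "p1", "p2", "p3"]
importance = [0.1, 0.7, 0.2, 0.9]
selected = select_top_k_tokens(patch_tokens, importance, k=2)
```

Keeping the original order of the selected tokens is a design choice made here for readability; a real model could also carry positional embeddings with each token instead.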

Keywords: Computer vision; Visualization; Solid modeling; Three-dimensional displays; Correlation; Shape; Signal processing; Transformers; Feature extraction; Speech processing

C2DFL: cross-view cross-layer discriminative feature learning for fine-grained 3D shape classification

Neural Computing and Applications, 2025

Authors: Jiang, Jinzhe; Bai, Jing; Ma, Xiangyu. Affiliations: The School of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China; The Key Laboratory of Images Processing and Pattern Recognition Laboratory, North Minzu University, Yinchuan 750021, China

Fine-grained 3D shape classification poses challenges in effectively capturing and integrating discriminative features residing in subtle local regions. Previous methods typically extract features independently from individual views of 3D shapes, with a focus on various strategies for fusing these extracted view features. However, this approach neglects inter-view correlations and potential redundancies among different views. In this study, we introduce C2DFL, which consists of two primary modules: cross-view discriminative feature extraction (CV-DFE) and cross-layer discriminative feature fusion (CL-DFF). CV-DFE integrates discriminative features by merging inputs from multiple views, mitigating limitations associated with isolated feature extraction. CL-DFF dynamically selects key tokens using a transformer model to interactively fuse discriminative features from various levels. Extensive experiments conducted on three categories of the FG3D dataset demonstrate the exceptional efficacy of C2DFL in capturing and integrating discriminative features of 3D shapes. The proposed method achieves state-of-the-art accuracy in fine-grained 3D shape classification (FGSC). © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2025.

Keywords: 3D shape classification; Cross-layer feature fusion; Cross-view attention; Fine-grained classification; Token selection; Transformer

DLS-HCAN: Duplex Label Smoothing Based Hierarchical Context-Aware Network for Fine-grained 3D Shape Classification

IEEE Transactions on Multimedia, 2025

Authors: Bai, Shaojin; Zheng, Liang; Bai, Jing; Ma, Xiangyu. Affiliations: North Minzu University, School of Computer Science and Engineering, Yinchuan 750021, China; Liupanshan Laboratory, Yinchuan 750021, China; North Minzu University, Key Laboratory of Images Processing and Pattern Recognition Laboratory, Commission: IPPRLab, Yinchuan 750021, China

Fine-grained 3D shape classification (FGSC) has garnered significant attention recently and has made notable advancements. However, due to high inter-class similarity and intra-class diversity, it is still a challenge for existing methods to capture subtle differences between subcategories. On the one hand, one-hot labels in the loss function are too hard to describe these data characteristics; on the other hand, local details are submerged in the global feature extraction process and the final network constraints, impacting classification results. In this paper, we propose a duplex label smoothing-based hierarchical context-aware network for fine-grained 3D shape classification, named DLS-HCAN. Specifically, DLS-HCAN first employs a hierarchical context-aware network (HCAN), in which the intra-view context attention mechanism (intra-ATT) and the inter-view context multilayer perceptron (inter-MLP) are designed to focus on and discern the beneficial local details. Subsequently, we propose a novel duplex label smoothing (DLS) regularization in which shape-level and view-level smooth labels are separately applied in two improved loss functions, adapting to the fine-grained data characteristics and considering the varying uniqueness of different views. Notably, our approach does not require additional annotation information. Experimental results and comparison with state-of-the-art methods demonstrate the superiority of our proposed DLS-HCAN for FGSC. In addition, our approach achieves comparable performance on the coarse-grained ModelNet40 dataset. © 2025 IEEE.
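To make the contrast between hard one-hot labels and smooth labels concrete, the sketch below shows standard uniform label smoothing with a cross-entropy loss. This is the textbook technique, not the paper's duplex (shape-level and view-level) variant, whose details are not given in the abstract. The function names and `eps` are chosen for illustration.

```python
import math


def smooth_labels(target, num_classes, eps=0.1):
    """Uniform label smoothing: move `eps` of the probability mass from the
    true class and spread it evenly over all classes."""
    off = eps / num_classes
    return [1.0 - eps + off if k == target else off for k in range(num_classes)]


def cross_entropy(probs, soft_target):
    """Cross-entropy between predicted probabilities and a (soft) target."""
    return -sum(t * math.log(p) for t, p in zip(soft_target, probs) if t > 0)


predicted = [0.7, 0.1, 0.1, 0.1]          # hypothetical softmax output
hard = smooth_labels(0, 4, eps=0.0)       # plain one-hot target
soft = smooth_labels(0, 4, eps=0.2)       # smoothed target
hard_loss = cross_entropy(predicted, hard)
soft_loss = cross_entropy(predicted, soft)
```

With a smoothed target, the loss also penalizes assigning very low probability to the other classes, which is the softening effect that one-hot labels lack.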

Keywords: Coarse-grained modeling

Visual Prompt Flexible-Modal Face Anti-Spoofing

IEEE Transactions on Dependable and Secure Computing, 2025, Vol. 22, No. 3, pp. 2597-2606

Authors: Yu, Zitong; Cai, Rizhao; Cui, Yawen; Liu, Ajian; Chen, Changsheng. Affiliations: Great Bay University, School of Computing and Information Technology, Dongguan 523000, China; Nanyang Technological University, ROSE Lab, School of EEE, Singapore 639798; Hong Kong Polytechnic University, Kowloon, Hong Kong; Chinese Academy of Sciences, University of Chinese Academy of Sciences, National Laboratory of Pattern Recognition, Institute of Automation, Beijing 100190, China; Shenzhen University, Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, College of Electronics and Information Engineering, Shenzhen 518060, China

Recently, vision transformer based multimodal learning methods have been proposed to improve the robustness of face anti-spoofing (FAS) systems. However, multimodal face data collected from the real world is often imperfect due to missing modalities from various imaging sensors. Flexible-modal FAS (Yu et al. 2023) has recently attracted more attention; it aims to develop a unified multimodal FAS model that is trained on complete multimodal face data but is insensitive to test-time missing modalities. In this paper, we tackle one main challenge in flexible-modal FAS, i.e., modalities that are missing during either training or testing in real-world situations. Inspired by the recent success of prompt learning in language models, we propose Visual Prompt flexible-modal FAS (VP-FAS), which learns modal-relevant prompts to adapt the frozen pre-trained foundation model to the downstream flexible-modal FAS task. Specifically, both vanilla visual prompts and residual contextual prompts are plugged into multimodal transformers to handle general missing-modality cases, while requiring less than 4% learnable parameters compared to training the entire model. Furthermore, missing-modality regularization is proposed to force models to learn consistent multimodal feature embeddings when some modalities are missing. Extensive experiments conducted on two multimodal FAS benchmark datasets demonstrate the effectiveness of our VP-FAS framework, which improves performance under various missing-modality cases while alleviating the requirement of heavy model re-training. © 2004-2012 IEEE.

Keywords: Alignment

Reflecting topology consistency and abnormality via learnable attentions for airway labeling

International Journal of Computer Assisted Radiology and Surgery, 2025, pp. 1-9

Authors: Li, Chenyu; Zhang, Minghui; Zhang, Chuyan; Gu, Yun. Affiliations: Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, China; Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China; Shanghai Key Laboratory of Flexible Medical Robotics, Tongren Hospital, Shanghai Jiao Tong University, Shanghai, China

Purpose: Accurate airway anatomical labeling is crucial for clinicians to identify and navigate complex bronchial structures during bronchoscopy. Automatic airway labeling is challenging due to significant anatomical variations. Previous methods are prone to generating inconsistent predictions, hindering preoperative planning and intraoperative navigation. This paper aims to enhance topological consistency and improve the detection of abnormal airway branches. Methods: We propose a transformer-based framework incorporating two modules: the soft subtree consistency (SSC) and the abnormal branch saliency (ABS). The SSC module constructs a soft subtree to capture clinically relevant topological relationships, allowing for flexible feature aggregation within and across subtrees. The ABS module facilitates interaction between node features and prototypes to distinguish abnormal branches, preventing erroneous feature aggregation between normal and abnormal nodes. Results: Evaluated on a challenging dataset characterized by severe airway deformities, our method achieves superior performance compared to state-of-the-art approaches. Specifically, it attains an 83.7% subsegmental accuracy, along with a 3.1% increase in segmental subtree consistency and a 45.2% increase in abnormal branch recall. Notably, the method demonstrates robust performance in cases with airway deformities, ensuring consistent and accurate labeling. Conclusion: The enhanced topological consistency and robust identification of abnormal branches provided by our method offer an accurate and robust solution for airway labeling, with potential to improve the precision and safety of bronchoscopy procedures. © CARS 2025.

Keywords: Airway anatomical labeling; Anomaly detection; Structural prior; Transformer

Ocean archaea PPI prediction with pretraining models

Proceedings of the 2025 5th International Conference on Bioinformatics and Intelligent Computing

Authors: Ying Zhang; Yuan Liu; Xiaoyong Pan; Hongbin Shen. Affiliations: Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China; Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China

ISBN (print): 9798400712203

Protein-Protein Interaction (PPI) provides important insights into the metabolic mechanisms of different biological processes. Although PPIs in some organisms have been investigated systematically, PPIs in ocean archaea remain largely unexplored. However, such species have special investigation value, since their adaptation to extreme living conditions may generate unique PPIs. In this paper, we aim to characterize and predict PPIs in ocean archaea to advance understanding of their metabolic networks. First, we collect all high-confidence ocean archaea PPIs from the STRING database and analyze the PPI network features, including centrality and enrichment analysis. The functional enrichment results for the largest connected subgraph in the PPI network show that most PPIs in our constructed dataset are related to the translation and transcription processes. Then, we generate an equal number of negative PPI pairs, whose members have either different subcellular locations or different GO terms. We also use the generated dataset to test the performance of three pretraining methods and their ensemble methods in the binary PPI prediction task. Our results suggest that ensemble methods could be applied to further improve models’ performance. Fine-tuned models trained on the ocean archaea dataset are expected to predict other ocean archaea PPIs that are not included in the STRING database and to deepen understanding of the ocean archaea PPI universe.
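The negative-pair construction described above can be sketched as follows. This is a minimal illustration under stated assumptions: "different GO terms" is read here as "no shared GO term", the function name and data layout are hypothetical, and the protein names, locations, and GO identifiers are toy placeholders rather than data from the paper.

```python
import random


def sample_negatives(positives, location, go_terms, seed=0):
    """Sample as many negative pairs as there are positive pairs.
    A pair qualifies if it is not a known positive and its members differ
    in subcellular location or share no GO term (assumed reading)."""
    rng = random.Random(seed)
    known = {frozenset(pair) for pair in positives}
    proteins = sorted(location)
    candidates = []
    for i, a in enumerate(proteins):
        for b in proteins[i + 1:]:
            if frozenset((a, b)) in known:
                continue
            if location[a] != location[b] or not (go_terms[a] & go_terms[b]):
                candidates.append((a, b))
    if len(candidates) < len(known):
        raise ValueError("not enough candidate negative pairs")
    return rng.sample(candidates, len(known))


# Toy placeholder data, not taken from the paper.
location = {"P1": "cytoplasm", "P2": "cytoplasm", "P3": "membrane", "P4": "membrane"}
go_terms = {"P1": {"GO:A"}, "P2": {"GO:A"}, "P3": {"GO:B"}, "P4": {"GO:B"}}
positives = [("P1", "P2"), ("P3", "P4")]
negatives = sample_negatives(positives, location, go_terms)
```

Fixing the random seed keeps the sampled negative set reproducible across runs.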

Keywords: Binary PPI prediction

Discovering the nuclear localization signal universe through a deep learning model with interpretable attention units

Patterns, 2025

Authors: Li, Yi-Fan; Pan, Xiaoyong; Shen, Hong-Bin. Affiliations: Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China

We describe NLSExplorer, an interpretable approach for nuclear localization signal (NLS) prediction. By utilizing the extracted information on nuclear-specific sites from the protein language model to assist in NLS detection, NLSExplorer achieves superior performance with greater than 10% improvement in the F1 score compared with existing methods on benchmark datasets and highlights other nuclear transport segments. We applied NLSExplorer to the nucleus-localized proteins in the Swiss-Prot database to extract valuable segments. A comprehensive analysis of these segments revealed a potential NLS landscape and uncovered features of nuclear transport segments across 416 species. This study introduces a powerful tool for exploring the NLS universe and provides a versatile network that can efficiently detect characteristic domains and motifs. © 2025 The Author(s)

Keywords: Deep learning

Efficient Image Super-Resolution With Feature Interaction Weighted Hybrid Network

IEEE Transactions on Multimedia, 2025, Vol. 27, pp. 2256-2267

Authors: Li, Wenjie; Li, Juncheng; Gao, Guangwei; Deng, Weihong; Yang, Jian; Qi, Guo-Jun; Lin, Chia-Wen. Affiliations: Beijing University of Posts and Telecommunications, Pattern Recognition and Intelligent System Laboratory, School of Artificial Intelligence, Beijing 100080, China; Shanghai University, School of Communication and Information Engineering, Shanghai 200444, China; Nanjing University of Posts and Telecommunications, IVIPLab, Institute of Advanced Technology, Nanjing 210046, China; Ministry of Education, Key Laboratory of Artificial Intelligence, Shanghai 200240, China; Soochow University, Provincial Key Laboratory for Computer Information Processing Technology, Suzhou 215006, China; Nanjing University of Science and Technology, School of Computer Science and Technology, Nanjing 210094, China; Westlake University, Research Center for Industries of the Future, School of Engineering, Hangzhou 310024, China; OPPO Research, Seattle, WA 98101, United States; National Tsing Hua University, Department of Electrical Engineering, Institute of Communications Engineering, Hsinchu 300044, Taiwan

Lightweight image super-resolution aims to reconstruct high-resolution images from low-resolution images at low computational cost. However, existing methods lose middle-layer features due to activation functions. To minimize the impact of intermediate feature loss on reconstruction quality, we propose a Feature Interaction Weighted Hybrid Network (FIWHN), which comprises a series of Wide-residual Distillation Interaction Blocks (WDIBs) as the backbone. Every third WDIB forms a Feature Shuffle Weighted Group (FSWG) by applying mutual information shuffle and fusion. Moreover, to mitigate the negative effects of intermediate feature loss, we introduce Wide Residual Weighting units within each WDIB. These units effectively fuse features of varying levels of detail through a Wide-residual Distillation Connection (WRDC) and a Self-Calibrating Fusion (SCF). To compensate for global feature deficiencies, we incorporate a Transformer and explore a novel architecture to combine CNN and Transformer. Through extensive experiments on low-level and high-level tasks, we show that our FIWHN achieves a favorable balance between performance and efficiency. © 1999-2012 IEEE.

Keywords: Transformers; Convolutional neural networks; Computational modeling; Feature extraction; Training; Superresolution; Adaptation models; Computer architecture; Telecommunications; Image reconstruction
