Search Results - Inner Mongolia University Library

arXiv, 2025

Authors: Shao, Liangjing; Chen, Benshuang; Zhao, Shuting; Chen, Xinrong. Affiliations: Academy for Engineering & Technology, Fudan University, China; Shanghai Key Laboratory of Medical Image Computing and Computer-Assisted Intervention, Fudan University, China

Real-time ego-motion tracking for endoscopes is a significant task for efficient navigation and robotic automation of endoscopy. In this paper, a novel framework is proposed to perform real-time ego-motion tracking for endoscopes. First, a multi-modal visual feature learning network is proposed to perform relative pose prediction, in which the motion feature from the optical flow, the scene features, and the joint feature from two adjacent observations are all extracted for prediction. Because the concatenated image carries more correlation information in the channel dimension, a novel feature extractor is designed based on an attention mechanism to integrate multi-dimensional information from the concatenation of two consecutive frames. To extract a more complete feature representation from the fused features, a novel pose decoder is proposed to predict the pose transformation from the concatenated feature map at the end of the framework. Finally, the absolute pose of the endoscope is calculated from the relative poses. Experiments are conducted on three datasets of various endoscopic scenes, and the results demonstrate that the proposed method outperforms state-of-the-art methods. In addition, the inference speed of the proposed method exceeds 30 frames per second, which meets the real-time requirement. The project page is here: *** © 2025, CC BY.
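The final step of the abstract above, computing the absolute pose from relative poses, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each relative pose is a 4x4 homogeneous transform from frame k-1 to frame k, that poses are chained by right-multiplication, and the helper names `matmul4`, `identity4`, `translation`, and `accumulate_poses` are hypothetical.

```python
def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def identity4():
    """Return the 4x4 identity transform."""
    return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]


def translation(tx, ty, tz):
    """Build a pure-translation homogeneous transform (illustrative helper)."""
    t = identity4()
    t[0][3], t[1][3], t[2][3] = tx, ty, tz
    return t


def accumulate_poses(relative_poses, initial_pose=None):
    """Chain relative poses into absolute poses.

    Assumed convention: T_abs[k] = T_abs[k-1] @ T_rel[k].
    Returns one absolute pose per frame, including the initial frame.
    """
    pose = identity4() if initial_pose is None else initial_pose
    absolute = [pose]
    for rel in relative_poses:
        pose = matmul4(pose, rel)
        absolute.append(pose)
    return absolute


# Two predicted steps of one unit each along the z axis.
poses = accumulate_poses([translation(0, 0, 1), translation(0, 0, 1)])
print(poses[-1][2][3])
```

Because every absolute pose is a product of all earlier relative poses, prediction errors accumulate along the trajectory, which is why the accuracy of each relative pose estimate matters.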

Keywords: Endoscopy


Exploring CLIP’s Dense Knowledge for Weakly Supervised Semantic Segmentation


arXiv, 2025

Authors: Yang, Zhiwei; Meng, Yucong; Fu, Kexue; Tang, Feilong; Wang, Shuo; Song, Zhijian. Affiliations: Academy for Engineering and Technology, Fudan University, Shanghai 200433, China; Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, China; Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, China; Shandong Computer Science Center, National Supercomputer Center in Jinan, China

Weakly Supervised Semantic Segmentation (WSSS) with image-level labels aims to achieve pixel-level predictions using Class Activation Maps (CAMs). Recently, Contrastive Language-image Pre-training (CLIP) has been introduced in WSSS. However, recent methods primarily focus on image-text alignment for CAM generation, while CLIP’s potential in patch-text alignment remains unexplored. In this work, we propose ExCEL to explore CLIP’s dense knowledge via a novel patch-text alignment paradigm for WSSS. Specifically, we propose Text Semantic Enrichment (TSE) and Visual Calibration (VC) modules to improve the dense alignment across both text and vision modalities. To make text embeddings semantically informative, our TSE module applies Large Language Models (LLMs) to build a dataset-wide knowledge base and enriches the text representations with an implicit attribute-hunting process. To mine fine-grained knowledge from visual features, our VC module first proposes Static Visual Calibration (SVC) to propagate fine-grained knowledge in a non-parametric manner. Then Learnable Visual Calibration (LVC) is further proposed to dynamically shift the frozen features towards distributions with diverse semantics. With these enhancements, ExCEL not only retains CLIP’s training-free advantages but also significantly outperforms other state-of-the-art methods with much less training cost on PASCAL VOC and MS COCO. Code is available at https://***/zwyang6/ExCEL. Copyright © 2025, The Authors. All rights reserved.

Keywords: Semantic Segmentation


DM-Mamba: Dual-domain Multi-scale Mamba for MRI Reconstruction


arXiv, 2025

Authors: Meng, Yucong; Yang, Zhiwei; Song, Zhijian; Shi, Yonghong; Fu, Kexue. Affiliations: Digital Medical Research Center, School of Basic Medical Science, Fudan University, Shanghai 200032, China; Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, China; Academy of Engineering and Technology, Fudan University, Shanghai 200433, China; Jinan, China

The accelerated MRI reconstruction poses a challenging ill-posed inverse problem due to the significant undersampling in k-space. Deep neural networks, such as CNNs and ViT, have shown substantial performance improvements for this task while encountering the dilemma between global receptive fields and efficient computation. To this end, this paper pioneers exploring Mamba, a new paradigm for long-range dependency modeling with linear complexity, for efficient and effective MRI reconstruction. However, directly applying Mamba to MRI reconstruction faces three significant issues: (1) Mamba’s row-wise and column-wise scanning disrupts k-space’s unique spectrum, leaving its potential in k-space learning unexplored. (2) Existing Mamba methods unfold feature maps with multiple lengthy scanning paths, leading to long-range forgetting and high computational burden. (3) Mamba struggles with spatially-varying contents, resulting in limited diversity of local representations. To address these, we propose a dual-domain multi-scale Mamba for MRI reconstruction from the following perspectives: (1) We pioneer vision Mamba in k-space learning. A circular scanning is customized for spectrum unfolding, benefiting the global modeling of k-space. (2) We propose a multi-scale Mamba with an efficient scanning strategy in both image and k-space domains. It mitigates long-range forgetting and achieves a better trade-off between efficiency and performance. (3) We develop a local diversity enhancement module to improve the spatially-varying representation of Mamba. Extensive experiments are conducted on three public datasets for MRI reconstruction under various undersampling patterns. Comprehensive results demonstrate that our method significantly outperforms state-of-the-art methods with lower computational cost. Implementation code will be available in https://***/XiaoMengLiLiLi/DM-Mamba. Copyright © 2025, The Authors. All rights reserved.

Keywords: Deep neural networks


Weakly Semi-supervised Whole Slide Image Classification by Two-level Cross Consistency Supervision


arXiv, 2025

Authors: Qu, Linhao; Li, Shiman; Luo, Xiaoyuan; Liu, Shaolei; Guo, Qinhao; Wang, Manning; Song, Zhijian. Affiliations: Digital Medical Research Center, School of Basic Medical Science, Fudan University, Shanghai, China; Shanghai Key Lab of Medical Image Computing and Computer Assisted Intervention, Shanghai, China; Department of Gynecologic Oncology, Shanghai Cancer Center, Fudan University, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China

Computer-aided Whole Slide Image (WSI) classification has the potential to enhance the accuracy and efficiency of clinical pathological diagnosis. It is commonly formulated as a Multiple Instance Learning (MIL) problem, where each WSI is treated as a bag and the small patches extracted from the WSI are considered instances within that bag. However, obtaining labels for a large number of bags is a costly and time-consuming process, particularly when utilizing existing WSIs for new classification tasks. This limitation renders most existing WSI classification methods ineffective. To address this issue, we propose a novel WSI classification problem setting, more aligned with clinical practice, termed Weakly Semi-supervised Whole Slide Image Classification (WSWC). In WSWC, a small number of bags are labeled, while a significant number of bags remain unlabeled. The MIL nature of the WSWC problem, coupled with the absence of patch labels, distinguishes it from typical semi-supervised image classification problems, making existing algorithms for natural images unsuitable for directly solving the WSWC problem. In this paper, we present a concise and efficient framework, named CroCo, to tackle the WSWC problem through two-level Cross Consistency supervision. CroCo comprises two heterogeneous classifier branches capable of performing both instance classification and bag classification. The fundamental idea is to establish cross-consistency supervision at both the bag level and the instance level between the two branches during training. Extensive experiments conducted on four datasets demonstrate that CroCo achieves superior bag classification and instance classification performance compared to other methods when only a limited number of WSIs with bag labels are available. To the best of our knowledge, this paper is the first to present the WSWC problem and offers a successful solution. Copyright © 2025, The Authors. All rights reserved.
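The bag-and-instance formulation in the abstract above can be sketched as follows. This is a minimal illustration of the standard MIL assumption (a bag is positive if at least one of its instances is positive) using max-pooling over patch probabilities. It is not the CroCo method, and the function names `bag_score` and `classify_bag` as well as the 0.5 threshold are assumptions chosen for illustration.

```python
def bag_score(instance_probs):
    """Max-pooling MIL: the bag score is the highest instance probability.

    `instance_probs` holds one positive-class probability per patch
    (instance) extracted from a single slide (bag).
    """
    if not instance_probs:
        raise ValueError("a bag must contain at least one instance")
    return max(instance_probs)


def classify_bag(instance_probs, threshold=0.5):
    """Label the bag positive when its most suspicious patch crosses the threshold."""
    return bag_score(instance_probs) >= threshold


# A slide with one suspicious patch is positive; a slide with none is negative.
print(classify_bag([0.1, 0.9, 0.2]))
print(classify_bag([0.1, 0.2]))
```

Only the bag label is needed to supervise such a model, which is why patch-level labels are absent in this setting.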

Keywords: Semi-supervised learning


DDFP: Data-dependent frequency prompt for source free domain adaptation of medical image segmentation


Knowledge-Based Systems, 2025, vol. 324

Authors: Siqi Yin; Shaolei Liu; Manning Wang. Affiliations: Digital Medical Research Center, School of Basic Medical Science, Fudan University, Shanghai 200032, China; Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, China

Domain adaptation addresses the challenge of model performance degradation caused by domain gaps. In the typical setup for unsupervised domain adaptation, labeled data from a source domain and unlabeled data from a target domain are used to train a target model. However, access to labeled source domain data, particularly in medical datasets, can be restricted due to privacy policies. As a result, research has increasingly shifted to source-free domain adaptation (SFDA), which requires only a pretrained model from the source domain and unlabeled data from the target domain for adaptation. Existing SFDA methods often rely on domain-specific image style translation and self-supervision techniques to bridge the domain gap and train the target domain model. However, the quality of domain-specific style-translated images and pseudo-labels produced by these methods still leaves room for improvement. Moreover, training the entire model during adaptation can be inefficient under limited supervision. In this paper, we propose a novel SFDA framework to address these challenges. Specifically, to effectively mitigate the impact of the domain gap in the initial training phase, we introduce preadaptation to generate a preadapted model, which serves as an initialization of the target model and allows for the generation of high-quality enhanced pseudo-labels without introducing extra parameters. Additionally, we propose a data-dependent frequency prompt to more effectively translate target domain images into a source-like style. To further enhance adaptation, we employ a style-related layer fine-tuning strategy, specifically designed for SFDA, to train the target model using the prompted target domain images and pseudo-labels. Extensive experiments on cross-modality abdominal and cardiac SFDA segmentation tasks demonstrate that our proposed method outperforms existing state-of-the-art methods. Our code is available online.


Decoupled deep Hough voting for point cloud registration


Frontiers of Computer Science, 2024, vol. 18, no. 2, pp. 147-155

Authors: Mingzhi YUAN; Kexue FU; Zhihao LI; Manning WANG. Affiliations: Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China; Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, China

Estimating rigid transformation using noisy correspondences is critical to feature-based point cloud ***, a series of studies have attempted to combine traditional robust model fitting with deep *** them, DHVR proposed a Hough voting-based method, achieving new state-of-the-art ***, we find voting on rotation and translation simultaneously hinders achieving better ***, we propose a new Hough voting-based method, which decouples rotation and translation ***, we first utilize Hough voting and a neural network to estimate *** based on good initialization on rotation, we can easily obtain accurate rigid *** experiments on 3DMatch and 3DLoMatch datasets show that our method achieves performance comparable to the state-of-the-art *** further demonstrate the generalization of our method by experimenting on the KITTI dataset.

Keywords: point cloud registration; robust model fitting; deep learning; Hough voting


A learnable self-supervised task for unsupervised domain adaptation on point cloud classification and segmentation


Frontiers of Computer Science, 2023, vol. 17, no. 6, pp. 147-149

Authors: Shaolei LIU; Xiaoyuan LUO; Kexue FU; Manning WANG; Zhijian SONG. Affiliations: Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, China; Digital Medical Research Center, School of Basic Medical Science, Fudan University, Shanghai 200032, China

1 Introduction. Deep neural networks have exhibited excellent performance in supervised tasks on point clouds, such as classification, segmentation [1] and registration [2]. In supervised learning schemes, manual labeling of massive point clouds is needed for model ***, point clouds captured from different scenarios exhibit inevitable distribution discrepancy, and models trained on one domain always generalize badly to another *** reduce the domain distribution discrepancy, many studies [3–6] have emerged for point cloud unsupervised domain adaptation (UDA) by learning domain-invariant features, where [5] proposed using adaptive nodes to align the local features between the source and the target domains [3,4], and [6] proposed utilizing self-supervised tasks to help capture highly transferable feature representations.

Keywords: point cloud; utilizing


SS-Pro: a simplified Siamese contrastive learning approach for protein surface representation


Frontiers of Computer Science, 2024, vol. 18, no. 5, pp. 243-245

Authors: Ao SHEN; Mingzhi YUAN; Yingfan MA; Manning WANG. Affiliations: Digital Medical Research Center, School of Basic Medical Science, Fudan University, Shanghai 200032, China; Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, China

Protein surface serves as an important representation of protein structure, revealing how a protein interacts with other biomolecules to perform its *** forms the basis for pharmaceutical and fundamental biological research [1]. Data-driven deep learning methods in protein surface representation face challenges of label scarcity, since labeled data are typically obtained through wet lab experiments.

Keywords: revealing; representation; simplified


An efficient dual-branch framework via implicit self-texture enhancement for arbitrary-scale histopathology image super-resolution


Scientific Reports, 2025, vol. 15, no. 1, pp. 1-18

Authors: Minghong Duan; Linhao Qu; Manning Wang; Chenxi Zhang; Zhijian Song; Zhiwei Yang. Affiliations: Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China; Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, China; Academy for Engineering and Technology, Fudan University, Shanghai 200433, China

High-quality whole-slide scanning is expensive, complex, and time-consuming, thus limiting the acquisition and utilization of high-resolution histopathology images in daily clinical work. Deep learning-based single-image super-resolution (SISR) techniques provide an effective way to solve this problem. However, the existing SISR models applied in histopathology images can only work in fixed integer scaling factors, decreasing their applicability. Though methods based on implicit neural representation (INR) have shown promising results in arbitrary-scale super-resolution (SR) of natural images, applying them directly to histopathology images is inadequate because they have unique fine-grained image textures different from natural images. Thus, we propose an Implicit Self-Texture Enhancement-based dual-branch framework (ISTE) for arbitrary-scale SR of histopathology images to address this challenge. The proposed ISTE contains a feature aggregation branch and a texture learning branch. We employ the feature aggregation branch to enhance the learning of the local details for SR images while utilizing the texture learning branch to enhance the learning of high-frequency texture details. Then, we design a two-stage texture enhancement strategy to fuse the features from the two branches to obtain the SR images. Experiments on publicly available datasets, including TMA, HistoSR, and the TCGA lung cancer datasets, demonstrate that ISTE outperforms existing fixed-scale and arbitrary-scale SR algorithms across various scaling factors. Additionally, extensive experiments have shown that the histopathology images reconstructed by the proposed ISTE are applicable to downstream pathology image analysis tasks.


Evaluation of uncertainty estimation methods in medical image segmentation: Exploring the usage of uncertainty in clinical deployment


Computerized Medical Imaging and Graphics, 2025, vol. 124

Authors: Shiman Li; Mingzhi Yuan; Xiaokun Dai; Chenxi Zhang. Affiliations: Digital Medical Research Center, School of Basic Medical Science, Fudan University, Shanghai 200032, China; Shanghai Key Lab of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, China; Digital Medical Research Center, Academy for Engineering and Technology, Fudan University, Shanghai 200032, China

Uncertainty estimation methods are essential for the application of artificial intelligence (AI) models in medical image segmentation, particularly in addressing reliability and feasibility challenges in clinical deployment. Despite their significance, the adoption of uncertainty estimation methods in clinical practice remains limited due to the lack of a comprehensive evaluation framework tailored to their clinical usage. To address this gap, a simulation of uncertainty-assisted clinical workflows is conducted, highlighting the roles of uncertainty in model selection, sample screening, and risk visualization. Furthermore, uncertainty evaluation is extended to pixel, sample, and model levels to enable a more thorough assessment. At the pixel level, the Uncertainty Confusion Metric (UCM) is proposed, utilizing density curves to improve robustness against variability in uncertainty distributions and to assess the ability of pixel uncertainty to identify potential errors. At the sample level, the Expected Segmentation Calibration Error (ESCE) is introduced to provide more accurate calibration aligned with Dice, enabling more effective identification of low-quality samples. At the model level, the Harmonic Dice (HDice) metric is developed to integrate uncertainty and accuracy, mitigating the influence of dataset biases and offering a more robust evaluation of model performance on unseen data. Using this systematic evaluation framework, five mainstream uncertainty estimation methods are compared on organ and tumor datasets, providing new insights into their clinical applicability. Extensive experimental analyses validated the practicality and effectiveness of the proposed metrics. This study offers clear guidance for selecting appropriate uncertainty estimation methods in clinical settings, facilitating their integration into clinical workflows and ultimately improving diagnostic efficiency and patient outcomes.

