Search results - Inner Mongolia University Library

CDFI: Cross Domain Feature Interaction for Robust Bronchi Lumen Detection

IEEE International Conference on Robotics and Automation (ICRA)

Authors: Jiasheng Xu, Tianyi Zhang, Yangqian Wu, Jie Yang, Guang-Zhong Yang, Yun Gu. Affiliations: Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, China; Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China; MOE Key Laboratory of System Control and Information Processing, Shanghai, P.R. China; Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai, China

Endobronchial intervention is increasingly used as a minimally invasive means of treating pulmonary diseases. To reduce the difficulty of manipulation in complex airway networks, robust lumen detection is essential for intraoperative guidance. However, existing detection methods are sensitive to the visual artifacts that are inevitable during surgery. In this work, a cross domain feature interaction (CDFI) network is proposed to extract the structural features of lumens, as well as to provide artifact cues to characterize the visual features. To effectively extract the structural and artifact features, the Quadruple Feature Constraints (QFC) module is designed to constrain the intrinsic connections of samples with varying imaging quality. Furthermore, we design a Guided Feature Fusion (GFF) module to supervise the model for adaptive feature fusion based on different types of artifacts. Results show that the features extracted by the proposed method preserve the structural information of the lumen in the presence of large visual variations, yielding much-improved lumen detection accuracy.

Optimal Initialization Conditions Discovery to Improve Clustering Based Image Segmentation

SSRN, 2022

Authors: Khan, Zubair; Khan, Tehreem; Yang, Jie; Tu, Enmei. Affiliations: Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China; Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, China; University of Central Punjab, Bahawalpur, Pakistan

In this paper, we address the most important factor that degrades the performance of clustering algorithms: the choice of initialization conditions. We propose an efficient approach for discovering optimal initialization conditions for clustering-driven high-quality image segmentation (EAIS). The proposed approach solves this key clustering issue by using image histograms to determine the optimal initialization conditions for pixel clusters (segmentation). It comprises five modules: deep image reconstruction, intra-histogram peaks determination, inter-histogram peaks association, mutual consensus-oriented cluster seeds merging, and morphological reconstruction-driven spatial post-processing. Diverse experimental results on the BSDS500 benchmark validate that our proposed approach outperforms state-of-the-art (SOTA) methods in both segmentation quality and computational efficiency. © 2022, The Authors. All rights reserved.

Keywords: Merging
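
The idea in the abstract above of seeding pixel clusters from histogram peaks can be illustrated with a minimal sketch. This is not the authors' EAIS pipeline: it covers only a simplified, single-histogram version of peak determination followed by a plain 1-D k-means, and the function names `histogram_peak_seeds` and `kmeans_1d`, the smoothing window, and the `min_separation` parameter are all assumptions made for illustration.

```python
import numpy as np


def histogram_peak_seeds(gray, bins=256, min_separation=10, max_seeds=8):
    """Return intensities at the tallest, well-separated histogram peaks."""
    hist, edges = np.histogram(gray, bins=bins, range=(0, 256))
    # Light smoothing so single-bin noise does not register as a peak.
    smooth = np.convolve(hist, np.ones(5) / 5.0, mode="same")
    centers = (edges[:-1] + edges[1:]) / 2.0
    # Local maximum: strictly above the left neighbour, not below the right.
    is_peak = np.zeros(bins, dtype=bool)
    is_peak[1:-1] = (smooth[1:-1] > smooth[:-2]) & (smooth[1:-1] >= smooth[2:])
    candidates = np.flatnonzero(is_peak)
    # Greedily keep the tallest peaks that are far enough apart.
    order = candidates[np.argsort(smooth[candidates])[::-1]]
    kept = []
    for idx in order:
        if all(abs(idx - k) >= min_separation for k in kept):
            kept.append(idx)
        if len(kept) == max_seeds:
            break
    return np.sort(centers[np.array(kept, dtype=int)])


def kmeans_1d(values, seeds, iters=20):
    """Plain 1-D k-means started from the given seeds (hypothetical helper)."""
    centers = np.asarray(seeds, dtype=float).copy()
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for k in range(len(centers)):
            members = values[labels == k]
            if members.size:
                centers[k] = members.mean()
    # Final assignment against the converged centers.
    labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
    return labels, centers


# Synthetic "image": two intensity populations around 60 and 190.
rng = np.random.default_rng(0)
gray = np.concatenate([rng.normal(60, 5, 5000), rng.normal(190, 5, 5000)])
gray = np.clip(gray, 0, 255)

seeds = histogram_peak_seeds(gray, max_seeds=2)
labels, centers = kmeans_1d(gray, seeds)
```

Because the seeds already sit near the intensity modes, the clustering step has little left to do, which is the intuition behind choosing initialization conditions from the histogram rather than at random.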

MambaMIM: Pre-training Mamba with State Space Token-interpolation

arXiv, 2024

Authors: Tang, Fenghe; Nian, Bingkun; Li, Yingtai; Yang, Jie; Wei, Liu; Zhou, S. Kevin. Affiliations: School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230026, China; Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, Jiangsu 215123, China; Department of Automation, Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China

Generative self-supervised learning demonstrates outstanding representation learning capabilities in both Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). However, there are currently no generative pre-training methods for selective state space models (Mamba) that can handle long-range dependencies effectively. To address this challenge, we introduce a generative self-supervised learning method for Mamba (MambaMIM) based on Selective Structure State Space Sequence Token-interpolation (S6T), a general-purpose pre-training method for arbitrary Mamba architectures. MambaMIM incorporates a bottom-up 3D hybrid masking strategy in the encoder to maintain masking consistency across different architectures. Additionally, S6T is employed to learn causal relationships within the masked sequence in the state space. MambaMIM can be used on any single or hybrid Mamba architecture to enhance its long-range representation capability. Extensive downstream experiments demonstrate the feasibility and benefit of pre-training Mamba for medical image tasks. The code is available at: https://***/FengheTan9/MambaMIM. Copyright © 2024, The Authors. All rights reserved.

Keywords: Self-supervised learning

Robot Debater: Debate-styled Text Auto-generation System Based on Large Foundation Language Models

IEEE International Conference on Pattern Recognition and Machine Learning (PRML)

Authors: Yu Zhu, Yijun Ling, Xufeng Ling, Jie Yang. Affiliations: Shanghai Library (Institute of Scientific and Technical Information of Shanghai), Shanghai, China; Shanghai Xiangming High School, Shanghai, China; School of Artificial Intelligence, Shanghai Normal University Tianhua College, Shanghai, China; Institute of Image Processing and Pattern Recognition, Shanghai Jiaotong University, Shanghai, China

We use a large foundation language model, fine-tuned on debate corpora, to develop a robot debater application. To address the immense computational power required by large base language models, this study takes advantage of the low-rank adaptation property prevalent in domain expert knowledge. By applying Low-Rank Adaptation and fine-tuning with a dedicated dataset, the computational load is reduced to just one-thousandth of what is needed for a large language model, greatly expanding the application scenarios of robot debaters based on large language models. In view of the characteristics of debate competitions, the model can preset a variety of debate scenarios and supports personalized debate processes. It employs intelligent voice recognition combined with a multi-channel voice input method, allowing precise localization of different human debaters and improving the accuracy of voice input recognition. The system supports multiple large-scale language generation models and a variety of voice broadcasting systems, including male and female voice styles and a range of voice emotions. The model can be applied to debate competitions held in universities, high schools, and various industries, and supports both human-machine and machine-to-machine debates.
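
The low-rank adaptation technique named in the abstract above can be sketched generically as follows. This is a minimal NumPy illustration of the general idea (a frozen weight matrix plus a trainable low-rank update), not the system described in the paper; the class name `LoRALinear`, the layer size, and the rank are assumptions chosen for the example, and the resulting parameter ratio is specific to those choices.

```python
import numpy as np


class LoRALinear:
    """Linear layer y = x W^T with a trainable low-rank update B A (sketch)."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight  # pretrained weight, kept frozen
        # A starts small and random, B starts at zero, so the update B A
        # is zero before any fine-tuning step has been taken.
        self.A = rng.normal(0.0, 0.01, size=(rank, in_dim))
        self.B = np.zeros((out_dim, rank))
        self.scale = alpha / rank

    def forward(self, x):
        base = x @ self.weight.T
        update = (x @ self.A.T) @ self.B.T
        return base + self.scale * update

    def trainable_params(self):
        return self.A.size + self.B.size


rng = np.random.default_rng(1)
W = rng.normal(size=(1024, 1024))  # hypothetical pretrained layer
layer = LoRALinear(W, rank=4)
x = rng.normal(size=(2, 1024))
y = layer.forward(x)

# Fraction of this layer's parameters that would be trained.
ratio = layer.trainable_params() / W.size
```

Only `A` and `B` would receive gradient updates during fine-tuning, so the number of trained parameters scales with the rank rather than with the full size of `W`.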

SRSNetwork: Siamese Reconstruction-Segmentation Networks based on Dynamic-Parameter Convolution

arXiv, 2023

Authors: Nian, Bingkun; Tang, Fenghe; Ding, Jianrui; Zhang, Pingping; Yang, Jie; Zhou, S. Kevin; Liu, Wei. Affiliations: Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, China; School of Biomedical Engineering, Suzhou Institute for Advanced Research, University of Science and Technology of China, China; School of Computer Science and Technology, Harbin Institute of Technology, China; School of Artificial Intelligence, Dalian University of Technology, China

In this paper, we present a high-performance deep neural network for weak target image segmentation, including medical image segmentation and infrared image segmentation. To this end, this work analyzes the existing dynamic convolutions and proposes dynamic parameter convolution (DPConv). Furthermore, it reevaluates the relationship between reconstruction tasks and segmentation tasks from the perspective of DPConv, leading to the proposal of a dual-network model called the Siamese Reconstruction-Segmentation Network (SRSNet). The proposed model is not only a universal network but also enhances the segmentation performance without altering its structure, leveraging the reconstruction task. Additionally, as the amount of training data for the reconstruction network increases, the performance of the segmentation network also improves synchronously. On seven datasets including five medical datasets and two infrared image datasets, our SRSNet consistently achieves the best segmentation results. The code is released at https://***/fidshu/SRSNet. Copyright © 2023, The Authors. All rights reserved.

Keywords: Convolution

Hybrid Data-Free Knowledge Distillation

arXiv, 2024

Authors: Tang, Jialiang; Chen, Shuo; Gong, Chen. Affiliations: School of Computer Science and Engineering, Nanjing University of Science and Technology, China; Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, China; Jiangsu Key Laboratory of Image and Video Understanding for Social Security, China; Center for Advanced Intelligence Project, RIKEN, Japan; Department of Automation, Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, China

Data-free knowledge distillation aims to learn a compact student network from a pre-trained large teacher network without using the original training data of the teacher network. Existing collection-based and generation-based methods train student networks by collecting massive real examples and generating synthetic examples, respectively. However, they inevitably become weak in practical scenarios due to the difficulties in gathering or emulating sufficient real-world data. To solve this problem, we propose a novel method called Hybrid Data-Free Distillation (HiDFD), which leverages only a small amount of collected data as well as generates sufficient examples for training student networks. Our HiDFD comprises two primary modules, i.e., the teacher-guided generation and student distillation. The teacher-guided generation module guides a Generative Adversarial Network (GAN) by the teacher network to produce high-quality synthetic examples from very few real-world collected examples. Specifically, we design a feature integration mechanism to prevent the GAN from overfitting and facilitate the reliable representation learning from the teacher network. Meanwhile, we drive a category frequency smoothing technique via the teacher network to balance the generative training of each category. In the student distillation module, we explore a data inflation strategy to properly utilize a blend of real and synthetic data to train the student network via a classifier-sharing-based feature alignment technique. Intensive experiments across multiple benchmarks demonstrate that our HiDFD can achieve state-of-the-art performance using 120 times less collected data than existing methods. Code is available at https://***/tangjialiang97/HiDFD. Copyright © 2024, The Authors. All rights reserved.

Keywords: Students
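
For readers unfamiliar with the teacher-student setup mentioned in the abstract above, the classic temperature-softened distillation loss can be sketched as below. This is the generic formulation, not HiDFD itself: the abstract describes a classifier-sharing-based feature alignment technique whose details are not given here, and the function names and the temperature value are assumptions for illustration.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Mean KL(teacher || student) on softened outputs, scaled by T^2."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    return float(kl.mean() * temperature ** 2)


teacher = np.array([[5.0, 1.0, 0.0], [0.5, 3.0, 0.2]])
loss_same = distillation_loss(teacher.copy(), teacher)          # student matches
loss_diff = distillation_loss(np.zeros_like(teacher), teacher)  # uninformed student
```

The loss vanishes when the student reproduces the teacher's softened predictions and grows as the two distributions diverge.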

Unsupervised Difference Learning for Noisy Rigid Image Alignment

arXiv, 2022

Authors: Chen, Yu-Xuan; Feng, Dagan; Shen, Hong-Bin. Affiliations: Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China; School of Computer Science, University of Sydney, Sydney 2006, Australia

Rigid image alignment is a fundamental task in computer vision, but traditional algorithms are either too sensitive to noise or too time-consuming. Recent unsupervised image alignment methods based on spatial transformer networks show improved performance on clean images but do not achieve satisfactory performance on noisy images because of their heavy reliance on pixel value comparisons. To handle such challenging applications, we report a new unsupervised difference learning (UDL) strategy and apply it to rigid image alignment. UDL exploits the quantitative properties of regression tasks and converts the original unsupervised problem into a pseudo-supervised problem. Under the new UDL-based image alignment pipeline, rotation can be accurately estimated on both clean and noisy images, and translations can then be easily solved. Experimental results on both natural and cryo-EM images demonstrate the efficacy of our UDL-based unsupervised rigid image alignment method. Copyright © 2022, The Authors. All rights reserved.

Keywords: Image enhancement
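
The abstract above notes that once rotation is known, translation "can then be easily solved". One common way to do this, shown here only as a sketch, is FFT-based cross-correlation; the abstract does not say which method the authors use, so this choice, the function name `estimate_translation`, and the restriction to integer circular shifts are all assumptions.

```python
import numpy as np


def estimate_translation(reference, moving):
    """Return (dy, dx) so that np.roll(moving, (dy, dx), axis=(0, 1))
    best matches reference under circular cross-correlation."""
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moving)
    # Circular cross-correlation computed in the frequency domain.
    corr = np.fft.ifft2(f_ref * np.conj(f_mov)).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Map peak indices past the midpoint to negative shifts.
    shifts = [p if p <= n // 2 else p - n for p, n in zip(peak, corr.shape)]
    return tuple(int(s) for s in shifts)


rng = np.random.default_rng(0)
reference = rng.normal(size=(64, 64))
# Shift the reference and add noise to imitate a noisy, misaligned image.
moving = np.roll(reference, (-5, 7), axis=(0, 1))
moving = moving + rng.normal(scale=0.5, size=moving.shape)

shift = estimate_translation(reference, moving)
aligned = np.roll(moving, shift, axis=(0, 1))
```

Because the correlation sums over every pixel, the peak location tends to stay stable under moderate noise, which is consistent with treating translation as the easier half of the rigid alignment problem.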

Application of an Improved Focal Loss in Vehicle Detection

19th International Conference on Artificial Intelligence and Soft Computing, ICAISC 2020

Authors: He, Xuanlin; Yang, Jie; Kasabov, Nikola. Affiliations: Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China; Auckland University of Technology, Auckland, New Zealand

ISBN: (print) 9783030614003

Object detection is an important and fundamental task in computer vision. Recently, the emergence of deep neural networks has brought considerable progress in object detection. Deep neural network object detectors can be grouped into two broad categories: two-stage detectors and one-stage detectors. One-stage detectors are faster than two-stage detectors. However, they suffer from a severe foreground-background class imbalance during training, which lowers accuracy. RetinaNet is a one-stage detector with a novel loss function named Focal Loss, which reduces the effect of class imbalance. As a result, RetinaNet outperforms all the two-stage and one-stage detectors in terms of accuracy. The main idea of focal loss is to add a modulating factor to the cross-entropy loss, which down-weights the loss of easy examples during training and thus focuses on the hard examples. However, cross-entropy loss only considers the loss of the ground-truth classes and thus cannot gain loss feedback from the false classes. Consequently, cross-entropy loss does not achieve the best convergence. In this paper, we propose a new loss function named Dual Cross-Entropy Focal Loss, which improves on the focal loss. Dual cross-entropy focal loss adds a modulating factor to the dual cross-entropy loss so that it focuses on the hard samples. Dual cross-entropy loss is an improved variant of cross-entropy loss that gains loss feedback from both the ground-truth classes and the false classes. We changed the loss function of RetinaNet from focal loss to our dual cross-entropy focal loss and performed experiments on a small vehicle dataset. The experimental results show that our new loss function improves vehicle detection performance. © 2020, Springer Nature Switzerland AG.

Keywords: Object detection
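
The modulating factor described in the abstract above can be made concrete with a short sketch of the original binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), as used in RetinaNet. The paper's Dual Cross-Entropy Focal Loss is not reproduced here because the abstract does not give its formula; the example probabilities are made up for illustration.

```python
import numpy as np


def focal_loss(probs, targets, gamma=2.0, alpha=0.25):
    """Per-example binary focal loss.

    probs   -- predicted probability of the positive class
    targets -- 1 for positive examples, 0 for negative examples
    """
    probs = np.clip(probs, 1e-7, 1 - 1e-7)
    # p_t is the probability the model assigns to the true class.
    p_t = np.where(targets == 1, probs, 1 - probs)
    alpha_t = np.where(targets == 1, alpha, 1 - alpha)
    # (1 - p_t)^gamma shrinks the loss of well-classified (easy) examples.
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)


# One easy positive (p = 0.95) and one hard positive (p = 0.10).
probs = np.array([0.95, 0.10])
targets = np.array([1, 1])

fl = focal_loss(probs, targets)
ce = -np.log(probs)  # plain cross-entropy for comparison
```

Comparing `fl / ce` for the two examples shows the effect the abstract describes: the easy example's loss is scaled down far more than the hard example's, so training gradients are dominated by the hard cases.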

Function Projective Lag Synchronization of Chaotic Systems with Certain Parameters via Adaptive-impulsive Control

International Journal of Automation and Computing, 2019, Vol. 16, No. 2, pp. 238-247

Authors: Xiu-Li Chai, Zhi-Hua Gan. Affiliations: Institute of Image Processing and Pattern Recognition, Henan University; Institute of Intelligent Network System, School of Software, Henan University

A new method is presented to study the function projective lag synchronization (FPLS) of chaotic systems via adaptive-impulsive control. To achieve synchronization, suitable nonlinear adaptive-impulsive controllers are designed. Based on Lyapunov stability theory and impulsive control techniques, effective sufficient conditions are derived to ensure that the drive system and the response system rapidly become lag synchronized up to the given scaling function matrix. Numerical simulations are presented to verify the effectiveness and feasibility of the analytical results.

Keywords: Function projective lag synchronization (FPLS); adaptive-impulsive; chaotic systems; numerical simulation; Lyapunov stability theory
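
For orientation, the synchronization goal named in the abstract above is usually stated as follows. The notation is an assumption, since the abstract gives no symbols: x is the drive state, y the response state, u the controller, tau > 0 the lag, and M(t) the scaling function matrix.

```latex
% Drive and response systems (notation assumed for illustration)
\dot{x}(t) = f\bigl(x(t)\bigr), \qquad
\dot{y}(t) = g\bigl(y(t)\bigr) + u(t)

% Lag synchronization error with scaling function matrix M(t)
e(t) = y(t) - M(t)\, x(t - \tau)

% FPLS is achieved when the error vanishes asymptotically
\lim_{t \to \infty} \lVert e(t) \rVert = 0
```

A Lyapunov-based argument of the kind the abstract mentions would typically show that a positive-definite function of e(t) decreases along trajectories and across impulses, which implies the limit above.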