ISBN:
(Print) 9783031732928; 9783031732904
Artificial Intelligence (AI) has the potential to revolutionize diagnosis and segmentation in medical imaging. However, development and clinical implementation face multiple challenges, including limited data availability, lack of generalizability, and the need to incorporate multi-modal data effectively. A foundation model, a large-scale pre-trained AI model, offers a versatile base that can be adapted to a variety of specific tasks and contexts. Here, we present the VIsualization and Segmentation Masked AutoEncoder (VIS-MAE), novel model weights designed specifically for medical imaging. VIS-MAE is trained on a dataset of 2.5 million unlabeled images from various modalities (CT, MR, PET, X-ray, and ultrasound) using self-supervised learning, and is then adapted to classification and segmentation tasks using explicit labels. VIS-MAE outperforms several benchmark models in both in-domain and out-of-domain applications, and it is label-efficient: it achieves performance similar to other pre-trained weights while using a reduced amount (50% or 80%) of labeled training data. VIS-MAE represents a significant advancement in medical imaging AI, offering a generalizable and robust solution for improving segmentation and classification while reducing the data-annotation workload. The source code of this work is available at https://***/lzl199704/VIS-MAE.
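For readers unfamiliar with the pre-training recipe this abstract describes, the sketch below illustrates masked-image modeling in PyTorch: patch tokens are randomly masked, the encoder sees the corrupted sequence, and the loss is reconstruction error on the masked patches only. All names and hyperparameters here (patch size, embedding width, 0.75 mask ratio, the SimMIM-style single-stream simplification) are illustrative assumptions, not VIS-MAE's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMaskedAutoencoder(nn.Module):
    """Minimal masked-image-modeling sketch (SimMIM-style simplification)."""
    def __init__(self, img_size=224, patch=16, dim=256, mask_ratio=0.75):
        super().__init__()
        self.num_patches = (img_size // patch) ** 2   # e.g. 14 x 14 = 196 patches
        self.patch_dim = patch * patch                # flattened grayscale patch
        self.mask_ratio = mask_ratio
        self.embed = nn.Linear(self.patch_dim, dim)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos = nn.Parameter(torch.zeros(1, self.num_patches, dim))
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(dim, self.patch_dim)    # pixel-reconstruction head

    def forward(self, patches):                       # patches: (B, N, patch_dim)
        B, N, _ = patches.shape
        mask = torch.rand(B, N, device=patches.device) < self.mask_ratio
        tokens = self.embed(patches)
        # Replace masked patches with a learned token so the encoder must
        # infer the hidden content from the visible context.
        tokens = torch.where(mask.unsqueeze(-1),
                             self.mask_token.expand(B, N, -1), tokens)
        recon = self.head(self.encoder(tokens + self.pos))
        return F.mse_loss(recon[mask], patches[mask])  # loss on masked patches only

# Self-supervised pre-training consumes only unlabeled images; the trained
# encoder weights are then reused for segmentation/classification fine-tuning.
model = TinyMaskedAutoencoder()
loss = model(torch.randn(2, model.num_patches, model.patch_dim))
loss.backward()
```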
With the advent of deep neural networks, medical image analysis can support early detection and diagnosis of diseases in the human body. Several deep neural network methodologies have been implemented for quick and efficient analysis of medical images to detect and diagnose cancerous cell growth in any part of the body. To improve segmentation and classification accuracy, this paper proposes a framework comprising a modified DoubleU-Net for image segmentation and a PolyNet architecture for image classification. The modified DoubleU-Net is composed of two U-Net architectures, in which U-Net1 uses ResNet-50 as the encoder in place of the existing VGG-16, and Atrous Spatial Pyramid Pooling (ASPP) is replaced by the Waterfall Atrous Spatial Pooling (WASP) architecture in both U-Nets to improve semantic image segmentation. To classify the segmented medical images as benign or malignant, the PolyNet architecture is employed. Experiments on a brain tumor dataset and a lung cancer dataset analyze the performance of the proposed approach. The DoubleU-Net and the modified DoubleU-Net are evaluated using precision, recall, Intersection over Union (IoU), and Dice score as performance metrics. Experimental findings indicate that the modified DoubleU-Net outperforms the existing DoubleU-Net architecture on the segmentation metrics. The efficiency of the PolyNet classifier is evaluated against the VGG-16 and Inception-V3 classifiers in terms of accuracy, specificity, sensitivity, error rate, and computation time. The experimental results show that the PolyNet classifier performs better than VGG-16 and Inception-V3, with improved accuracy, specificity, sensitivity, and computation time.
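The encoder swap this abstract describes can be sketched with torchvision: a ResNet-50 backbone exposes the multi-scale feature maps that a U-Net decoder would consume as skip connections. Stage resolutions and channel counts below are standard ResNet-50 facts; the WASP module, the second U-Net, and the decoder wiring are omitted and would follow the paper.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class ResNet50Encoder(nn.Module):
    """ResNet-50 backbone repurposed as a U-Net encoder (skip-connection taps)."""
    def __init__(self, pretrained=False):
        super().__init__()
        r = resnet50(weights="IMAGENET1K_V2" if pretrained else None)
        self.stem = nn.Sequential(r.conv1, r.bn1, r.relu)  # 1/2 res, 64 ch
        self.pool = r.maxpool
        self.layer1, self.layer2 = r.layer1, r.layer2      # 1/4 (256 ch), 1/8 (512 ch)
        self.layer3, self.layer4 = r.layer3, r.layer4      # 1/16 (1024 ch), 1/32 (2048 ch)

    def forward(self, x):
        skips = []
        x = self.stem(x)
        skips.append(x)
        x = self.layer1(self.pool(x))
        skips.append(x)
        x = self.layer2(x)
        skips.append(x)
        x = self.layer3(x)
        skips.append(x)
        x = self.layer4(x)   # bottleneck; the paper feeds this through WASP
        return x, skips

enc = ResNet50Encoder()
bottleneck, skips = enc(torch.randn(1, 3, 256, 256))
print(bottleneck.shape, [s.shape[1] for s in skips])  # 2048-ch bottleneck + skip channels
```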
Multi-task learning for joint medical image segmentation and classification holds promise for enhancing diagnostic accuracy and reliability in clinical settings. Current approaches often rely on unidirectionally bootstrapping one task with a single high-level feature from another, which fails to fully leverage valuable information and can lead to suboptimal outcomes and diagnostic errors. Multi-task learning frameworks based on Convolutional Neural Networks (CNNs) or Vision Transformers (ViTs) have achieved significant success in medical image analysis; however, CNNs struggle with long sequence information, and ViTs are computationally intensive. Motivated by this, we propose a novel Uncertainty Bidirectional Guidance multi-task Mamba network (UBGM) for efficient and reliable medical image analysis. UBGM's encoder uses a Mamba structure, which excels at long-range modeling while maintaining computational efficiency with linear complexity. The uncertainty coarse-segmentation guidance module performs interactive learning between tasks, generating multiple high-level features for classification and coarse segmentation results by incorporating uncertainty. To better utilize segmentation information, we design an uncertainty classification decoder that produces category information and features to assist segmentation correction. True bidirectional guidance is achieved through the mutual assistance of both tasks, improving model performance. Experiments on public datasets demonstrate that UBGM outperforms existing benchmark models, showing the potential for high performance and reliability in multi-task networks.
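As a rough illustration of uncertainty-guided task interaction (not UBGM's actual modules), one simple mechanism is to convert coarse segmentation logits into a per-pixel predictive-entropy map and use it to gate the features shared with the classification branch. Everything below, including the gating step, is a hypothetical sketch; the Mamba encoder and the paper's guidance modules are not reproduced.

```python
import torch
import torch.nn.functional as F

def entropy_uncertainty(seg_logits):
    """seg_logits: (B, C, H, W) -> (B, 1, H, W) entropy normalized to [0, 1]."""
    probs = F.softmax(seg_logits, dim=1)
    ent = -(probs * torch.log(probs.clamp_min(1e-8))).sum(dim=1, keepdim=True)
    return ent / torch.log(torch.tensor(float(seg_logits.size(1))))  # max entropy = log C

# Hypothetical guidance step: down-weight shared features where the coarse
# segmentation is unsure, so the classifier attends to confident regions.
feats = torch.randn(2, 64, 32, 32)       # shared high-level features
seg_logits = torch.randn(2, 3, 32, 32)   # coarse 3-class segmentation logits
guided = feats * (1.0 - entropy_uncertainty(seg_logits))
```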
ISBN:
(Print) 9798350384734; 9798350384727
The Coronavirus disease 2019 (COVID-19) pandemic has presented unprecedented challenges to global healthcare systems, urgently calling for innovative diagnostic solutions. This paper introduces the Fully Automatic Detection of COVID-19 cases in medical Images of the Lung (FADCIL) system, a cutting-edge deep learning framework designed for rapid and accurate COVID-19 diagnosis from chest computed tomography (CT) images. By leveraging an architecture based on YOLO and 3D U-Net, FADCIL excels at identifying and quantifying lung injuries attributable to COVID-19 and distinguishing them from other pathologies. In real-world clinical environments, FADCIL achieves a Dice coefficient above 0.82, highlighting its robust performance and clinical relevance. FADCIL also enhances the reliability of COVID-19 assessment, empowering healthcare professionals to make informed decisions and effectively manage patient care. This paper outlines the FADCIL architecture and presents an in-depth analysis of quantitative and qualitative evaluation results on a novel dataset comprising over 1,000 CT scans. Furthermore, we provide public access to FADCIL's source code.
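The Dice coefficient reported above is the standard overlap metric between predicted and ground-truth masks. A minimal binary implementation for reference (the smoothing term is a conventional choice, not taken from the FADCIL paper):

```python
import torch

def dice_coefficient(pred, target, eps=1e-6):
    """pred, target: binary masks of identical shape (2D slices or 3D volumes)."""
    pred, target = pred.float().flatten(), target.float().flatten()
    intersection = (pred * target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

a = torch.tensor([[1, 1, 0], [0, 1, 0]])
b = torch.tensor([[1, 0, 0], [0, 1, 1]])
print(float(dice_coefficient(a, b)))  # 2*2 / (3+3) = 0.667
```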