
Text-Guided Synthesis in Medical Multimedia Retrieval: A Framework for Enhanced Colonoscopy Image Classification and Segmentation

Authors: Ejiga Peter, Ojonugwa Oluwafemi; Adeniran, Opeyemi Taiwo; John-Otumu, Adetokunbo MacGregor; Khalifa, Fahmi; Rahman, Md Mahmudur

Affiliations: Morgan State Univ, Sch Comp Math & Nat Sci, Dept Comp Sci, Baltimore, MD 21251, USA; Morgan State Univ, Dept Elect & Comp Engn, Baltimore, MD 21251, USA; Fed Univ Technol Owerri, Dept Informat Technol, Owerri 460116, Imo, Nigeria

Publication: Algorithms

Year/Volume/Issue: 2025, Vol. 18, No. 3

Pages: 155


Funding: National Science Foundation (NSF) grant 2131307 (CISE-MSI: DP: IIS: III); National Institutes of Health (NIH) Office of the Director, Common Fund [1OT2OD032581-01]

Keywords: medical imaging synthesis; polyp detection; text-to-image generation; image segmentation; generative AI; medical image synthesis; colorectal cancer detection; data augmentation; synthetic colonoscopy images; DreamBooth; Stable Diffusion; Low-Rank Adaptation (LoRA); polyp segmentation; Feature Pyramid Network; Vision Transformer; Segment Anything Model; medical diagnostic models; healthcare AI

Abstract: The lack of extensive, varied, and thoroughly annotated datasets impedes the advancement of artificial intelligence (AI) for medical applications, especially colorectal cancer detection. Models trained on data with limited diversity often display biases, especially when applied to disadvantaged groups. Generative models (e.g., DALL-E 2, the Vector-Quantized Generative Adversarial Network (VQ-GAN)) have been used to generate images, but not colonoscopy data, for intelligent data augmentation. This study developed an effective method for producing synthetic colonoscopy image data, which can be used to train advanced medical diagnostic models for robust colorectal cancer detection and treatment. Text-to-image synthesis was performed using fine-tuned visual large language models (LLMs): Stable Diffusion with DreamBooth Low-Rank Adaptation produced authentic-looking images, with an average Inception Score of 2.36 across three datasets. The validation accuracies of the classification models Big Transfer (BiT), Fixed Resolution Residual Next Generation Network (FixResNeXt), and Efficient Neural Network (EfficientNet) were 92%, 91%, and 86%, respectively, while the Vision Transformer (ViT) and Data-Efficient Image Transformer (DeiT) both reached 93%. For polyp segmentation, ground-truth masks were generated using the Segment Anything Model (SAM), and five segmentation models (U-Net, Pyramid Scene Parsing Network (PSPNet), Feature Pyramid Network (FPN), Link Network (LinkNet), and Multi-scale Attention Network (MANet)) were adopted. FPN produced excellent results, with an Intersection over Union (IoU) of 0.64, an F1 score of 0.78, a recall of 0.75, and a Dice coefficient of 0.77, demonstrating strong performance in both segmentation accuracy and overlap metrics, with particularly balanced detection capability as shown by the high F1 score and Dice coefficient. This highlights how AI-generated medical images can improve colonoscopy analysis.
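The abstract reports an average Inception Score of 2.36 for the generated images. As a reminder of what that metric measures, here is a minimal sketch of its definition, IS = exp(E_x[KL(p(y|x) || p(y))]): the classifier that would normally produce p(y|x) (an Inception-v3 network) is abstracted away, and the probability vectors below are toy inputs, not the paper's data.

```python
import math

def inception_score(probs):
    """Inception Score from per-image class probabilities p(y|x).

    probs: list of probability vectors, one per generated image, each
    summing to 1 (normally the softmax output of Inception-v3).
    IS = exp( mean over images of KL(p(y|x) || p(y)) ),
    where p(y) is the marginal: the mean of p(y|x) over all images.
    """
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y) across the generated set.
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]
    # Average KL divergence of each image's distribution from the marginal.
    kl_sum = 0.0
    for p in probs:
        kl_sum += sum(pj * math.log(pj / mj)
                      for pj, mj in zip(p, marginal) if pj > 0)
    return math.exp(kl_sum / n)

# All images classified identically -> minimum score of 1.0.
is_uniform = inception_score([[0.5, 0.5], [0.5, 0.5]])
# Each image confidently assigned a different class -> score of 2.0
# (the maximum for two classes).
is_distinct = inception_score([[1.0, 0.0], [0.0, 1.0]])
```

A higher score means individual images are classified confidently while the set as a whole covers many classes, which is why it is used as a proxy for both fidelity and diversity of generated samples.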
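The segmentation results above (IoU 0.64, F1 0.78, recall 0.75, Dice 0.77) are all overlap metrics computed from pixel counts on binary masks. As a minimal sketch of how they relate, the function below derives each from true/false positives and false negatives; the 4x4 masks are toy inputs, not the paper's data. Note that for binary masks the Dice coefficient is algebraically identical to the pixel-wise F1 score, which is why the two reported values are so close.

```python
def segmentation_metrics(pred, gt):
    """IoU, Dice, precision, recall, and F1 for two binary masks.

    pred, gt: 2D lists of 0/1 with the same shape.
    """
    p = [bool(x) for row in pred for x in row]
    g = [bool(x) for row in gt for x in row]
    tp = sum(1 for a, b in zip(p, g) if a and b)        # predicted & true
    fp = sum(1 for a, b in zip(p, g) if a and not b)    # predicted only
    fn = sum(1 for a, b in zip(p, g) if not a and b)    # missed
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "iou": tp / (tp + fp + fn),            # intersection over union
        "dice": 2 * tp / (2 * tp + fp + fn),   # == pixel-wise F1
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }

# Toy example: predicted polyp mask undershoots the ground truth.
pred = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
gt = [[1, 1, 1, 0],
      [1, 1, 1, 0],
      [0, 0, 0, 0],
      [0, 0, 0, 0]]
m = segmentation_metrics(pred, gt)
```

Here tp=4, fp=0, fn=2, giving IoU 4/6 ≈ 0.67 and Dice 8/10 = 0.80; as expected, Dice and F1 coincide exactly.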
