arXiv

Integrating Chain-of-Thought and Retrieval Augmented Generation Enhances Rare Disease Diagnosis from Clinical Notes

Authors: Wu, Da; Wang, Zhanliang; Nguyen, Quan; Wang, Kai

Author affiliations: Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, United States; Applied Mathematics and Computational Science Graduate Program, University of Pennsylvania, Philadelphia, PA 19104, United States; Bioengineering Graduate Program, University of Pennsylvania, Philadelphia, PA 19104, United States; Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, United States

Publication: arXiv

Year/Volume/Issue: 2025

Subject: Chains

Abstract: Background: Several studies show that large language models (LLMs) struggle with phenotype-driven gene prioritization for rare diseases. These studies typically use Human Phenotype Ontology (HPO) terms to prompt foundation models like GPT and LLaMA to predict candidate genes. However, in real-world settings, foundation models are not optimized for domain-specific tasks like clinical diagnosis, and inputs are unstructured clinical notes rather than standardized terms. How LLMs can be instructed to predict candidate genes or disease diagnoses from unstructured clinical notes remains a major challenge. Methods: We introduce RAG-driven CoT and CoT-driven RAG, two methods that combine Chain-of-Thought (CoT) and Retrieval Augmented Generation (RAG) to analyze clinical notes. A five-question CoT protocol mimics expert reasoning, while RAG retrieves data from sources like HPO and OMIM (Online Mendelian Inheritance in Man). We evaluated these approaches on rare disease datasets, including 5,980 Phenopacket-derived notes, 255 literature-based narratives, and 220 in-house clinical notes from Children's Hospital of Philadelphia. Results: We found that recent foundation models, including Llama 3.3-70B-Instruct and DeepSeek-R1-Distill-Llama-70B, outperformed earlier versions such as Llama 2 and GPT-3.5. We also showed that RAG-driven CoT and CoT-driven RAG both outperform foundation models in candidate gene prioritization from clinical notes; in particular, both methods with a DeepSeek backbone achieved a top-10 gene accuracy of over 40% on Phenopacket-derived clinical notes. RAG-driven CoT works better for high-quality notes, where early retrieval can anchor the subsequent reasoning steps in domain-specific evidence, while CoT-driven RAG has an advantage when processing lengthy and noisy notes. Conclusions: Integrating CoT and RAG enhances LLMs' understanding of clinical notes in the context of rare disease diagnosis, and can facilitate various downstream medical tasks.
© 2025, CC
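
The abstract contrasts two orderings of retrieval and reasoning and reports a top-10 gene accuracy. The Python sketch below is only an illustration of that contrast, not the authors' implementation. The abstract does not spell out the pipelines, so the step order for CoT-driven RAG (reason first, then retrieve) is an assumption inferred from the method name. All function names (`rag_driven_cot`, `cot_driven_rag`, `top_k_accuracy`) and the `toy_*` stand-ins are hypothetical. A real system would call an LLM and query sources such as HPO and OMIM in place of the stubs.

```python
from typing import Callable, List, Sequence

# Hypothetical interfaces (assumptions, not from the paper):
Summarize = Callable[[str], str]            # CoT step: distil a note into key findings
Retrieve = Callable[[str], List[str]]       # RAG step: fetch evidence for a query
Rank = Callable[[str, List[str]], List[str]]  # final step: ranked candidate genes


def rag_driven_cot(note: str, retrieve: Retrieve, rank: Rank) -> List[str]:
    """Retrieve evidence from the raw note first, then reason over note plus evidence."""
    evidence = retrieve(note)
    return rank(note, evidence)


def cot_driven_rag(note: str, summarize: Summarize,
                   retrieve: Retrieve, rank: Rank) -> List[str]:
    """Reason over the note first, then retrieve using the distilled summary (assumed order)."""
    summary = summarize(note)
    evidence = retrieve(summary)
    return rank(summary, evidence)


def top_k_accuracy(ranked_lists: Sequence[Sequence[str]],
                   true_genes: Sequence[str], k: int = 10) -> float:
    """Fraction of cases whose true gene appears among the first k ranked candidates."""
    if not true_genes:
        return 0.0
    hits = sum(1 for ranked, truth in zip(ranked_lists, true_genes)
               if truth in ranked[:k])
    return hits / len(true_genes)


# Toy stand-ins so the sketch runs end to end; the keyword table is illustrative only.
def toy_summarize(note: str) -> str:
    return note.lower()


def toy_retrieve(query: str) -> List[str]:
    kb = {"seizure": "SCN1A", "tall stature": "FBN1"}
    return [gene for term, gene in kb.items() if term in query.lower()]


def toy_rank(text: str, evidence: List[str]) -> List[str]:
    return list(evidence)


if __name__ == "__main__":
    notes = ["Patient with seizure", "Tall stature noted"]
    truths = ["SCN1A", "FBN1"]
    preds = [cot_driven_rag(n, toy_summarize, toy_retrieve, toy_rank) for n in notes]
    print(top_k_accuracy(preds, truths, k=10))
```

Passing the steps in as callables keeps the two orderings comparable: the same retriever and ranker can be reused, and only the position of the reasoning step changes.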
