Foundational Large Language Models for Materials Research

Authors: Mishra, Vaibhav; Singh, Somaditya; Ahlawat, Dhruv; Zaki, Mohd; Bihani, Vaibhav; Grover, Hargun Singh; Mishra, Biswajit; Miret, Santiago; Mausam; Krishnan, N.M. Anoop

Affiliations: Department of Computer Science and Engineering, Department of Civil Engineering, and Yardi School of Artificial Intelligence, Indian Institute of Technology Delhi, India; Cerebras Systems Inc., United States; Intel Labs

Published in: arXiv

Year: 2024


Subject: Crystallography

Abstract: Materials discovery and development are critical for addressing global challenges in renewable energy, sustainability, and advanced technology. Yet, the exponential growth in materials science literature comprising vast amounts of textual data has created significant bottlenecks in knowledge extraction, synthesis, and scientific reasoning. Large Language Models (LLMs) offer unprecedented opportunities to accelerate materials research through automated analysis and prediction. Still, their effective deployment for materials discovery requires domain-specific adaptation for language understanding and solving domain-relevant tasks. Here, we present LLaMat, a family of foundational models for materials science, developed through continued pretraining of LLaMA models on an extensive corpus of materials literature and crystallographic data, followed by instruction- and task-finetuning. Through systematic evaluation, we demonstrate that LLaMat excels in materials-specific natural language processing and structured information extraction tasks, outperforming commercial LLMs while maintaining general linguistic capabilities. The specialized LLaMat-CIF variant demonstrates remarkable capabilities in crystal structure generation, predicting stable crystals with high coverage across the periodic table. Intriguingly, despite LLaMA-3's superior performance in comparison to LLaMA-2, we observe that LLaMat-2 demonstrates unexpectedly enhanced domain-specific performance across diverse materials science tasks, including structured information extraction from text and tables and crystal structure generation. These results point to a potential adaptation rigidity in overtrained LLMs such as LLaMA-3. Altogether, the present work demonstrates the effectiveness of domain adaptation towards the development of practically deployable LLM copilots for materials research. Beyond materials science, our findings reveal important considerations for domain adaptation of LLMs: model selection, tr
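The core technique the abstract describes, continued pretraining via next-token prediction on domain text, can be sketched as a toy training loop. Everything here is an illustrative stand-in: a tiny embedding-plus-linear model and random token IDs in place of the tokenized materials corpus, not the actual LLaMat/LLaMA architecture or data.

```python
# Toy sketch of the next-token prediction objective used in continued
# pretraining. Model, vocabulary size, and data are illustrative
# stand-ins, not the LLaMat/LLaMA configuration.
import torch
import torch.nn as nn

torch.manual_seed(0)

VOCAB, DIM = 256, 64
# A minimal "language model": token embedding followed by a linear head
# that scores every vocabulary item as the next token.
model = nn.Sequential(nn.Embedding(VOCAB, DIM), nn.Linear(DIM, VOCAB))
opt = torch.optim.AdamW(model.parameters(), lr=1e-2)

# Stand-in for a tokenized domain corpus: 8 sequences of 32 token IDs.
tokens = torch.randint(0, VOCAB, (8, 32))

losses = []
for _ in range(20):
    logits = model(tokens[:, :-1])  # predict token t+1 from token t
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, VOCAB), tokens[:, 1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
    losses.append(loss.item())
```

In continued pretraining, this same objective is simply resumed from an already-pretrained checkpoint on the new domain corpus; instruction- and task-finetuning then follow with supervised input/output pairs.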
