Author affiliations: Univ Oklahoma, Sch Comp Sci, 110 W Boyd St, Norman, OK 73019, USA; MapLarge Enterprise Architecture, 1201 Peachtree St NE, Bldg 400, Suite 1750, Atlanta, GA, USA; Florida State Univ, Sch Commun, 4100 Univ Ctr, Bldg C, Tallahassee, FL, USA
Publication: MACHINE LEARNING WITH APPLICATIONS (Mach. Learn. Appl.)
Year / Volume / Issue: 2025, Vol. 20
Funding: U.S. National Science Foundation; U.S. Air Force Small Business Innovation Research program
Subjects: UML; Large language models; Machine learning; Code generation
Abstract: The Unified Modeling Language (UML) is a standardized visual language widely used for modeling and documenting the design of software systems. Although many tools can generate UML diagrams from UML code, generating executable UML code from image-based UML diagrams remains challenging. This paper proposes a new approach that automatically generates UML code using a large multimodal language model (MM-LLM). Synthetic datasets of UML activity and sequence diagrams were created to train and test the model. We compared standard fine-tuning with LoRA techniques for optimizing the base models. The experiments measured code generation accuracy across different model sizes and training strategies. The results demonstrate that domain-adapted MM-LLMs are effective for automating UML code generation: the best model achieved BLEU and SSIM scores of 0.779 and 0.942, respectively, on sequence diagrams. This capability can support the modernization of legacy systems and reduce the manual effort required in software development workflows.
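As an illustration of the text-similarity metric named in the abstract, the following is a minimal sketch of a simplified sentence-level BLEU score between a reference and a generated piece of UML code. It is not the paper's evaluation code: the function names (`ngrams`, `simple_bleu`), the whitespace tokenization, the absence of smoothing, and the PlantUML-style sample strings are all assumptions made for this example, since the record does not state which UML syntax or BLEU variant was used.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(reference, candidate, max_n=4):
    """Simplified BLEU: geometric mean of clipped n-gram precisions
    (n = 1..max_n) times a brevity penalty. No smoothing is applied,
    so any n-gram order with zero matches yields a score of 0.0."""
    ref = reference.split()   # assumption: whitespace tokenization
    cand = candidate.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        if total == 0:
            return 0.0
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty discourages candidates shorter than the reference.
    if len(cand) >= len(ref):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


# Hypothetical PlantUML-style sequence diagram text (illustrative only).
reference = "Alice -> Bob : login\nBob --> Alice : ok"
generated = "Alice -> Bob : login\nBob --> Alice : fail"

score_same = simple_bleu(reference, reference)
score_close = simple_bleu(reference, generated)
score_unrelated = simple_bleu(reference, "start stop fork join")
```

In this sketch an exact match scores 1.0, a near match scores strictly between 0 and 1, and text with no shared tokens scores 0.0. SSIM, the second metric in the abstract, compares images rather than text and is not covered here.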