Extending LLMs to New Languages: A Case Study of Llama and Persian Adaptation

Authors: Sani, Samin Mahdizadeh; Sadeghi, Pouya; Vu, Thuy-Trang; Yaghoobzadeh, Yadollah; Haffari, Gholamreza

Affiliations: Department of Electrical and Computer Engineering, University of Tehran, Iran; Tehran Institute for Advanced Studies, Khatam University, Iran; Department of Data Science & AI, Monash University, Australia

Published in: arXiv

Year: 2024

Subject: Translation (languages)

Abstract: Large language models (LLMs) have made great progress in classification and text generation tasks. However, they are mainly trained on English data and often struggle with low-resource languages. In this study, we explore adding a new language, i.e., Persian, to Llama (a model with a limited understanding of Persian) using parameter-efficient fine-tuning. We employ a multi-stage approach involving pretraining on monolingual Persian data, aligning representations through bilingual pretraining and instruction datasets, and instruction-tuning with task-specific datasets. We evaluate the model's performance at each stage on generation and classification tasks. Our findings suggest that incorporating the Persian language, through bilingual data alignment, can enhance classification accuracy for Persian tasks, with no adverse impact and sometimes even improvements on English tasks. Additionally, the results highlight the model's initial strength as a critical factor when working with limited training data, with cross-lingual alignment offering minimal benefits for the low-resource language. Knowledge transfer from English to Persian has a marginal effect, primarily benefiting simple classification tasks. Copyright © 2024, The Authors. All rights reserved.
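The abstract describes a multi-stage parameter-efficient fine-tuning recipe (monolingual Persian pretraining, bilingual representation alignment, then task-specific instruction tuning). Below is a minimal sketch of what such a setup could look like with the Hugging Face `peft` library; the base checkpoint name, LoRA rank, and target modules are illustrative assumptions, not values reported in the paper.

```python
# Sketch of a LoRA-based parameter-efficient fine-tuning setup for adapting
# Llama to a new language. Hyperparameters and the checkpoint are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.bfloat16)

# LoRA trains small low-rank adapter matrices instead of all base weights,
# which is what makes multi-stage adaptation affordable for a 7B model.
lora_config = LoraConfig(
    r=16,                      # adapter rank (placeholder)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total parameters
```

Each stage of the pipeline the abstract outlines would plausibly reuse an adapter configuration like this while swapping in the corresponding data: monolingual Persian text first, then bilingual pretraining and instruction pairs for alignment, and finally task-specific instruction datasets.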
