arXiv

Bitext Mining for Low-Resource Languages via Contrastive Learning

Authors: Tan, Weiting; Koehn, Philipp

Affiliation: Center for Language and Speech Processing, Computer Science Department, Johns Hopkins University, United States

Publication: arXiv

Year: 2022


Abstract: Mining high-quality bitexts for low-resource languages is challenging. This paper shows that sentence representations from language models fine-tuned with multiple negatives ranking loss, a contrastive objective, help retrieve clean bitexts. Experiments show that parallel data mined with this approach substantially outperforms the previous state-of-the-art method on the low-resource languages Khmer and Pashto. © 2022, CC BY.
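
The abstract names multiple negatives ranking loss without describing it. In its common formulation, each source sentence in a batch is scored against every target sentence in the batch, its true translation is the positive, and the other targets serve as negatives. A cross-entropy loss is then taken over the scaled cosine similarities. The following is a minimal NumPy sketch of that general formulation, not the authors' implementation. The function names, the `scale` value of 20.0, and the toy embeddings are assumptions made for illustration.

```python
import numpy as np


def _l2_normalize(x):
    # Scale each row to unit length so that dot products are cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def multiple_negatives_ranking_loss(src, tgt, scale=20.0):
    """Mean cross-entropy over in-batch similarities.

    src, tgt: arrays of shape (batch, dim); row i of tgt is assumed to be
    the translation of row i of src. `scale` is an illustrative value.
    """
    scores = scale * _l2_normalize(src) @ _l2_normalize(tgt).T
    # Numerically stable log-softmax over each row.
    scores = scores - scores.max(axis=1, keepdims=True)
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    # The positive for row i sits on the diagonal.
    return float(-np.mean(np.diag(log_probs)))


def retrieve(src, tgt):
    # For each source row, return the index of the most similar target row.
    sims = _l2_normalize(src) @ _l2_normalize(tgt).T
    return sims.argmax(axis=1).tolist()


# Toy embeddings: three perfectly aligned pairs (hypothetical data).
src_emb = np.eye(3)
tgt_aligned = np.eye(3)
tgt_shuffled = np.roll(np.eye(3), 1, axis=0)

loss_aligned = multiple_negatives_ranking_loss(src_emb, tgt_aligned)
loss_shuffled = multiple_negatives_ranking_loss(src_emb, tgt_shuffled)
matches = retrieve(src_emb, tgt_aligned)
```

In this toy setup the aligned pairs yield a much lower loss than the shuffled ones, which is the signal that fine-tuning exploits: minimizing the loss pulls translations together in embedding space, so nearest-neighbor retrieval can then surface candidate bitext pairs.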
