Datastore Distillation for Nearest Neighbor Machine Translation

Authors: Dai, Yuhan; Zhang, Zhirui; Du, Yichao; Liu, Shengcai; Liu, Lemao; Xu, Tong

Author affiliations: Univ Sci & Technol China, Anhui Prov Key Lab Big Data Anal & Applicat, Hefei 230026, Anhui, Peoples R China; Tencent AI Lab, Shenzhen 518071, Guangdong, Peoples R China; Agcy Sci Technol & Res, Ctr Frontier AI Res, Singapore 138632, Singapore

Published in: IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (IEEE ACM Trans. Audio Speech Lang. Process.)

Year/Volume: 2024, Vol. 32

Pages: 807-817

Subject classification: 0808 [Engineering - Electrical Engineering]; 08 [Engineering]; 0702 [Science - Physics]

Funding: National Natural Science Foundation of China

Keywords: nearest neighbor machine translation; datastore distillation

Abstract: Nearest neighbor machine translation (i.e., kNN-MT) is a promising approach to enhance translation quality by equipping pre-trained neural machine translation (NMT) models with nearest neighbor retrieval. Despite its great success, kNN-MT typically requires ample space to store its token-level datastore, making it less practical on edge devices or in online scenarios. In this paper, inspired by the concept of knowledge distillation, we provide a new perspective on easing the storage overhead through datastore distillation, which we formalize as a constrained optimization problem. We further design a novel model-agnostic iterative nearest neighbor merging method for the datastore distillation problem to obtain an effective and efficient solution. Experiments on three benchmark datasets indicate that our approach not only reduces the volume of the datastore by up to 50% without significant performance degradation, but also outperforms other baselines by a large margin at the same compression rate. Another experiment conducted on WikiText-103 further demonstrates the effectiveness of our method on the language modeling task.
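For readers unfamiliar with the retrieval step the abstract builds on, the following is a minimal illustrative sketch of standard kNN-MT decoding: a token-level datastore of (hidden state, target token) pairs is queried with the current decoder state, and the retrieved neighbors form a distribution that is interpolated with the base NMT model's prediction. This is not the paper's distillation or merging algorithm; all names, dimensions, and the temperature/interpolation parameters are illustrative assumptions.

```python
# Hypothetical sketch of the kNN-MT retrieval step (not the paper's method).
import numpy as np

def knn_distribution(query, keys, values, vocab_size, k=8, temperature=10.0):
    """Retrieve k nearest datastore entries and form a token distribution."""
    # Squared L2 distance from the query to every stored key.
    dists = np.sum((keys - query) ** 2, axis=1)
    nn_idx = np.argpartition(dists, k)[:k]          # indices of the k nearest keys
    weights = np.exp(-dists[nn_idx] / temperature)  # closer neighbors weigh more
    weights /= weights.sum()

    p_knn = np.zeros(vocab_size)
    for w, v in zip(weights, values[nn_idx]):
        p_knn[v] += w                               # aggregate weight per target token
    return p_knn

def interpolate(p_nmt, p_knn, lam=0.5):
    """Standard kNN-MT interpolation of model and retrieval distributions."""
    return (1.0 - lam) * p_nmt + lam * p_knn

# Toy usage with random placeholder data.
rng = np.random.default_rng(0)
vocab_size, dim, n_entries = 100, 16, 1000
keys = rng.normal(size=(n_entries, dim)).astype(np.float32)   # stored hidden states
values = rng.integers(0, vocab_size, size=n_entries)          # stored target tokens
query = rng.normal(size=dim).astype(np.float32)               # current decoder state
p_nmt = np.full(vocab_size, 1.0 / vocab_size)                 # placeholder NMT probabilities

p_final = interpolate(p_nmt, knn_distribution(query, keys, values, vocab_size))
print(p_final.argmax())
```

The storage cost the paper targets comes from `keys` and `values` growing with every target token in the reference corpus; datastore distillation, as described in the abstract, reduces the number of stored entries while preserving retrieval quality.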
