Author Affiliations: Univ Sci & Technol China, Anhui Prov Key Lab Big Data Anal & Applicat, Hefei 230026, Anhui, Peoples R China; Tencent AI Lab, Shenzhen 518071, Guangdong, Peoples R China; Agcy Sci Technol & Res, Ctr Frontier AI Res, Singapore 138632, Singapore
Publication: IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (IEEE ACM Trans. Audio Speech Lang. Process.)
Year/Volume: 2024, Vol. 32
Pages: 807-817
Core Indexing:
Subject Classification: 0808 [Engineering - Electrical Engineering]; 08 [Engineering]; 0702 [Science - Physics]
Funding: National Natural Science Foundation of China
Keywords: Nearest neighbor machine translation; datastore distillation
Abstract: Nearest neighbor machine translation (i.e., kNN-MT) is a promising approach that enhances translation quality by equipping pre-trained neural machine translation (NMT) models with nearest neighbor retrieval. Despite its great success, kNN-MT typically requires substantial storage for its token-level datastore, which makes it less practical on edge devices or in online scenarios. In this paper, inspired by the concept of knowledge distillation, we provide a new perspective on easing the storage overhead through datastore distillation, which is formalized as a constrained optimization problem. We further design a novel model-agnostic iterative nearest neighbor merging method for the datastore distillation problem to obtain an effective and efficient solution. Experiments on three benchmark datasets indicate that our approach not only reduces the volume of the datastore by up to 50% without significant performance degradation, but also outperforms other baselines by a large margin at the same compression rate. A further experiment on WikiText-103 demonstrates the effectiveness of our method on the language modeling task.
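The core idea in the abstract, shrinking a token-level datastore of (key vector, target token) entries by repeatedly merging nearest-neighbor entries, can be illustrated with a small sketch. The code below is a hypothetical NumPy illustration under my own assumptions: the function name "merge_nearest_neighbors", the choice to average paired keys, and the restriction to merging entries that share the same target token are all assumptions, not the constrained-optimization procedure the paper formalizes. A real kNN-MT datastore would also use an approximate-nearest-neighbor index (e.g., faiss) instead of a dense distance matrix.

import numpy as np

def merge_nearest_neighbors(keys, values, rounds=1):
    # Illustrative sketch (not the paper's method): in each round, pair
    # every entry with its closest remaining neighbor that stores the same
    # target token, and replace the pair by a single averaged key.
    # Each round roughly halves the datastore.
    for _ in range(rounds):
        n = keys.shape[0]
        merged = np.zeros(n, dtype=bool)
        new_keys, new_values = [], []
        # Dense pairwise squared Euclidean distances; fine for a toy
        # datastore, but real datastores need an ANN index.
        d2 = ((keys[:, None, :] - keys[None, :, :]) ** 2).sum(-1)
        np.fill_diagonal(d2, np.inf)
        order = np.argsort(d2, axis=1)
        for i in range(n):
            if merged[i]:
                continue
            partner = -1
            for j in order[i]:
                if j != i and not merged[j] and values[j] == values[i]:
                    partner = j
                    break
            if partner >= 0:
                # Merge the pair into one entry with an averaged key.
                new_keys.append((keys[i] + keys[partner]) / 2.0)
                new_values.append(values[i])
                merged[partner] = True
            else:
                # No unmerged neighbor with the same token: keep as-is.
                new_keys.append(keys[i])
                new_values.append(values[i])
            merged[i] = True
        keys = np.stack(new_keys)
        values = np.array(new_values)
    return keys, values

# Toy usage: 6 key vectors, 3 distinct target tokens -> 3 merged entries.
rng = np.random.default_rng(0)
keys = rng.normal(size=(6, 4)).astype(np.float32)
values = np.array([1, 1, 2, 2, 3, 3])
small_keys, small_values = merge_nearest_neighbors(keys, values)
print(small_keys.shape, small_values)

In this toy run the datastore shrinks from 6 entries to 3 while every target token remains represented, which is the intuition behind trading datastore volume for a modest change in retrieval quality.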