版权所有:内蒙古大学图书馆 技术提供:维普资讯• 智图
内蒙古自治区呼和浩特市赛罕区大学西街235号 邮编: 010021
丛 书 名:Synthesis Lectures on Human Language Technologies
I S B N:(纸本) 9781681730639
出 版 社:Morgan & Claypool Publishers
出 版 年:2019年
主 题 词:information science \u0026 technology\/computer science\/\/information science \u0026 technology\/computer science\/\/language\/linguistics
学科分类:08[工学] 081203[工学-计算机应用技术] 0835[工学-软件工程] 0812[工学-计算机科学与技术(可授工学、理学学位)]
摘 要:The majority of natural language processing (NLP) is English language processing, and while there is good language technology support for (standard varieties of) English, support for Albanian, Burmese, or Cebuano nd most other languages remains limited. Being able to bridge this digital divide is important for scientific and democratic reasons but also represents an enormous growth potential. A key challenge for this to happen is learning to align basic meaning-bearing units of different languages. In this book, the authors survey and discuss recent and historical work on supervised and unsupervised learning of such alignments. Specifically, the book focuses on so-called cross-lingual word embeddings. The survey is intended to be systematic, using consistent notation and putting the available methods on comparable form, making it easy to compare wildly different approaches. In so doing, the authors establish previously unreported relations between these methods and are able to present a fast-growing literature in a very compact way. Furthermore, the authors discuss how best to evaluate cross-lingual word embedding methods and survey the resources available for students and researchers interested in this topic.