ISBN (print): 9783319994956; 9783319994949
Human agents in technical customer support provide users with instructional answers to solve a task. Developing a technical support question answering (QA) system is challenging due to the broad variety of user intents. Moreover, user questions are noisy (for example, they contain spelling mistakes), redundant, and expressed in many different ways in natural language, which makes it hard for a QA system to match user queries to the corresponding standard QA pair. In this work, we combine question intent category classification with a semantic matching model to filter and select correct answers from a back-end knowledge base. Using a real-world user chat-log dataset with 60 intent categories, we observe that supervised models perform well on the individual classification tasks. For semantic matching, we add multi-info (answer and product information) to the standard question and emphasize the context information of the user query (captured by a GRU) in our model. Experimental results indicate that neural multi-perspective sentence similarity networks outperform baseline models. The precision of the semantic matching model is 85%.
ISBN (print): 9783319504964; 9783319504957
In this paper, we give an overview of the shared task at the 5th CCF Conference on Natural Language Processing & Chinese Computing (NLPCC 2016): Chinese word segmentation for micro-blog texts. Unlike the widely used newswire datasets, the dataset of this shared task consists of relatively informal micro-texts. In addition, we use a new psychometric-inspired evaluation metric for Chinese word segmentation, which aims to balance the very skewed word distribution at different levels of difficulty. The data and evaluation code can be downloaded from https://***/FudanNLP/nlpcc-WordSeg-Weibo.
ISBN (print): 9783319504964; 9783319504957
Word embeddings play a significant role in many modern NLP systems. Since learning one representation per word is problematic for polysemous words and homonymous words, researchers propose to use one embedding per word sense. Their approaches mainly train word sense embeddings on a corpus. In this paper, we propose to use word sense definitions to learn one embedding per word sense. Experimental results on word similarity tasks and a word sense disambiguation task show that word sense embeddings produced by our approach are of high quality.
ISBN (print): 9783319252070; 9783319252063
In this paper, we give an overview of the shared task at the 4th CCF Conference on Natural Language Processing & Chinese Computing (NLPCC 2015): Chinese word segmentation and part-of-speech (POS) tagging for micro-blog texts. Unlike the widely used newswire datasets, the dataset of this shared task consists of relatively informal micro-texts. The shared task has two sub-tasks: (1) individual Chinese word segmentation and (2) joint Chinese word segmentation and POS tagging. Each sub-task has three tracks to distinguish systems that use different resources. We first introduce the dataset and task, then characterize the different approaches of the participating systems, report the test results, and provide an overview analysis of these results. An online system is available for open registration and evaluation at http://***/nlpcc2015.
ISBN (print): 9783319252070; 9783319252063
To tackle the sparse data problem of the bag-of-words model for document representation, the Context Vector Model (CVM) has been proposed to enrich a document with the relatedness of all the words in a corpus to the document. CVM is in essence a combination of word vectors, so the method used to represent words is essential to it. This paper presents a computational study comparing the effects of newly proposed word representation methods when embedded in CVM. The experimental results demonstrate that some of these methods significantly improve the performance of CVM, because they estimate the relatedness between words better.
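The idea of building a document representation by combining word vectors can be illustrated with a minimal sketch. It is not the formulation from the paper: the document-level co-occurrence counts used as word vectors, the term-frequency weighting, and the function names `cooccurrence_vectors`, `context_vector`, and `dot` are all assumptions chosen for illustration.

```python
from collections import Counter


def cooccurrence_vectors(docs):
    """Toy word representation (an assumption, not the paper's method):
    word -> {other word: number of documents that contain both}."""
    vocab = sorted({w for d in docs for w in d})
    vectors = {w: {v: 0.0 for v in vocab} for w in vocab}
    for d in docs:
        present = set(d)
        for w in present:
            for v in present:
                vectors[w][v] += 1.0
    return vocab, vectors


def context_vector(doc, vocab, vectors):
    """Combine the word vectors of a document, weighted by term frequency.
    Each entry scores how related one vocabulary word is to the document."""
    tf = Counter(doc)
    return [sum(tf[w] * vectors[w][v] for w in tf if w in vectors) for v in vocab]


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


# Tiny hypothetical corpus of tokenized documents.
docs = [["cat", "pet"], ["dog", "pet"], ["stock", "market"]]
vocab, vectors = cooccurrence_vectors(docs)

cat_cv = context_vector(["cat"], vocab, vectors)
dog_cv = context_vector(["dog"], vocab, vectors)
```

In this toy corpus the one-word documents `["cat"]` and `["dog"]` share no terms, so their bag-of-words vectors are orthogonal, while their context vectors overlap on the `pet` dimension. Swapping in a different word representation changes only `vectors`, which is the component the study compares.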
ISBN (digital): 9783662459249
ISBN (print): 9783662459232; 9783662459249
This book constitutes the refereed proceedings of the Third CCF Conference, NLPCC 2014, held in Shenzhen, China, in December 2014. The 35 revised full papers presented together with 8 short papers were carefully reviewed and selected from 110 English submissions. The papers are organized in topical sections on fundamentals on language computing; applications on language computing; machine translation and multi-lingual information access; machine learning for NLP; NLP for social media; NLP for search technology and ads; question answering and user interaction; web mining and information extraction.
Chinese comma disambiguation plays a key role in many natural language processing (NLP) tasks. This paper proposes a joint approach combining K-best parse trees to Chinese comma disambiguation to reduce the dependent on...
Microblogs provide a convenient and instant platform for information publication and acquisition. The short, noisy, real-time nature of microblogs makes the Chinese microblog entity linking task a new challenge. In this p...
Negation and speculation are common in natural language text. Many applications, such as biomedical text mining and clinical information extraction, seek to distinguish positive/factual objects from negative/speculati...