Negation and speculation are common in natural language text. Many applications, such as biomedical text mining and clinical information extraction, seek to distinguish positive/factual objects from negative/speculati...
Post-editing has been successfully applied to correct the output of MT systems to generate better translation, but as a downstream task its positive feedback to MT has not been well studied. In this paper, we present ...
Question clustering plays an important role in QA systems. Because of data sparseness and the lexical gap in questions, there is insufficient information to guarantee good clustering results. Besides, previous works pay litt...
ISBN: (print) 9783319252070; 9783319252063
To tackle the sparse data problem of the bag-of-words model for document representation, the Context Vector Model (CVM) has been proposed to enrich a document with the relatedness of all the words in a corpus to the document. Because CVM is in essence a combination of word vectors, the method used to represent words is essential to it. This paper presents a computational study comparing the effects of newly proposed word representation methods embedded in CVM. The experimental results demonstrate that some of these methods significantly improve the performance of CVM because they estimate the relatedness between words better.
Opinion summarization on conversations aims to generate a sentimental summary for a dialogue and is shown to be much more challenging than traditional topic-based summarization and general opinion summarization, due t...
ISBN: (print) 9783319995014; 9783319995007
A corpus is an essential resource for data-driven natural language processing systems, especially for sentiment analysis. In recent years, people have increasingly used emoticons on social media to express their emotions, attitudes, or preferences. We believe that emoticons are a non-negligible feature in sentiment analysis tasks. However, few existing works focus on sentiment analysis with emoticons, and few related corpora contain emoticons. In this paper, we create a large-scale Chinese Emoticon Sentiment Corpus of Movies (CESCM). Unlike other corpora, this corpus contains a wide variety of emoticons. In addition, we carried out some baseline sentiment analysis work on CESCM. Experimental results show that emoticons do play an important role in sentiment analysis. Our goal is to make the corpus widely available, and we believe that it will offer great support to sentiment analysis research and emoticon research.
ISBN: (print) 9783319736181; 9783319736174
Recently, recurrent neural networks (RNNs) have been increasingly used for Chinese word segmentation to model contextual information without the limit of a context window. In practice, two kinds of gated RNNs, long short-term memory (LSTM) and gated recurrent unit (GRU), are often used to alleviate the long dependency problem. In this paper, we propose hyper-gated recurrent neural networks for Chinese word segmentation, which enhance the gates to incorporate the historical information of gates. Experiments on the benchmark datasets show that our model outperforms the baseline models as well as the state-of-the-art methods.
In this paper, we study sentiment classification of Chinese contrast sentences, a commonly used language construct in text. In a typical review, there are at least around 6% of such s...
ISBN: (print) 9783319736181; 9783319736174
Numerous machine learning tasks have achieved substantial advances with the help of large-scale supervised learning corpora over the past decade. However, there is no large-scale question-answer corpus available for Chinese question answering over knowledge bases. In this paper, we present a 28M Chinese Q&A corpus based on the Chinese knowledge base provided by the NLPCC 2017 KBQA challenge. We propose a novel neural network architecture that combines a template-based method and seq2seq learning to generate highly fluent and diverse questions. Both automatic and human evaluation results show that our model achieves outstanding performance (76.8 BLEU and 43.1 ROUGE). We also propose a new statistical metric called DIVERSE to measure the linguistic diversity of generated questions and show that our model can generate much more diverse questions than other baselines.
ISBN: (print) 9783031171208; 9783031171192
Emotion Recognition in Conversations (ERC) is the task of identifying the emotions of utterances from speakers in a conversation, which benefits a number of applications, including opinion mining over conversations and the development of empathetic dialogue systems. Many approaches have been proposed to handle this problem in recent years. However, most existing approaches focus on either RNN-based models, which simulate how temporal information changes over the conversation, or graph-based models, which take the relationships between the speakers' utterances into account. In this paper, we propose a temporal and relational graph attention network, named DialogueTRGAT, to combine the strengths of RNN-based models and graph-based models. DialogueTRGAT can better model the intrinsic structure and information flow within a conversation for better emotion recognition. We conduct experiments on two benchmark datasets (IEMOCAP and MELD), and the experimental results demonstrate the effectiveness of our approach compared with several competitive baselines.