In this work, we propose two extensions of standard word lexicons in statistical machine translation: A discriminative word lexicon that uses sentence-level source information to predict the target words and a trigger...
详细信息
This paper presents and evaluates several original techniques for the latent classification of biographic attributes such as gender, age and native language, in diverse genres (conversation transcripts, email) and lan...
详细信息
This paper presents six novel approaches to biographic fact extraction that model structural, transitive and latent properties of biographical data. The ensemble of these proposed models substantially outperforms stan...
详细信息
We present results on a novel hybrid semantic SMT model that incorporates the strengths of both semantic role labeling and phrase-based statistical machine translation. The approach avoids major complexity limitations...
详细信息
We present a series of empirical studies aimed at illuminating more precisely the likely contribution of semantic roles in improving statistical machine translation accuracy. The experiments reported study several asp...
详细信息
In this paper, we deal with the problem of a large number of unaligned words in automatically learned word alignments for machine translation (MT). These unaligned words are the reason for ambiguous phrase pairs extra...
详细信息
The number of research publications in various disciplines is growing exponentially. Researchers and scientists are increasingly finding themselves in the position of having to quickly understand large amounts of tech...
详细信息
We present a characterization of a useful class of skills based on a graphical representation of an agent's interaction with its environment. Our characterization uses betweenness, a measure of centrality on graph...
详细信息
ISBN:
(纸本)9781605609492
We present a characterization of a useful class of skills based on a graphical representation of an agent's interaction with its environment. Our characterization uses betweenness, a measure of centrality on graphs. It captures and generalizes (at least intuitively) the bottleneck concept, which has inspired many of the existing skill-discovery algorithms. Our characterization may be used directly to form a set of skills suitable for a given task. More importantly, it serves as a useful guide for developing incremental skill-discovery algorithms that do not rely on knowing or representing the interaction graph in its entirety.
The recently introduced online confidence-weighted (CW) learning algorithm for binary classification performs well on many binary NLP tasks. However, for multi-class problems CW learning updates and inference cannot b...
详细信息
Sentiment analysis often relies on a semantic orientation lexicon of positive and negative words. A number of approaches have been proposed for creating such lexicons, but they tend to be computationally expensive, an...
详细信息
暂无评论