We report an empirical study on the role of syntactic features in building a semisupervised named entity (NE) tagger. Our study addresses two questions: What types of syntactic features are suitable for extracting pot...
This paper presents a new approach to phrase-level sentiment analysis that first determines whether an expression is neutral or polar and then disambiguates the polarity of the polar expressions. With this approach, t...
We discuss the problems of developing a theory of context that applies to the phenomena of naturally occurring discourse and that might be useful in assisting with the problems that arise in generating and interpreting d...
Diffusion maps are among the most powerful machine learning tools for analyzing and working with complex high-dimensional datasets. Unfortunately, the estimation of these maps from a finite sample is known to suffer from the curse of dimensionality. Motivated by other machine learning models in which structure in the underlying data distribution reduces the complexity of estimation, we show how factorizing the underlying distribution into independent subspaces can help us estimate diffusion maps more accurately. Building on this result, we develop an algorithm that automatically factorizes a high-dimensional data space so as to minimize the estimation error of its diffusion map, even when the underlying distribution is not decomposable. Experiments on both synthetic and real-world datasets demonstrate improved estimation performance of our method over the standard diffusion-map framework.
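For orientation, the standard diffusion-map framework that this abstract uses as its baseline can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the factorization algorithm the paper proposes: the function name `diffusion_map`, the Gaussian kernel, and the parameters `epsilon`, `n_components`, and `t` are all illustrative choices that do not come from the abstract.

```python
import numpy as np

def diffusion_map(X, epsilon=1.0, n_components=2, t=1):
    """Textbook diffusion-map embedding (illustrative sketch only).

    X: (n_samples, n_features) array.
    epsilon: Gaussian kernel bandwidth (an assumed, hand-picked value).
    Returns (eigenvalues, embedding) for the leading non-trivial components.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances, clipped to avoid tiny negatives.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    # Gaussian affinity kernel.
    K = np.exp(-d2 / epsilon)
    # Row sums (degrees) of the kernel matrix.
    d = K.sum(axis=1)
    # Symmetric matrix S = D^{-1/2} K D^{-1/2}; it shares its eigenvalues
    # with the Markov matrix P = D^{-1} K but allows a symmetric solver.
    S = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    # eigh returns ascending order; sort descending.
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    # Convert to right eigenvectors of P: psi = D^{-1/2} v.
    psi = vecs / np.sqrt(d)[:, None]
    # Drop the trivial component (eigenvalue 1, constant eigenvector)
    # and scale the rest by eigenvalue^t to obtain diffusion coordinates.
    lams = vals[1:n_components + 1]
    return lams, psi[:, 1:n_components + 1] * lams ** t
```

Every entry of the kernel matrix here is estimated from the finite sample in the full ambient space, which is where the estimation difficulty described in the abstract arises as the dimension grows.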
This paper describes extensions to a corpus annotation scheme for the manual annotation of attributions, as well as opinions, emotions, sentiments, speculations, evaluations and other private states in language. It di...
Until recently, techniques for AI plan generation relied on highly restrictive assumptions that were almost always violated in real-world environments; consequently, robot designers adopted reactive architectures and a...
This paper applies the categories from an opinion annotation scheme developed for monologue text to the genre of multiparty meetings. We describe modifications to the coding guidelines that were required to extend the...
This paper studies the impact that difficult-to-translate source-language phrases might have on the machine translation process. We formulate the notion of difficulty as a measurable quantity; we show that a classifier...
This paper investigates the idea of adapting language models for phrases that have poor translation quality. We apply a selective adaptation criterion which uses a classifier to locate the most difficult phrase of eac...