In recent work, Lyu and Simoncelli [1] introduced radial Gaussianization (RG) as a very efficient procedure for transforming n-dimensional random vectors into Gaussian vectors with independent and identically distributed (i.i.d.) components. This entails transforming the norms of the data so that they become chi-distributed with n degrees of freedom. A necessary requirement is that the original data are generated by an isotropic distribution, that is, their probability density function (pdf) is constant over surfaces of n-dimensional spheres (or, more generally, n-dimensional ellipsoids). The case of biases in the data, which is of great practical interest, is studied here; as we demonstrate with experiments, there are situations in which even very small amounts of bias can cause RG to fail. This becomes evident especially when the data form clusters in low-dimensional manifolds. To address this shortcoming, we propose a two-step approach which entails (i) first discovering clusters in the data and removing the bias from each, and (ii) performing RG on the bias-compensated data. In experiments with synthetic data, the proposed bias compensation procedure results in significantly better Gaussianization than the non-compensated RG method.
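For concreteness, here is a minimal sketch of the RG transform itself, assuming a zero-mean, isotropic data matrix X of shape (samples, n) and using an empirical CDF of the norms; the function name and rank-based estimator are illustrative, not taken from [1]:

    import numpy as np
    from scipy.stats import chi

    def radial_gaussianize(X):
        n = X.shape[1]
        r = np.linalg.norm(X, axis=1)          # radial component of each sample
        ranks = np.argsort(np.argsort(r))      # empirical CDF of the norms
        F_r = (ranks + 0.5) / len(r)
        r_new = chi.ppf(F_r, df=n)             # make norms chi-distributed, n dof
        return X * (r_new / r)[:, None]        # rescale each sample radially

A nonzero mean (bias) breaks the isotropy assumption this transform relies on, which is exactly the failure mode that the cluster-wise bias compensation described above is designed to remove before RG is applied.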
Word Sense Disambiguation (WSD) is one of the fundamental natural language processing tasks. However, a lack of training corpora is a bottleneck to constructing a highly accurate all-words WSD system. Annotating a large-scal...
This work surveys existing evaluation methodologies for the task of sentence compression, identifies their shortcomings, and proposes alternatives. In particular, we examine the problems of evaluating paraphrastic com...
We describe a new approach for rescoring speech lattices - with long-span language models or wide-context acoustic models - that does not entail computationally intensive lattice expansion or limited rescoring of only an N-best list. We view the set of word-sequences in a lattice as a discrete space equipped with the edit-distance metric, and develop a hill climbing technique to start with, say, the 1-best hypothesis under the lattice-generating model(s) and iteratively search a local neighborhood for the highest-scoring hypothesis under the rescoring model(s); such neighborhoods are efficiently constructed via finite state techniques. We demonstrate empirically that to achieve the same reduction in error rate using a better estimated, higher order language model, our technique evaluates two orders of magnitude fewer utterance-length hypotheses than conventional N-best rescoring. For the same number of hypotheses evaluated, our technique results in a significantly lower error rate.
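The core search can be sketched as follows, assuming two illustrative helpers: neighborhood(hyp), which enumerates the word sequences at small edit distance from hyp that remain paths in the lattice (the paper builds these neighborhoods with finite-state techniques), and rescore(hyp), which scores a hypothesis under the long-span model:

    def hill_climb(one_best, neighborhood, rescore):
        current, current_score = one_best, rescore(one_best)
        while True:
            best, best_score = current, current_score
            for hyp in neighborhood(current):      # local edit-distance ball
                s = rescore(hyp)
                if s > best_score:
                    best, best_score = hyp, s
            if best == current:                    # local optimum reached
                return current
            current, current_score = best, best_score

Each iteration evaluates only the utterance-length hypotheses in the current neighborhood, which is what keeps the total number of rescoring calls far below that of N-best rescoring.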
We propose the use of a nonparametric Bayesian model, the Hierarchical Dirichlet Process (HDP), for the task of Word Sense Induction. Results are shown through comparison against Latent Dirichlet Allocation (LDA), a p...
We present a substitution-only approach to sentence compression which "tightens" a sentence by reducing its character length. Replacing phrases with shorter paraphrases yields paraphrastic compressions as sh...
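As a toy illustration of the idea (the paraphrase table and greedy replacement strategy below are assumptions for the sketch, not the paper's model):

    def tighten(sentence, paraphrases):
        # Greedily substitute phrases with strictly shorter paraphrases.
        out = sentence
        for phrase, shorter in paraphrases.items():
            if len(shorter) < len(phrase):
                out = out.replace(phrase, shorter)
        return out

    print(tighten("he made a decision to leave",
                  {"made a decision to": "decided to"}))
    # -> "he decided to leave"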
Decision trees have been applied to a variety of NLP tasks, including language modeling, for their ability to handle a variety of attributes and a sparse context space. Moreover, forests (collections of decision trees) ...
In the face of sparsity, statistical models are often interpolated with lower order (backoff) models, particularly in language modeling. In this paper, we argue that there is a relation between the higher order and th...
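For reference, the standard interpolated backoff form alluded to here is, in LaTeX notation (with history h, shortened backoff history h', and interpolation weight \lambda(h); the notation is ours, not necessarily the paper's):

    p(w \mid h) = \lambda(h)\, p_{\mathrm{hi}}(w \mid h) + \bigl(1 - \lambda(h)\bigr)\, p_{\mathrm{lo}}(w \mid h')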
Confidence-weighted online learning is a generalization of margin-based learning of linear classifiers in which the margin constraint is replaced by a probabilistic constraint based on a distribution over classifier weights that is updated online as examples are observed. The distribution captures a notion of confidence on classifier weights, and in some cases it can also be interpreted as replacing a single learning rate by adaptive per-weight rates. Confidence-weighted learning was motivated by the statistical properties of natural-language classification tasks, where most of the informative features are relatively rare. We investigate several versions of confidence-weighted learning that use a Gaussian distribution over weight vectors, updated at each observed example to achieve high probability of correct classification for the example. Empirical evaluation on a range of text-categorization tasks shows that our algorithms improve over other state-of-the-art online and batch methods, learn faster in the online setting, and lead to better classifier combination for a type of distributed training commonly used in cloud computing.
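For concreteness, one widely used closed-form version of such an update, with a diagonal Gaussian N(mu, sigma) over the weights and a required correct-classification probability eta, can be sketched as below; the exact update rules evaluated in the paper may differ:

    import numpy as np
    from scipy.stats import norm

    def cw_update(mu, sigma, x, y, eta=0.9):
        # One online step for example x with label y in {-1, +1}.
        phi = norm.ppf(eta)              # confidence parameter
        M = y * np.dot(mu, x)            # mean margin
        V = np.dot(sigma, x * x)         # margin variance under Sigma
        disc = (1 + 2 * phi * M) ** 2 - 8 * phi * (M - phi * V)
        alpha = max(0.0, (-(1 + 2 * phi * M) + np.sqrt(disc)) / (4 * phi * V))
        mu = mu + alpha * y * sigma * x  # bigger steps on low-confidence weights
        sigma = 1.0 / (1.0 / sigma + 2 * alpha * phi * x * x)
        return mu, sigma

The per-weight variances in sigma act as the adaptive per-weight learning rates mentioned above: rarely seen (high-variance) features receive larger updates, after which their variance is shrunk.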
Large vocabulary speech recognition systems fail to recognize words beyond their vocabulary, many of which are information-rich terms, like named entities or foreign words. Hybrid word/sub-word systems solve this prob...