The dysarthric speech characteristics of 14 Thai stroke patients were assessed with the computerized Articulation Test [1]. Speech accuracy and error patterns were analyzed. Vowels and tonal characteristics were the most...
We address the problem of extracting bilingual chunk pairs from parallel text to create training sets for statistical machine translation. We formulate the problem in terms of a stochastic generative process over text...
Summary form only given. We study a simplified version of the problem of target detectability in the presence of clutter. The target (the needle) is a sample of size N from a discrete distribution p. The clutter (the haystack) is made up of M independent samples of size N from a distribution q (which is different from p, but with the same support). Two cases can be easily shown: (i) If M is fixed and N goes to infinity, the target can be detected with probability that approaches 1. (ii) If N is fixed and M goes to infinity, then, with probability approaching 1, the target cannot be detected. For the case where both N and M go to infinity, we show that the asymptotic behavior of the optimal detector (if p, q are known) and of a plug-in detector (which estimates p, q on the fly) is determined by the asymptotic behavior of the quantity M exp(-N D(p||q)): if it goes to zero (resp. infinity), then, with high probability, the target can (resp. cannot) be detected.
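The quantity in the abstract above can be made concrete with a small sketch. The code below is illustrative only and is not taken from the paper: it assumes D(p||q) is the Kullback-Leibler divergence in nats, and it assumes a detector that picks the sample with the largest log-likelihood ratio under known p and q. The function names `kl_divergence`, `detectability_statistic`, `log_likelihood_ratio`, and `simulate_detection` are hypothetical.

```python
import math
import random


def kl_divergence(p, q):
    """D(p || q) in nats for two distributions on the same finite support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def detectability_statistic(p, q, n, m):
    """The quantity M * exp(-N * D(p || q)) named in the abstract."""
    return m * math.exp(-n * kl_divergence(p, q))


def log_likelihood_ratio(sample, p, q):
    """Sum of log p(x)/q(x) over the symbols in one sample."""
    return sum(math.log(p[x] / q[x]) for x in sample)


def simulate_detection(p, q, n, m, trials=200, seed=0):
    """Fraction of trials in which the needle scores strictly higher than
    every clutter sample (assumed max-likelihood-ratio detector, known p, q).
    Ties are counted as misses."""
    rng = random.Random(seed)
    symbols = range(len(p))
    hits = 0
    for _ in range(trials):
        needle = rng.choices(symbols, weights=p, k=n)
        best_clutter = max(
            log_likelihood_ratio(rng.choices(symbols, weights=q, k=n), p, q)
            for _ in range(m)
        )
        if log_likelihood_ratio(needle, p, q) > best_clutter:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    p, q = [0.6, 0.4], [0.4, 0.6]
    for n, m in [(5, 100), (50, 100), (200, 100)]:
        stat = detectability_statistic(p, q, n, m)
        rate = simulate_detection(p, q, n, m)
        print(f"N={n:4d} M={m:4d} M*exp(-N*D)={stat:10.4g} hit rate={rate:.2f}")
```

Running the script prints the statistic next to an empirical hit rate for a few (N, M) pairs, which gives a rough feel for the two regimes the abstract describes. It is a finite-sample illustration, not a check of the asymptotic result.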
We demonstrate an original and successful approach for both resolving and generating definite anaphora. We propose and evaluate unsupervised models for extracting hypernym relations by mining cooccurrence data of defi...
We describe our entry in the CoNLL-X shared task. The system consists of three phases: a probabilistic vine parser (Eisner and N. Smith, 2005) that produces unlabeled dependency trees, a probabilistic relation-labeling model, and a discriminative minimum risk reranker (D. Smith and Eisner, 2006). The system is designed for fast training and decoding and for high precision. We describe sources of crosslingual error and ways to ameliorate them. We then provide a detailed error analysis of parses produced for sentences in German (much training data) and Arabic (little training data).
We present and empirically compare a range of novel probabilistic finite-state transducer (PFST) models targeted at two major natural language string transduction tasks, transliteration selection and cognate translati...
We describe finite-state constraint relaxation, a method for applying global constraints, expressed as automata, to sequence model decoding. We present algorithms for both hard constraints and binary soft constraints....
When training the parameters for a natural language system, one would prefer to minimize 1-best loss (error) on an evaluation set. Since the error surface for many natural language problems is piecewise constant and r...
Many syntactic models in machine translation are channels that transform one tree into another, or synchronous grammars that generate trees in parallel. We present a new model of the translation process: quasi-synchro...
We present a Weighted Finite State Transducer Translation Template Model for statistical machine translation. This is a source-channel model of translation inspired by the Alignment Template translation model. The mod...