作者:
Lo, Chi-KiuWu, DekaiHKUST
Human Language Technology Center Department of Computer Science and Engineering Hong Kong University of Science and Technology Hong Kong
We present an unsupervised approach to estimate the appropriate degree of contribution of each semantic role type for semantic translation evaluation, yielding a semantic MT evaluation metric whose correlation with hu...
详细信息
We present a new approach of using Auto-Associative Neural Networks (AANNs) in the conventional GMM speaker verification framework with i-vector feature extraction and PLDA modeling. In this technique, an i-vector fea...
详细信息
We present a description of the development and evaluation of a first South African broadcast news transcription system. We describe a number of speech resources which have been collected in the resource-scarce South ...
详细信息
Full covariance acoustic models trained with limited training data generalize poorly to unseen test data due to a large number of free parameters. We propose to use sparse inverse covariance matrices to address this p...
详细信息
Full covariance acoustic models trained with limited training data generalize poorly to unseen test data due to a large number of free parameters. We propose to use sparse inverse covariance matrices to address this problem. Previous sparse inverse covariance methods never outperformed full covariance methods. We propose a method to automatically drive the structure of inverse covariance matrices to sparse during training. We use a new objective function by adding L1 regularization to the traditional objective function for maximum likelihood estimation. The graphic lasso method for the estimation of a sparse inverse covariance matrix is incorporated into the Expectation Maximization algorithm to learn parameters of HMM using the new objective function. Experimental results show that we only need about 25% of the parameters of the inverse covariance matrices to be nonzero in order to achieve the same performance of a full covariance system. Our proposed system using sparse inverse covariance Gaussians also significantly outperforms a system using full covariance Gaussians trained on limited data.
This paper proposes a first-ever phrase-level transduction model with reordering to transform colloquial speech directly to written-style transcription. This model is capable of performing n-m transductions. Our trans...
详细信息
This paper proposes a first-ever phrase-level transduction model with reordering to transform colloquial speech directly to written-style transcription. This model is capable of performing n-m transductions. Our transduction model is trained from a parallel corpus of verbatim transcription and written-style transcription. Deletions, substitutions, insertions are well represented using this model. Inversion transduction cases can also be identified and represented. We implement our transduction model using weighted finite-state transducers (WFSTs), and integrate it into a WFST-based speech recognition search space to give both verbatim speaking-style and written-style transcriptions. Evaluations of our model on Cantonese speech to standard written Chinese show 11.59% relative Word Error Rate (WER) reduction over interpolated language model between Cantonese and standard Chinese speech, 5.72% relative WER reduction and 14.82% relative Bilingual Evaluation Understudy (BLEU) improvement over the word-level transduction model.
作者:
Bojar, OndřejWu, DekaiCharles University in Prague
Faculty of Mathematics and Physics Institute of Formal and Applied Linguistics Czech Republic HKUST
Human Language Technology Center Department of Computer Science and Engineering Hong Kong University of Science and Technology Hong Kong
HMEANT (Lo and Wu, 2011a) is a manual MT evaluation technique that focuses on predicate-argument structure of the sentence. We relate HMEANT to an established linguistic theory, highlighting the possibilities of reusi...
This paper proposes cross-lingual language modeling for transcribing source resource-poor languages and translating them into target resource-rich languages if necessary. Our focus is to improve the speech recognition...
详细信息
ISBN:
(纸本)9781622765034
This paper proposes cross-lingual language modeling for transcribing source resource-poor languages and translating them into target resource-rich languages if necessary. Our focus is to improve the speech recognition performance of low-resource languages by leveraging the language model statistics from resource-rich languages. The most challenging work of cross-lingual language modeling is to solve the syntactic discrepancies between the source and target languages. We therefore propose syntactic reordering for cross-lingual language modeling, and present a first result that compares inversion transduction grammar (ITG) reordering constraints to IBM and local constraints in an integrated speech transcription and translation system. Evaluations on resource-poor Cantonese speech transcription and Cantonese to resource-rich Mandarin translation tasks show that our proposed approach improves the system performance significantly, up to 3.4% relative WER reduction in Cantonese transcription and 13.3% relative bilingual evaluation understudy (BLEU) score improvement in Mandarin transcription compared with the system without reordering.
We present an unsupervised approach to estimate the appropriate degree of contribution of each semantic role type for semantic translation evaluation, yielding a semantic MT evaluation metric whose correlation with hu...
详细信息
ISBN:
(纸本)9781627483452
We present an unsupervised approach to estimate the appropriate degree of contribution of each semantic role type for semantic translation evaluation, yielding a semantic MT evaluation metric whose correlation with human adequacy judgments is comparable to that of recent supervised approaches but without the high cost of a human-ranked training corpus. Our new unsupervised estimation approach is motivated by an analysis showing that the weights learned from supervised training are distributed in a similar fashion to the relative frequencies of the semantic roles. Empirical results show that even without a training corpus of human adequacy rankings against which to optimize correlation, using instead our relative frequency weighting scheme to approximate the importance of each semantic role type leads to a semantic MT evaluation metric that correlates comparable with human adequacy judgments to previous metrics that require far more expensive human rankings of adequacy over a training corpus. As a result, the cost of semantic MT evaluation is greatly reduced.
作者:
Ondrej BojarDekai WuCharles University in Prague
Faculty of Mathematics and Physics Institute of Formal and Applied Linguistics HKUST
Human Language Technology Center Department of Computer Science and Engineering Hong Kong University of Science and Technology
HMEANT (Lo and Wu, 2011a) is a manual MT evaluation technique that focuses on predicate-argument structure of the sentence. We relate HMEANT to an established linguistic theory, highlighting the possibilities of reusi...
详细信息
ISBN:
(纸本)9781627483452
HMEANT (Lo and Wu, 2011a) is a manual MT evaluation technique that focuses on predicate-argument structure of the sentence. We relate HMEANT to an established linguistic theory, highlighting the possibilities of reusing existing knowledge and resources for interpreting and automating HMEANT. We apply HMEANT to a new language, Czech in particular, by evaluating a set of English-to-Czech MT systems. HMEANT proves to correlate with manual rankings at the sentence . level better than a range of automatic metrics. However, the main contribution of this paper is the identification of several issues of HMEANT annotation and our proposal on how to resolve them.
The recursive least squares (RLS) algorithm is well known and has been widely used for many years. Most analyses of RLS have assumed statistical properties of the data or the noise process, but recent robust ℌ ∞ ana...
详细信息
The recursive least squares (RLS) algorithm is well known and has been widely used for many years. Most analyses of RLS have assumed statistical properties of the data or the noise process, but recent robust ℌ ∞ analyses have been used to bound the ratio of the performance of the algorithm to the total noise. In this paper, we provide an additive analysis bounding the difference between performance and noise. Our analysis provides additional convergence guarantees in general, and particular benefits for structured input data. We illustrate the analysis using human speech and white noise.
暂无评论