Classifier combination is an effective and broadly useful method of improving system performance. This article investigates in depth a large number of both well-established and novel classifier combination approaches ...
This paper presents a method for bootstrapping a fine-grained, broad-coverage part-of-speech (POS) tagger in a new language using only one person-day of data acquisition effort. It requires only three resources, which...
This paper presents a method for inducing translation lexicons between two distant languages without the need for either parallel bilingual corpora or a direct bilingual seed dictionary. The algorithm successfully com...
This paper investigates the use of a language independent model for named entity recognition based on iterative learning in a co-training fashion, using word-internal and contextual information as independent evidence...
The current study examines the interaction of syllable tones and vowel quantity in the production and perception of monosyllabic words of Thai. A speech corpus containing groups of words differing only in tone type and vowel quantity was designed. These words were embedded in a short carrier sentence of five mid-tone syllables, with the target word as the center syllable. The utterances were analyzed with respect to the tonal and segmental features of the target words, and the F0 contours were modeled using the Fujisaki model. Analysis shows that all mid-tone sequences can be modeled using the phrase component only, whereas the remaining tones require either single tone commands of positive or negative polarity, or a command pair. Based on the analysis results, a perception experiment was designed to explore the perceptual space between words contrasting in tone and vowel quantity. Results indicate, among other things, that vowel quantity is perceived as shorter when words are presented in isolation than when embedded in a carrier sentence. Confusions generally occur more frequently between words of different vowel quantity than between words of different tones.
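As a rough illustration of the decomposition the abstract refers to, the sketch below evaluates the Fujisaki model in its commonly stated form: log F0 is the sum of a base value, phrase components, and tone components. The function names and the constants `alpha`, `beta`, and `gamma` are illustrative assumptions and are not taken from the study.

```python
import math


def phrase_response(t, alpha=2.0):
    """Impulse response of the phrase control mechanism (zero for t < 0)."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def tone_response(t, beta=20.0, gamma=0.9):
    """Step response of the tone control mechanism, capped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def fujisaki_f0(t, fb, phrase_cmds, tone_cmds):
    """F0 in Hz at time t (seconds).

    fb          -- base frequency in Hz
    phrase_cmds -- list of (magnitude, onset_time)
    tone_cmds   -- list of (amplitude, onset_time, offset_time);
                   the amplitude may be positive or negative
    """
    log_f0 = math.log(fb)
    for magnitude, onset in phrase_cmds:
        log_f0 += magnitude * phrase_response(t - onset)
    for amplitude, onset, offset in tone_cmds:
        log_f0 += amplitude * (tone_response(t - onset) - tone_response(t - offset))
    return math.exp(log_f0)


# Hypothetical values: one phrase command, then the same contour with a
# tone command of positive or negative polarity added.
phrase_only = fujisaki_f0(0.3, 100.0, [(0.5, 0.0)], [])
with_positive = fujisaki_f0(0.3, 100.0, [(0.5, 0.0)], [(0.3, 0.2, 0.4)])
with_negative = fujisaki_f0(0.3, 100.0, [(0.5, 0.0)], [(-0.3, 0.2, 0.4)])
```

In this sketch a contour built from `phrase_cmds` alone corresponds to the phrase-component-only case, while a positive or negative entry in `tone_cmds` raises or lowers F0 relative to it for the duration of the command.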
The multiple-pronunciation lexicon (MPL) is very important for modeling pronunciation variation in spontaneous speech recognition. However, introducing an MPL raises two problems. First, the MPL increases the confusion among lexicon entries and degrades the recognizer's performance. Second, the MPL needs more phonetically transcribed data in order to cover as many surface forms as possible. Accordingly, two solutions are proposed: a context-dependent weighting method and an iterative forced-alignment based transcription method. Together they compensate for the problems the MPL causes and improve overall performance. Experiments on a naturally spontaneous speech database show that the proposed methods are effective and outperform other methods.
A forward decoding approach to kernel machine learning is presented. The method combines concepts from Markovian dynamics, large margin classifiers and reproducing kernels for robust sequence detection by learning inter-data dependencies. A MAP (maximum a posteriori) sequence estimator is obtained by regressing transition probabilities between symbols as a function of received data. The training procedure involves maximizing a lower bound of a regularized cross-entropy on the posterior probabilities, which simplifies into direct estimation of transition probabilities using kernel logistic regression. Applied to channel equalization, forward decoding kernel machines outperform support vector machines and other techniques by about 5 dB in SNR for a given BER, within 1 dB of theoretical limits.
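The forward recursion behind an estimator of this kind can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the data-conditioned transition probabilities have already been estimated (in the paper, by kernel logistic regression) and are passed in as plain numbers. The name `forward_decode` and the input layout are hypothetical.

```python
def forward_decode(transitions, prior):
    """Forward-only decoding with data-conditioned transition probabilities.

    transitions -- one matrix per time step n, where
                   transitions[n][i][j] approximates
                   P(state i at step n | state j at step n-1, data at step n);
                   each column j sums to 1 over i
    prior       -- initial state probabilities
    Returns the most probable state index at each step.
    """
    alpha = list(prior)
    decoded = []
    for matrix in transitions:
        # Propagate the state posterior one step forward.
        new_alpha = [
            sum(matrix[i][j] * alpha[j] for j in range(len(alpha)))
            for i in range(len(matrix))
        ]
        # Renormalize to guard against numerical drift.
        total = sum(new_alpha)
        alpha = [value / total for value in new_alpha]
        # Pick the state with the largest forward probability.
        decoded.append(max(range(len(alpha)), key=alpha.__getitem__))
    return decoded


# Hypothetical two-state example: the data favor state 0, then state 1.
example_transitions = [
    [[0.9, 0.9], [0.1, 0.1]],
    [[0.2, 0.2], [0.8, 0.8]],
]
example_path = forward_decode(example_transitions, [0.5, 0.5])
```

Because the recursion only runs forward, each decision depends on the data received so far, so it can be made as the data arrive rather than after the whole sequence has been observed.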
ISBN: (print) 0262025507
Forward decoding kernel machines (FDKM) combine large-margin classifiers with hidden Markov models (HMM) for maximum a posteriori (MAP) adaptive sequence estimation. State transitions in the sequence are conditioned on observed data using a kernel-based probability model trained with a recursive scheme that deals effectively with noisy and partially labeled data. Training over very large datasets is accomplished using a sparse probabilistic support vector machine (SVM) model based on quadratic entropy, and an on-line stochastic steepest descent algorithm. For speaker-independent continuous phone recognition, FDKM trained over 177,080 samples of the TIMIT database achieves 80.6% recognition accuracy over the full test set, without the use of a prior phonetic language model.
Tone recognition is a critical component of speech recognition in a tone language. One of the main problems of tone recognition in continuous speech is that several interacting factors affect the F0 realization of tones. In this paper, we focus on the coarticulatory, intonation, and stress effects. These effects are compensated for by the tone information of neighboring syllables, the adjustment of F0 heights, and the stress acoustic features, respectively. The experiments, which compare all tone features, were conducted using feedforward neural networks. The highest recognition rates improve from 84.07% to 93.60% and from 82.48% to 92.67% for the Thai proper name and Thai animal story corpora, respectively.