ISBN: 0262025507 (print)
Forward decoding kernel machines (FDKM) combine large-margin classifiers with hidden Markov models (HMM) for maximum a posteriori (MAP) adaptive sequence estimation. State transitions in the sequence are conditioned on observed data using a kernel-based probability model trained with a recursive scheme that deals effectively with noisy and partially labeled data. Training over very large datasets is accomplished using a sparse probabilistic support vector machine (SVM) model based on quadratic entropy, and an on-line stochastic steepest descent algorithm. For speaker-independent continuous phone recognition, FDKM trained over 177,080 samples of the TIMIT database achieves 80.6% recognition accuracy over the full test set, without use of a prior phonetic language model.
The multiple-pronunciation lexicon (MPL) is important for modeling pronunciation variation in spontaneous speech recognition, but introducing it raises two problems. First, the MPL increases confusion among lexicon entries and degrades the recognizer's performance. Second, the MPL requires more phonetically transcribed data in order to cover as many surface forms as possible. Two solutions are proposed accordingly: a context-dependent weighting method and an iterative forced-alignment based transcription method. Together they compensate for the problems the MPL introduces and improve overall performance. Experiments on a naturally spontaneous speech database show that the proposed methods are effective and outperform other methods.
ISBN: 8790834100 (print)
We present a simplified derivation of the extended Baum-Welch procedure, which shows that it can be used for Maximum Mutual Information (MMI) estimation of a large class of continuous emission density hidden Markov models (HMMs). We use the extended Baum-Welch procedure for discriminative estimation of MLLR-type speaker adaptation transformations. The resulting adaptation procedure, termed Conditional Maximum Likelihood Linear Regression (CMLLR), is used successfully for supervised and unsupervised adaptation tasks on the Switchboard corpus, yielding an improvement over MLLR. The interaction of unsupervised CMLLR with segmental minimum Bayes risk lattice voting procedures is also explored, showing that the two procedures are complementary.
Conventional wisdom says that incorporating more training data is the surest way to reduce the error rate of a speech recognition system. This, in turn, guarantees that speech recognition systems are expensive to train, because of the high cost of annotating training data. We propose an iterative training algorithm that seeks to improve the error rate of a speech recognizer without incurring additional transcription cost, by selecting a subset of the already available transcribed training data. We apply the proposed algorithm to an alpha-digit recognition problem and reduce the error rate from 10.3% to 9.4% on a particular test set.
A hybrid support vector machine (SVM) and hidden Markov model (HMM) approach is proposed for designing continuous speech recognition systems. Using novel properties of SVMs and combining them with HMMs one can obtain models that map easily to hardware and lead to more modular and scalable design. The overall architecture of the proposed system is based on the MAP (maximum a posteriori) framework which offers a direct, feedforward recognition model. The SVMs generate smooth estimates of local transition probabilities in the HMM, conditioned on the acoustic inputs. The transition probabilities are then used to estimate the global posterior probabilities of HMM state sequences. A parallel architecture that implements a simple speech recognition model in real-time is presented.
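The forward step described above, in which locally estimated transition probabilities are propagated into posterior probabilities over states, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `forward_decode` and the `transitions` layout are hypothetical, and the transition probabilities are assumed to be already computed (in the paper they would come from the SVMs).

```python
def forward_decode(transitions, prior):
    """Propagate state posteriors with a forward recursion.

    transitions[n][i][j] is an assumed, precomputed estimate of
    P(state j at step n | state i at step n-1, observation at step n).
    prior[i] is the state probability before the first observation.
    Returns the most probable state at each step and the final posteriors.
    """
    alpha = list(prior)
    num_states = len(alpha)
    path = []
    for P in transitions:
        # alpha_j[n] = sum_i alpha_i[n-1] * P_ij[n]
        new_alpha = [
            sum(alpha[i] * P[i][j] for i in range(num_states))
            for j in range(num_states)
        ]
        # Renormalize to guard against rows that do not sum exactly to 1.
        total = sum(new_alpha)
        alpha = [a / total for a in new_alpha]
        # Pick the state with the largest posterior at this step.
        path.append(max(range(num_states), key=lambda j: alpha[j]))
    return path, alpha


if __name__ == "__main__":
    # Toy two-state example with made-up transition estimates.
    steps = [
        [[0.9, 0.1], [0.8, 0.2]],
        [[0.1, 0.9], [0.2, 0.8]],
    ]
    print(forward_decode(steps, [0.5, 0.5]))
```

Because each step only needs the previous posteriors and the current transition estimates, the recursion runs strictly forward in time, which is consistent with the feedforward, real-time design the abstract describes.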
The problem of blindly separating signal mixtures with fewer mixture components than independent signal sources is mathematically ill-defined, and requires suitable prior information on the nature of the sources. Recently, it has been shown that sparse methods for function approximation using a Laplacian prior can be effective, but the method fails to separate a single mixture without further prior information. Other techniques track harmonics, but assume separability in the time-frequency domain. We show that a measure of temporal and spectral coherence provides an effective cue for separating independent acoustical or sonar sources, in the absence of spatial cues in the monaural case. The technique is shown to successfully separate single mixtures of sources with significant spectral overlap.
Gopalakrishnan et al. [1] described a method called "growth transform" to optimize rational functions over a domain, which has been found useful for discriminatively training hidden Markov models (HMMs) in speech r...
This paper describes a simulator for the Shiva multiprocessor system and the simulator construction methodology (SCM) used in its creation. The SCM, based on the active functional unit (AFU) construct, is a modern SCM that is flexible, accurate, fast, easy to use, capable of dynamic reconfiguration at run-time, and, most of all, simple and capable of quick simulator construction. The AFU SCM achieves all of this through the use of object-oriented software techniques. The Shiva simulator constructed using the AFU SCM is program-driven and capable of micro- and macro-architectural simulation.
We present a novel low-power circuit for level crossing interval measurements on continuous-time auditory signals, as obtained from the outputs of an analog cochlear filter bank. The circuit achieves immediate response for a given signal in the audio frequency range. Experimental results from a fabricated array of 9 level crossing transducers demonstrate frequency-to-voltage conversion over a range covering the audio band, with less than 120 μW of power dissipated per cell from a 5 V supply.
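To make the idea of a level crossing interval concrete, the sketch below measures the time between successive upward crossings of a fixed level in a sampled signal and converts the mean interval to a frequency. It is only a discrete-time software analogy under stated assumptions: the circuit in the abstract operates on continuous-time analog signals, and the function names, the linear interpolation, and the test tone are all hypothetical choices made for illustration.

```python
import math


def upward_crossing_times(samples, sample_rate, level=0.0):
    """Return interpolated times (seconds) at which the signal rises through `level`."""
    times = []
    for k in range(1, len(samples)):
        prev, cur = samples[k - 1], samples[k]
        if prev < level <= cur:
            # Linear interpolation between the two samples around the crossing.
            frac = (level - prev) / (cur - prev)
            times.append((k - 1 + frac) / sample_rate)
    return times


def frequency_from_intervals(times):
    """Estimate frequency (Hz) as the reciprocal of the mean crossing interval."""
    if len(times) < 2:
        raise ValueError("need at least two crossings")
    intervals = [b - a for a, b in zip(times, times[1:])]
    return len(intervals) / sum(intervals)


if __name__ == "__main__":
    fs = 8000  # assumed sample rate in Hz
    tone = [math.sin(2 * math.pi * 100 * k / fs + 0.3) for k in range(400)]
    print(frequency_from_intervals(upward_crossing_times(tone, fs)))
```

For a narrow-band signal, such as one channel of a filter bank, the interval between crossings tracks the period of the dominant component, which is why the interval can serve as a proxy for frequency.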