As a powerful method for modeling stochastic processes, the HMM has been used successfully in speech recognition tasks, on condition of an efficient strategy of parameters *** paradigm of the parameter-estimation methods is the Baum-Welch algorithm, based on its firm foundation in the E-M unsupervised parameter estimation procedure. Despite its great success in many real systems, the Baum-Welch algorithm reveals an inherent limitation: a problem of calculation overflow, sometimes serious, especially with the emergence of Large Vocabulary Continuous Speech Recognition (LVCSR) systems, or when data sparseness has to be *** paper analyses the problem of calculation overflow in certain new aspects, followed by two robust overflow-resistant strategies adopted in our gallina system, the Subaumelch algorithm and the Log-Swapped Forward-Backward algorithm, which are compared with respect to efficiency and accuracy and evaluated on our test corpus, 863CSL.
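To make the numerical issue concrete, here is a minimal sketch of a generic log-domain forward pass for a discrete HMM. It is not the Subaumelch or Log-Swapped Forward-Backward algorithm named in the abstract, whose details are not given here. It only illustrates the general idea of keeping forward probabilities in the log domain so that long observation sequences do not drive them out of floating-point range (in practice the values usually shrink toward zero). The function names, the toy model, and the observation sequences are assumptions made for illustration.

```python
import math

NEG_INF = float("-inf")

def safe_log(p):
    """log(p), with log(0) mapped to -inf instead of raising."""
    return math.log(p) if p > 0.0 else NEG_INF

def logsumexp(values):
    """Stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def naive_forward(pi, A, B, obs):
    """Plain forward algorithm; returns P(obs | model) in the linear domain."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)

def log_forward(pi, A, B, obs):
    """Forward algorithm in the log domain; returns log P(obs | model)."""
    n = len(pi)
    log_alpha = [safe_log(pi[i]) + safe_log(B[i][obs[0]]) for i in range(n)]
    for o in obs[1:]:
        log_alpha = [
            logsumexp([log_alpha[i] + safe_log(A[i][j]) for i in range(n)])
            + safe_log(B[j][o])
            for j in range(n)
        ]
    return logsumexp(log_alpha)

# Hypothetical two-state, two-symbol model (illustrative values only).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]

short_obs = [0, 1, 0]
long_obs = [0, 1] * 2500  # 5000 observations

if __name__ == "__main__":
    print(naive_forward(pi, A, B, short_obs), math.exp(log_forward(pi, A, B, short_obs)))
    print(naive_forward(pi, A, B, long_obs), log_forward(pi, A, B, long_obs))
```

On the short sequence both versions agree. On the long one the linear-domain value collapses to zero while the log-domain value stays finite, which is the kind of failure an overflow-resistant training procedure has to avoid. Per-frame scaling of the forward variables is a common alternative to working in the log domain.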
Human speech mechanisms are twofold: speech production and speech *** paper introduces two studies carried out based on the production and the *** of them is a method to estimate articulatory targets from speech sounds via a physiological articulatory *** potentially clarifies certain problems with the current speech recognition *** estimation, acoustical parameters are considered as a function of the articulatory targets for model *** location is estimated based on a comparison of acoustical parameters between real speech sound and synthetic sound corresponding to the (?)*** proposed estimation method was evaluated using *** suggests that our physiological articulatory model can be a valuable tool for the inverse *** model included in this paper is built based on knowledge about human psychoacoustics and auditory physiology to enhance speech by detecting and then canceling *** attention is paid to reducing noise by using a spatial filtering *** technique adopts concepts of the cancellation *** results show that the spatial filtering is useful in enhancing *** *** filtering method can be used effectively at the front-end of automatic speech recognition systems.
This paper focuses on the challenges that are faced when designing a multilingual speaker-independent speech recognition *** order to improve the usability of voice dialing in mobile terminals, speaker-independent technology is preferred to speaker-dependent *** advantages of speaker-dependent speech *** as low complexity and language independence, are to be preserved after this *** to *** nature of the markets and the future applications, speaker independence implies the development and use of language-independent ASR to avoid logistic *** propose here an architecture for embedded multilingual speech recognition *** acoustic modeling, automatic language *** on-line pronunciation modeling are the key features which enable the creation of truly language- and speaker-independent ASR applications with dynamic vocabularies and sparse implementation *** experimental results with a multilingual speech recognizer for several European languages and Mandarin Chinese confirm the viability of the proposed approach and *** the use of multilingual acoustic models degrades the recognition rates only marginally, a recognition accuracy decrease of approximately 4% is observed due to sub-optimal on-line text-to-phoneme mapping and automatic *** performance loss can nevertheless be compensated by applying acoustic model adaptation techniques.
ISIS, which stands for Intelligent Speech for Information Systems, is a trilingual spoken dialog system (SDS) for the financial *** handles two dialects of Chinese (Cantonese and Putonghua), as well as English, the predominant languages in our region. The system supports spoken language queries regarding stock market information and simulated personal portfolios. Real-time information is retrieved directly from a dedicated Reuters satellite *** a system test-bed for our work in multilingual speech recognition and generation, speaker authentication, language understanding and dialog modeling. Furthermore, ISIS supports our new explorations in: (i) CORBA's interoperability and scalability for SDS development; (ii) asynchronous human-computer interaction by delegation to KQML software agents; (iii) switching between online interaction and offline delegation in a single dialog thread; and (iv) automatic incorporation of newly listed stocks to expand our system's knowledge base.