The Document Object Model (DOM) is widely used for dynamic description of eXtensible Markup Language (XML) documents and its *** is an XML instance targeting voice interaction for enterprise-level telephony applications. A design of a DOM extension for VoiceXML is thus introduced in this paper to provide a standard way of manipulating VoiceXML and to build an exposing mechanism for VoiceXML *** module and event module are presented in detail.
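As a rough illustration of what manipulating a VoiceXML document through the DOM can look like, the sketch below uses the generic DOM in Python's standard `xml.dom.minidom`. It is not the DOM extension proposed in the paper, whose interfaces are not given in the abstract. The sample document, the form id `greeting`, and the helper `set_prompt_text` are all hypothetical.

```python
from xml.dom import minidom

# Hypothetical VoiceXML fragment used only for this illustration.
VXML = """<?xml version="1.0"?>
<vxml version="2.0">
  <form id="greeting">
    <block><prompt>Hello</prompt></block>
  </form>
</vxml>"""


def set_prompt_text(doc, form_id, new_text):
    """Replace the text of the first <prompt> inside the form with the given id.

    Returns True if a matching form with a prompt was found, else False.
    """
    for form in doc.getElementsByTagName("form"):
        if form.getAttribute("id") != form_id:
            continue
        prompts = form.getElementsByTagName("prompt")
        if not prompts:
            return False
        prompt = prompts[0]
        # Drop the existing children, then attach a fresh text node.
        while prompt.firstChild is not None:
            prompt.removeChild(prompt.firstChild)
        prompt.appendChild(doc.createTextNode(new_text))
        return True
    return False


doc = minidom.parseString(VXML)
changed = set_prompt_text(doc, "greeting", "Welcome to the help desk")
prompt_text = doc.getElementsByTagName("prompt")[0].firstChild.data
```

A VoiceXML-specific extension would presumably expose typed accessors for elements such as forms and prompts instead of generic tag-name lookups, but that is an assumption about the design, not something the abstract states.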
Capitalizing on the short-time stationarity of the speech signal, an estimate of the background noise parameters in noisy speech signals is derived. The novel approach, which requires no active/silent frame detection on the noisy speech, can be computed efficiently and gives real-time estimation of the *** results can be achieved even if the background noise has slowly time-varying features, so the speech enhancement effect *** Terms: speech enhancement, noise estimation, short-time spectral amplitude, spectral subtraction estimator
A segmental VQ based efficient speech recognition method is introduced in this *** method takes advantage of the “initial consonant-final” structure of Chinese *** has less *** consumption but a high ARR (accurate recognition rate) compared with traditional HMM (hidden Markov model) or NN (neural network) ***-scale test on the task of 11 Chinese digits recognition shows that the WER (word error rate) can reach 1.91% in the speaker-dependent test and 11.69% in the speaker-independent test, which shows it is more suitable for *** it has potential to be extensively applied in monosyllable recognition.
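The general idea of segmental vector quantization can be sketched as follows: each word model keeps one codebook per segment (for example, one for the initial consonant and one for the final), and an utterance is assigned to the word whose segment codebooks give the lowest total distortion. This is only a minimal sketch. The uniform segmentation, the squared Euclidean distance, the toy 2-D features, and the names `recognize`, `split_segments` and `vq_distortion` are assumptions, not details taken from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vq_distortion(frames, codebook):
    """Average distance from each frame to its nearest codeword."""
    return sum(min(sq_dist(f, c) for c in codebook) for f in frames) / len(frames)


def split_segments(frames, n_segments):
    """Split frames into n_segments contiguous parts of near-equal length.

    Assumes len(frames) >= n_segments so that no segment is empty.
    """
    n = len(frames)
    bounds = [round(i * n / n_segments) for i in range(n_segments + 1)]
    return [frames[bounds[i]:bounds[i + 1]] for i in range(n_segments)]


def recognize(frames, models):
    """Return the word whose per-segment codebooks best match the frames.

    models maps each word to a list of codebooks, one per segment.
    """
    def score(codebooks):
        segments = split_segments(frames, len(codebooks))
        return sum(vq_distortion(s, cb) for s, cb in zip(segments, codebooks))

    return min(models, key=lambda word: score(models[word]))


# Toy models: two words whose segments are mirror images of each other.
models = {
    "yi": [[(0.0, 0.0)], [(1.0, 1.0)]],
    "er": [[(1.0, 1.0)], [(0.0, 0.0)]],
}
utterance = [(0.1, 0.0), (0.0, 0.1), (0.9, 1.0), (1.0, 1.1)]
best = recognize(utterance, models)
```

Because each segment has its own codebook, the temporal order of the sounds matters: in this toy setup the time-reversed utterance matches `"er"` instead of `"yi"`, which a single order-free codebook per word could not distinguish.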
Dealing with polyphones is an important part of Chinese text-to-speech *** the pronunciation of a Chinese character is directly related to its meaning, an algorithm based on semantic calculation using How-Net is introduced to determine the pronunciations of polyphones in new words, which have not appeared in the polyphone list, the polyphone knowledge base or *** experiment results show that it performs well.
Sub-syllables are popular speech units in Mandarin speech recognition. Since there are several confusion sets among Mandarin sub-syllables, it is hard to obtain high recognition accuracy in acoustic *** this paper we propose a method of utterance verification to improve the recognition performance. The basic idea is to calculate the normalized log-likelihood score for each speech unit, and a threshold value for a specific speech unit is determined through a training *** the decision to accept or reject a detected speech unit depends on this *** Mandarin sub-syllable recognition, Mel-scale cepstral coefficients and log energy calculated for every frame are the speech features for this task. The experimental results demonstrate the effectiveness of our proposed method.
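The accept/reject decision described above can be sketched in a few lines. The abstract does not say how the score is normalized or how the threshold is trained, so this sketch assumes normalization by frame count and a threshold placed midway between the mean scores of correctly and incorrectly recognized training tokens. The function names and the unit label `"zh"` are hypothetical.

```python
def normalized_score(frame_log_likelihoods):
    """Average per-frame log-likelihood, so units of different length compare fairly.

    Normalizing by frame count is an assumption of this sketch.
    """
    return sum(frame_log_likelihoods) / len(frame_log_likelihoods)


def train_threshold(correct_scores, incorrect_scores):
    """Place the threshold midway between the two class means (assumed rule)."""
    mean_correct = sum(correct_scores) / len(correct_scores)
    mean_incorrect = sum(incorrect_scores) / len(incorrect_scores)
    return (mean_correct + mean_incorrect) / 2


def verify(unit, frame_log_likelihoods, thresholds):
    """Accept the detected unit if its normalized score reaches the unit's threshold."""
    return normalized_score(frame_log_likelihoods) >= thresholds[unit]


# One threshold per speech unit, learned from labeled training scores.
thresholds = {"zh": train_threshold([-4.0, -5.0], [-9.0, -8.0])}

accepted = verify("zh", [-5.0, -6.0, -4.0], thresholds)
rejected = verify("zh", [-8.0, -9.0], thresholds)
```

Keeping a separate threshold per unit lets easily confused sub-syllables be held to a stricter standard than units that are rarely misrecognized.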
In this paper, we perform speech enhancement based on an approximate Karhunen-Loeve transform. The signal is represented by using wavelet packets based on a basis *** eigenvectors are first evaluated from these bases, then a linear estimator based on the eigenvectors is constructed and used to perform noise *** evaluate the performance of this method by using the Aurora-2 database. The SNR improvement is calculated. Some waveforms and spectrograms of enhanced speech are also shown. Finally, the enhanced speech is tested for speech recognition. These experimental results show that this method achieves satisfactory enhancement of speech.
The design of spoken dialogue systems is more of an art than of science or engineering, *** design of the control component: dialogue *** problems of usability and portability still *** is partly because there are gaps between dialogue modeling and dialogue *** analyzing present dialogue models and *** models, we propose a generic dialogue model for dialogue management in task-oriented (information-seeking) spoken dialogue systems, which combines both interaction patterns and task *** accounts for both statics and dynamics in information-seeking dialogues and promises to bridge the gaps.
*** 1978, when Panasonic started speech technology R&*** Panasonic speech technology group have been working to realize a user-friendly man-machine interface with speech technologies including ASR, TTS and spoken dialogue *** then till today, we have focused our R&D activities on bringing the merits of speech technology to our customers through various kinds of Panasonic consumer electronics products for 23 *** this *** mainly highlight our R&D activities of ASR *** following section 2, we focus on the strategies and results of our past R&D *** 3, we mention our future vision of speech technology R&D for the new century.
Pronunciation variability is an important issue that must be faced when developing practical automatic spontaneous speech recognition *** this paper, the factors that may affect the recognition performance are analyzed, including those specific to the Chinese language. By studying the INITIAL/FINAL (IF) characteristics of the Chinese language and developing the Bayesian equation, we propose the concepts of generalized INITIAL/FINAL (GIF) and generalized syllable (GS), the GIF modeling and the IF-GIF modeling, as well as context-dependent pronunciation weighting, based on a well phonetically transcribed seed database. By using these methods, the Chinese syllable error rate (SER) was reduced by 6.3% and 4.2% compared with the GIF modeling and IF modeling respectively when a language model, such as a syllable or word N-gram, is not used. The effectiveness of these methods is also proved when more data without phonetic transcription is used to refine the acoustic model using the proposed iterative forced-alignment based transcribing (IFABT) method, achieving a 5.7% SER reduction.
According to the Hong Kong Tourist Association, the number of tourists from the mainland has been increasing rapidly since 1997. Mandarin has become a very important language in Hong Kong. Therefore, there is a need to create a machine to translate Mandarin to Cantonese. In this paper, a speech-to-speech translation system from Mandarin to Cantonese for a domain-specific application, Tourist Information Inquiry, will be introduced. A Mandarin Speech Recognizer and a Cantonese Speech Synthesizer have been implemented within the specific domain. The rules of Mandarin text to Cantonese text conversion have been developed. A Mandarin-to-Cantonese Dictionary is *** Model and Prosodic features are incorporated to improve recognition performance and speech quality.
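One simple way to picture dictionary-driven Mandarin-to-Cantonese text conversion is greedy longest-match substitution over the input string. The paper's actual conversion rules and dictionary are not given in the abstract. The small lexicon `M2C` and the function `convert` below are hypothetical and cover only a handful of common word pairs.

```python
# Hypothetical mini-lexicon of common Mandarin -> Cantonese word pairs.
M2C = {
    "他们": "佢哋",
    "什么": "乜嘢",
    "哪里": "邊度",
    "在": "喺",
    "是": "係",
    "不": "唔",
    "的": "嘅",
}


def convert(text, lexicon):
    """Rewrite text by greedy longest-match lookup in the lexicon.

    Characters with no lexicon entry are copied through unchanged.
    """
    max_len = max(len(key) for key in lexicon)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so multi-character words win.
        for n in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in lexicon:
                out.append(lexicon[chunk])
                i += n
                break
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


cantonese = convert("他们在哪里", M2C)
```

A real system would also need word-order and context rules, since some Mandarin words map to different Cantonese words depending on usage. Plain substitution like this only handles the one-to-one cases.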