ISBN: 0780379020 (print)
In this paper, we describe an efficient algorithm to select phonetically balanced scripts for collecting a large-scale multilingual speech *** is expected to collect a multilingual speech corpus covering the three most frequently used languages in Taiwan: Taiwanese (Min-nan), Hakka, and Mandarin Chinese. To achieve this objective, the first step is to construct a multilingual phonetic alphabet, namely the Formosa Phonetic Alphabet (ForPA). In addition, the multilingual lexicons (Formosa Lexicons) are also important parts for building the *** now, this corpus, containing the speech of 600 speakers of Taiwanese (Min-nan) and Mandarin Chinese, has been finished and is ready to *** contains about 40 hours of speech in 247 thousand utterances in this release.
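The abstract does not spell out the selection algorithm, so the following is only a minimal sketch of one common way to pick phonetically balanced scripts: a greedy coverage heuristic. The input format, the function names `useful_units` and `select_scripts`, and the `target_count` parameter are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def useful_units(units, covered, target_count):
    """Count unit occurrences in a sentence that still help reach the target."""
    pending = Counter()
    gain = 0
    for unit in units:
        if covered[unit] + pending[unit] < target_count:
            pending[unit] += 1
            gain += 1
    return gain


def select_scripts(candidates, target_count=1):
    """Greedily select sentences until no sentence adds a needed unit.

    candidates: dict mapping a sentence id to its list of phonetic units
                (hypothetical format; units could be phones or syllables).
    target_count: desired number of occurrences per unit (an assumption).
    """
    covered = Counter()
    remaining = dict(candidates)
    selected = []
    while remaining:
        # Iterate ids in sorted order so that ties are broken deterministically.
        best_id = max(
            sorted(remaining),
            key=lambda sid: useful_units(remaining[sid], covered, target_count),
        )
        if useful_units(remaining[best_id], covered, target_count) == 0:
            break  # nothing left that improves coverage
        selected.append(best_id)
        covered.update(remaining.pop(best_id))
    return selected, covered
```

Each round picks the sentence that contributes the most still-needed units, so long sentences full of already-covered units are not favoured.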
ISBN: 0780379020 (print)
With the continuing development of mobile telecommunications, a mobile phone usually offers multiple functions and features, such as e-mail, text messaging, notebooks, electronic dictionaries, and so on. We are concerned with the placement of Chinese phonetic symbols for an intelligent Chinese phonetic symbol input system in mobile phones. A good placement of Chinese phonetic symbols can efficiently reduce the typing time when Chinese characters are entered. By applying a genetic algorithm, we can easily find an appropriate placement of phonetic symbols on the keypads of mobile phones. As the experimental results show, our method is not only substantially better than the traditional Chinese phonetic symbol input system but is also better than the general placement of phonetic symbols used in an intelligent Chinese phonetic symbol input system.
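To make the idea of searching for a symbol placement with a genetic algorithm concrete, here is a minimal sketch under a toy cost model. It assumes a multi-tap keypad where the k-th symbol on a key costs k presses, and it uses made-up symbol frequencies; the paper's actual cost function, encoding, and operators are not given in the abstract, so every name and parameter below is hypothetical.

```python
import random


def typing_cost(layout, freq, slots_per_key):
    """Expected key presses: the k-th symbol on a key costs k presses."""
    return sum(freq[s] * (i % slots_per_key + 1) for i, s in enumerate(layout))


def order_crossover(p1, p2, rng):
    """Keep a slice of p1 in place and fill the rest in p2's order."""
    a, b = sorted(rng.sample(range(len(p1)), 2))
    middle = p1[a:b + 1]
    rest = [s for s in p2 if s not in middle]
    return rest[:a] + middle + rest[a:]


def evolve(symbols, freq, slots_per_key, generations=200, pop_size=30, seed=0):
    """Elitist GA over permutations of symbols; a layout is a flat list in
    which key k holds layout[k * slots_per_key:(k + 1) * slots_per_key]."""
    rng = random.Random(seed)
    # Seed the population with the given order as a baseline individual.
    population = [list(symbols)]
    population += [rng.sample(symbols, len(symbols)) for _ in range(pop_size - 1)]

    def cost(layout):
        return typing_cost(layout, freq, slots_per_key)

    for _ in range(generations):
        population.sort(key=cost)
        parents = population[:pop_size // 2]  # the best half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            child = order_crossover(rng.choice(parents), rng.choice(parents), rng)
            if rng.random() < 0.3:  # swap mutation
                i, j = rng.sample(range(len(child)), 2)
                child[i], child[j] = child[j], child[i]
            children.append(child)
        population = parents + children
    return min(population, key=cost)
```

Because the best half of each generation is carried over unchanged, the best cost never gets worse than that of the baseline order passed in as `symbols`.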
Here we numerically study the structure of directed state transition graphs for several types of finite-state devices representing the morphology of 16 *** all numerical experiments we have found that the distribution of incoming and outgoing links is highly skewed and is modeled well by the power law, not by the Poisson distribution typical for classical random *** for three languages, the distribution of nodes according to the traffic they experience during corpora processing obeys the power law as *** and out-degree are the parameters that affect the performance of finite-state *** discuss how specific properties of the power-law-like distribution of these parameters (the coexistence of a small number of "hubs" with a large number of "small events") can be exploited for efficient computer implementation of finite-state devices used in morphology.
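As an illustration of the quantities discussed above, the following sketch computes in-degree and out-degree histograms for a state transition graph. The dictionary-based transition table is a hypothetical format chosen for brevity; the devices and tools used in the study are not described in the abstract.

```python
from collections import Counter


def degree_histograms(transitions):
    """Return (in_hist, out_hist): how many states have each in/out-degree.

    transitions: dict mapping a state to a dict of {symbol: next_state}
                 (hypothetical format).
    """
    in_deg = Counter()
    out_deg = Counter()
    states = set(transitions)
    for src, arcs in transitions.items():
        out_deg[src] += len(arcs)
        for dst in arcs.values():
            in_deg[dst] += 1
            states.add(dst)
    # Counter returns 0 for states with no incoming or outgoing arcs.
    in_hist = Counter(in_deg[s] for s in states)
    out_hist = Counter(out_deg[s] for s in states)
    return in_hist, out_hist
```

Plotting such histograms on log-log axes is a common first check: a roughly straight line suggests a power law, whereas a Poisson distribution falls off much faster.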
In this paper, we present a novel methodology to enhance Chinese text chunking with the aid of transductive Hidden Markov Models (transductive HMMs, henceforth). We consider chunking as a special tagging problem and attempt to utilize, via a number of transformation functions, as much relevant contextual information as possible for model training. These functions enable the models to make use of contextual information to a greater extent and spare us costly changes to the original training and tagging *** of them results in an individual model with certain pros and *** a number of experiments, we succeed in integrating the best two models into a significantly better *** carry out the chunking experiments on the HIT Chinese Treebank *** results show that it is an effective approach, achieving an F score of 82.38%.
ISBN: 0262025507 (print)
The Bayesian paradigm provides a natural and effective means of exploiting prior knowledge concerning the time-frequency structure of sound signals such as speech and music, something which has often been overlooked in traditional audio signal processing approaches. Here, after constructing a Bayesian model and prior distributions capable of taking into account the time-frequency characteristics of typical audio waveforms, we apply Markov chain Monte Carlo methods in order to sample from the resultant posterior distribution of interest. We present speech enhancement results which compare favourably in objective terms with standard time-varying filtering techniques (and in several cases yield superior performance, both objectively and subjectively); moreover, in contrast to such methods, our results are obtained without an assumption of prior knowledge of the noise power.
ISBN: 0780379020 (print)
In this paper, two methods of constructing a Chinese-English bilingual phone inventory are proposed and *** research focuses on a robust, suitable and compact phone combination of the two utterly different *** first method is to combine Chinese phonemes and English phonemes *** can provide the required consistency with the western *** second method is to combine Chinese INITIALs and FINALs (IFs) with English phonemes in the bilingual acoustic *** results show that the first method is more compact and flexible in acoustic modeling than the second *** the performance decreases significantly, by about 1.9% and 3.8% in the Chinese and English test *** the contrary, the second method achieves higher word accuracy than the ***'s performance degrades by only 0.3% and 2.2% for the two languages, but with more parameters included in acoustic *** issues of building this bilingual speech recognizer are also addressed.
ISBN: 0780379020 (print)
This paper presents our style-specific language model adaptation method for Korean conversational speech *** with the written text corpora, conversational speech shows different characteristics of content and style, such as filled pauses, word omission, and contraction, which are related to function words and depend on preceding or following words in Korean spontaneous *** obtaining sufficient data for training a language model is often difficult in a conversational domain, language model adaptation with large out-of-domain data is useful. For style-specific language model adaptation, first, we estimate an in-domain dependent n-gram model by relevance weighting of out-of-domain text data according to style and content ***, style is represented by n-gram based tf*idf similarity. Second, we train an in-domain language model including disfluency *** results show that n-gram based tf*idf similarity weighting effectively reflects style difference and that disfluencies can be used as good predictors of the neighboring words.
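The relevance weighting by n-gram based tf*idf similarity can be sketched as follows. This is an illustrative reading only: the smoothed idf formula, the use of cosine similarity, and the function names are assumptions, since the abstract does not give the exact definitions.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf_vectors(docs, n=2):
    """Build one sparse tf*idf vector (a dict) per document over its n-grams."""
    counts = [Counter(ngrams(doc, n)) for doc in docs]
    doc_freq = Counter(gram for c in counts for gram in c)
    num_docs = len(docs)
    # Smoothed idf (an assumption) so n-grams found everywhere keep a weight.
    return [
        {g: tf * (math.log((1 + num_docs) / (1 + doc_freq[g])) + 1.0)
         for g, tf in c.items()}
        for c in counts
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(g, 0.0) for g, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * \
        math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def relevance_weights(in_domain, out_docs, n=2):
    """Weight each out-of-domain document by its similarity to in-domain text."""
    vectors = tfidf_vectors([in_domain] + list(out_docs), n)
    return [cosine(vectors[0], vec) for vec in vectors[1:]]
```

Out-of-domain text that shares conversational n-grams with the in-domain sample receives a larger weight, and text that shares none receives zero.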
ISBN: 0780379020 (print)
In this paper, we present a general system framework for Mandarin audio-visual large vocabulary continuous speech recognition (LVCSR), which integrates visual information for better recognition performance and *** problems of audio-visual LVCSR are mainly addressed: lip tracking, visual feature extraction, and audio-visual ***, the linear transform based lip tracking and low-level visual feature extraction methods are presented in comparison with the lip contour based feature ***, the audio-visual fusion strategy based on the multi-stream hidden Markov model (MSHMM) is investigated, and a novel approach is presented for training global or state-dependent stream weights using minimum classification error (MCE) *** is shown by experimental results that, with the visual information introduced, the word error rate (WER) of the LVCSR system is reduced by a relative 36.09% in the case of clean audio, and the system robustness is also enhanced significantly in noisy environments.
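For readers unfamiliar with stream weights, the sketch below shows the usual way a multi-stream HMM combines per-stream state likelihoods: a weighted sum in the log domain. The constraint that the two weights sum to one and the function name are assumptions for illustration; the abstract does not state the exact formulation or how MCE training updates the weights.

```python
def combined_log_likelihood(log_b_audio, log_b_visual, audio_weight):
    """Weighted log-domain fusion of audio and visual state likelihoods.

    audio_weight: stream weight in [0, 1]; the visual stream receives
                  1 - audio_weight (a common convention, assumed here).
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    visual_weight = 1.0 - audio_weight
    return audio_weight * log_b_audio + visual_weight * log_b_visual
```

A global weight uses one `audio_weight` for every state, whereas state-dependent weights would look up a separate value per HMM state before calling the function.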
ISBN: 0780379020 (print)
In human-machine interactions, an effective way to improve the accuracy of semantic analysis is to deduce the users' intentions by setting expectations. This paper mainly illustrates how to extract common structures and processing methods of the Expectation Model (EM) from scene-specific expectation setting. We then put forth algorithms for building and applying the EM in Spoken Dialogue Systems (SDSs). By analyzing the characteristics of the system tasks' structure, our algorithms can be used to generate appropriate *** incorporated into the dialogue context, the EM can help create a more reasonable dialogue situation, which endows the system with the preliminary ability to deduce users' intentions by reference to this situation, so as to improve the robustness of semantic analysis and the transaction success rate.
ISBN: 0889863806 (print)
Many attempts have been made to reduce the burden and difficulty of CG (Computer Graphics) movie production. One of the biggest advances is an automatic control approach driven by commands or language, instead of traditional direct manipulation. In this paper, we propose methods and tools for automatic and directorial camera control, which are invoked by a text script. This approach is based on a knowledge base of cinematographic rules, which are derived from movie analysis. The algorithm is being implemented in the DMP (Digital Movie Producer) system, which aims at visualizing scripts written in simple natural language.