This paper investigates the pronunciation errors commonly made by German learners of Mandarin through a perceptual evaluation experiment. Mandarin speech from 19 learners was collected in Berlin and then perceptually annotated and transcribed by 12 native speakers in Taiwan. The annotation results, especially the foreign-accent strength and intelligibility scores, are analyzed to revise a set of error hypotheses predicted by linguists. We expect the revised hypotheses to support a larger-scale database collection planned for the next three years.
This paper examines the effectiveness of acoustic features for the automatic classification of prosodic word boundaries in a Mandarin Text-to-Speech (TTS) database. The features examined include F0, energy, duration, and phonetic information. A Classification and Regression Tree (CART) is employed as the classifier. Experimental results show that the method achieves up to 80.3% accuracy in prosodic word boundary detection using the acoustic and phonetic information.
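As a rough illustration of the kind of decision a CART makes on such features, the sketch below searches for the single binary split that minimizes weighted Gini impurity. It is not the authors' implementation: the feature layout (F0 reset, energy drop, final-syllable duration), the function names, and the toy values are all assumptions made up for the example.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (1 = boundary, 0 = no boundary)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best binary split.

    Candidate thresholds are midpoints between consecutive distinct values,
    which is the usual CART choice for numeric features.
    """
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[f] <= thr]
            right = [y for r, y in zip(rows, labels) if r[f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, thr, score)
    return best


# Invented toy data, NOT from the paper.
# Columns (assumed): F0 reset, energy drop, final-syllable duration in seconds.
toy_rows = [
    [0.1, 0.2, 0.12],
    [0.3, 0.4, 0.14],
    [0.2, 0.1, 0.25],
    [0.4, 0.3, 0.28],
]
toy_labels = [0, 0, 1, 1]

feature, threshold, impurity = best_split(toy_rows, toy_labels)
```

A full CART applies this search recursively to each resulting subset and then prunes the tree; categorical phonetic information would need subset-style splits rather than numeric thresholds.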
The talk attempts to make descriptions of emotion-related states available in application-oriented technological contexts, especially in conversational speech corpora. Emotions have several components, e.g. a subjective component (feelings), a cognitive component (appraisals), a physiological component, a behavioral component (action tendencies), and an expressive component. The talk presents an annotation method that describes emotional conversational speech with rich emotional properties, including emotion categories, emotion dimensions, multiple and/or complex emotions, emotion intensity, emotion regulation, timing of emotions, action tendencies, modality, and other related information. Some samples will be used to demonstrate how the labeling is done.
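One way to picture an annotation record carrying the properties listed above is the small data structure below. It is only a sketch: the class name, field names, types, and defaults are assumptions for illustration, not the labeling scheme presented in the talk.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class EmotionAnnotation:
    """Hypothetical record for one labeled stretch of conversational speech."""
    start_s: float                      # timing of the emotion: segment start (seconds)
    end_s: float                        # timing of the emotion: segment end (seconds)
    categories: List[str]               # one or more emotion categories
    dimensions: Dict[str, float] = field(default_factory=dict)  # e.g. arousal, valence
    intensity: Optional[float] = None   # emotion intensity, scale left open
    regulation: Optional[str] = None    # emotion regulation, e.g. "suppressed"
    action_tendency: Optional[str] = None
    modality: str = "speech"

    def is_complex(self) -> bool:
        """True when more than one category is attached (multiple/complex emotion)."""
        return len(self.categories) > 1

    def duration_s(self) -> float:
        """Length of the labeled segment in seconds."""
        return self.end_s - self.start_s
```

Keeping categories as a list rather than a single label lets the same record hold both simple and multiple or complex emotions.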
This paper proposes a novel multi-feature fusion approach using a Multi-GMM supervector and a Support Vector Machine (SVM) for text-independent speaker verification. Within the UBM-MAP framework, a variable number of feature vectors (MFCC, LPCC) can be transformed into a single fixed-length vector (the GMM supervector). Concatenating the GMM supervectors from different features forms a new Multi-GMM supervector that serves as input to the SVM. Experiments on text-independent speaker verification with the NIST'04 10sec-10sec female data showed that MFCC and LPCC can be fused successfully at the feature level.
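The mapping from a variable number of frames to a fixed-length vector can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes diagonal-covariance GMMs, adapts only the means with the standard relevance-factor MAP update, and omits any supervector normalization an SVM kernel might require. All function names and the relevance value of 16 are choices made for the example.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Mean-only relevance MAP adaptation of a UBM to one utterance."""
    n_comp, dim = len(weights), len(means[0])
    counts = [0.0] * n_comp                      # soft occupation counts
    sums = [[0.0] * dim for _ in range(n_comp)]  # first-order statistics
    for x in frames:
        logp = [math.log(w) + log_gauss_diag(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logp)                          # subtract max for stability
        post = [math.exp(lp - top) for lp in logp]
        norm = sum(post)
        for k in range(n_comp):
            gamma = post[k] / norm
            counts[k] += gamma
            for d in range(dim):
                sums[k][d] += gamma * x[d]
    adapted = []
    for k in range(n_comp):
        alpha = counts[k] / (counts[k] + relevance)
        adapted.append([
            alpha * (sums[k][d] / counts[k]) + (1.0 - alpha) * means[k][d]
            if counts[k] > 0 else means[k][d]
            for d in range(dim)
        ])
    return adapted


def supervector(adapted_means):
    """Stack the adapted component means into one fixed-length vector."""
    return [v for mean in adapted_means for v in mean]


def multi_gmm_supervector(per_feature_supervectors):
    """Concatenate supervectors from different feature streams (e.g. MFCC, LPCC)."""
    return [v for sv in per_feature_supervectors for v in sv]
```

Each feature stream would have its own UBM; the utterance length affects only the accumulated statistics, so the resulting vector always has (number of components) x (feature dimension) entries per stream.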
This paper describes a project to create a Mandarin speech corpus. The database is intended for building a high-performance ASR system for car navigation. It includes 100 speakers and four-channel speech data covering various digit strings, sentences, and spontaneous queries, for a total of 14,500 sentences occupying 10 gigabytes of disk space.
A long-term dream of humankind is to develop a machine that can communicate through speech as we do. Reviewing the history of speech technology, we can see that its progress has always accompanied progress in understanding the human speech mechanism. This talk will demonstrate the relation between the development of speech technology and the progress in understanding the human speech mechanism. The speaker will show some gaps that exist between speech technology and speech science. Finally, the speaker would like to share some questions with the audience: how well do we know our own capabilities in speech processing, and how far can fundamental findings on human mechanisms guide the development of novel speech processing techniques?