ISBN: 9781424414833 (print)
This paper discusses vowel pronunciation quality assessment in our computer-assisted Mandarin Chinese learning system. Under the speech recognition framework, phonetic pronunciation assessment is usually based on the phonetic posterior probability score, which may be computed by normalizing the frame-based posterior probability or calculated directly on the phone segment. With the first method, we achieve a human-machine scoring correlation coefficient (CC) of 0.832 for vowels; with the second, the CC reaches 0.847. To improve performance, we suggest employing the formant features of vowels. This paper proposes a novel method for using formants: we plot the formant candidates of each frame on the time-frequency plane to form a bitmap, and then extract its Gabor features for pattern classification. Using the classification probability score for pronunciation assessment yields a CC of 0.842. Finally, we combine the three scores with various linear and nonlinear methods; the best CC, 0.913, is obtained with a neural network.
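The abstract does not give formulas for the two posterior scores, so the following is only a minimal sketch of one common reading, not the authors' implementation. It assumes a hypothetical input `frame_loglik` (one row per frame, one log-likelihood per candidate phone) and uniform phone priors. The frame-based score averages per-frame log posteriors, while the segment-based score computes a single posterior from the summed log-likelihoods of the whole segment.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def frame_based_score(frame_loglik, target):
    """Average over frames of the per-frame log posterior of the target phone.

    Assumes uniform phone priors, so each frame's posterior is a softmax
    over that frame's log-likelihoods.
    """
    total = 0.0
    for row in frame_loglik:
        total += row[target] - log_sum_exp(row)
    return total / len(frame_loglik)


def segment_based_score(frame_loglik, target):
    """Log posterior of the target phone computed on the whole segment.

    Log-likelihoods are summed over the segment first, then normalized
    across phones; the result is divided by the number of frames so that
    segments of different lengths are comparable.
    """
    n_phones = len(frame_loglik[0])
    seg = [sum(row[q] for row in frame_loglik) for q in range(n_phones)]
    return (seg[target] - log_sum_exp(seg)) / len(frame_loglik)


# Hypothetical two-frame segment with two candidate phones.
example = [[0.0, -1.0], [-1.0, 0.0]]
frame_score = frame_based_score(example, 0)
segment_score = segment_based_score(example, 0)
```

Both functions return a value at or below zero, and they coincide for a one-frame segment. They diverge on longer segments because the frame-based variant normalizes before averaging, whereas the segment-based variant normalizes once over the whole segment.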
In this paper, we describe the methodology for collecting and annotating a new database designed for conducting research and development on pronunciation assessment. While a significant amount of research has been done in the area of pronunciation assessment, to our knowledge, no database is available for public use for research in the field. Considering this need, we created EpaDB (English Pronunciation by Argentinians Database), which is composed of English phrases read by native Spanish speakers with different levels of English proficiency. The recordings are annotated with ratings of pronunciation quality at phrase-level and detailed phonetic alignments and transcriptions indicating which phones were actually pronounced by the speakers. We present inter-rater agreement, the effect of each phone on overall perceived non-nativeness, and the frequency of specific pronunciation errors.
Traditional approaches to teaching a foreign language in Artificial Intelligence (AI) systems restrict the computer to correcting grammar or vocabulary; this is similar to structural teaching strategies (STS), in which a second language (L2) is learned in relation to a first language (L1). By contrast, this paper suggests that the role of a computer tutor in language learning need not be confined to that of a foreign language expert. It investigates ways to represent both the learner's L2 skills and the foreign language expert. These lead to a novel computational method for teaching a foreign language, suitable for children aged 7 to 15.
ISBN: 9781728176055 (print)
Sentence-level pronunciation assessment is important for computer-assisted language learning (CALL). Traditional speech pronunciation assessment, based on the Goodness of Pronunciation (GOP) algorithm, has two weaknesses in assessing a speech utterance: 1) phoneme GOP scores cannot be easily translated into a sentence score with a simple average for effective assessment; 2) the rank ordering information has not been well exploited in GOP scoring to deliver a robust assessment that correlates well with a human rater's evaluations. In this paper, we propose two new statistical features, average GOP (aGOP) and confusion GOP (cGOP), and use them to train a binary classifier in Ordinal Regression with Anchored Reference Samples (ORARS). When the proposed approach is tested on the Microsoft mTutor ESL Dataset, a relative improvement of 26.9% in the Pearson correlation coefficient is obtained over the conventional GOP-based approach. The performance is at human parity or better.
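For orientation, the conventional baseline that the abstract criticizes can be sketched as follows. This is a simplified illustration of the widely used GOP definition (absolute log ratio between the target phone's segment likelihood and the best competing phone's, normalized by frame count) followed by the simple sentence-level average. It assumes a hypothetical `frame_loglik` input and fixed segment boundaries. The aGOP and cGOP features and the ORARS classifier are not defined in the abstract and are not reproduced here.

```python
def gop(frame_loglik, target):
    """Simplified Goodness of Pronunciation for one phone segment.

    frame_loglik: one row per frame, one log-likelihood per candidate phone.
    Returns |log p(O|target) - max_q log p(O|q)| / number_of_frames,
    so 0.0 means the target phone is the best-matching phone.
    """
    n_frames = len(frame_loglik)
    n_phones = len(frame_loglik[0])
    seg = [sum(row[q] for row in frame_loglik) for q in range(n_phones)]
    return abs(seg[target] - max(seg)) / n_frames


def sentence_score_simple_average(phone_gops):
    """Baseline sentence score: unweighted mean of the phone-level GOPs."""
    return sum(phone_gops) / len(phone_gops)


# Hypothetical segment where phone 0 fits better than phone 1.
segment = [[0.0, -1.0], [0.0, -1.0]]
gop_correct = gop(segment, 0)
gop_wrong = gop(segment, 1)
sentence = sentence_score_simple_average([gop_correct, gop_wrong])
```

The unweighted mean treats every phone identically and discards how each phone ranks against its competitors, which matches the two weaknesses listed in the abstract.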
ISBN: 9783540748175 (print)
This paper discusses Mandarin vowel pronunciation quality assessment. Phonetic pronunciation quality is traditionally evaluated under the speech recognition framework by the phonetic posterior probability score, which may be computed by normalizing the frame-based posterior probability or calculated directly on the phone segment. With the first method, we achieve a human-machine scoring correlation coefficient (CC) of 0.832 for vowels; with the second, the CC reaches 0.847. This paper proposes a novel kind of formant feature and applies it to the evaluation of vowels: we transform the formant plots on the time-frequency plane into a bitmap and extract its Gabor features for pattern classification. Using the classification probability for pronunciation assessment yields a CC of 0.842. Finally, we combine the three scores with various linear and nonlinear methods; the best CC, 0.913, is obtained with a neural network.
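The CC values quoted above measure agreement between human and machine scores. The abstract does not say which correlation coefficient is used; assuming it is the Pearson coefficient, a minimal sketch looks like this. The score lists are invented purely for illustration.

```python
import math


def pearson_cc(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical human ratings and machine scores for five vowels.
human = [1.0, 2.0, 3.0, 4.0, 5.0]
machine = [2.0, 4.0, 6.0, 8.0, 10.0]
cc = pearson_cc(human, machine)
```

A value near 1 means the machine ranks and spaces pronunciations much as the human raters do, which is the sense in which 0.913 improves on 0.847.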
ISBN: 9781586037970 (print)
This paper proposes a PDA-based language learning system for Japanese polite expressions. To support foreigners learning Japanese polite expressions, we previously implemented a PDA-based system called the Japanese polite expressions learning assisting system (JAPELAS) [11]. JAPELAS is a one-to-one system, and it works without any input of context information. Building on JAPELAS, we now propose the architecture of a new one-to-many system called JAPELAS2, in which the learner can interact with many people in the same situation. Learners can use JAPELAS2 to review the history records of what they have learned.
ISBN: 9781728134857 (print)
By integrating the linguistic and computer science perspectives on corrective feedback, this paper examines the forms of feedback present in current CALL software. It also expounds on the similarities and differences in how feedback is identified from these two perspectives.
The growing demand for learning English as a second language has increased interest in automatic approaches for assessing and improving spoken language proficiency. A significant challenge in this field is to provide interpretable scores and informative feedback to learners through individual viewpoints of learners' proficiency, as opposed to holistic scores. Thus far, holistic scoring remains commonly applied in large-scale commercial tests. As a result, an issue with more detailed evaluation is that human graders are generally trained to provide holistic scores. This paper investigates whether view-specific systems can be trained when only holistic scores are available. To enable this process, view-specific networks are defined where both their inputs and structure are adapted to focus on specific facets of proficiency. It is shown that it is possible to train such systems on holistic scores, such that they provide view-specific scores at evaluation time. View-specific networks are designed in this way for pronunciation, rhythm, text, use of parts of speech and grammatical accuracy. The relationships between the predictions of each system are investigated on the spoken part of the Linguaskill proficiency test. It is shown that the view-specific predictions are complementary in nature and capture different information about proficiency.
Studies on the effectiveness of online language teaching have generally centered on basic or intermediate language courses. The present study examines the effectiveness of an advanced-level online Spanish grammar course. Two sections of the course are compared: one is offered face-to-face, and the other is offered fully online. The goals are both to measure students' achievement in the two sections, and to better understand specific challenges faced by online teaching. The study shows that there was significant improvement (learning) in the online section, and that learning is indeed comparable to that shown in the face-to-face section. However, we identify and discuss one specific challenge faced by an online format: the different nature of the interaction between the learner and the learning environment.
Los estudios sobre la enseñanza de lenguas en línea generalmente se han enfocado en cursos de nivel básico o intermedio. El presente trabajo examina la efectividad de un curso avanzado de gramática española. Se comparan dos secciones del mismo curso: una ofrecida en línea y la otra presencial. El objetivo es medir el rendimiento de los estudiantes y entender mejor factores específicos que pueden afectar su aprendizaje en línea. El estudio muestra que hay un aprendizaje significativo en la sección en línea, y que el aprendizaje es comparable al experimentado por los estudiantes en la sección presencial. Identificamos también un reto específico que enfrentan los estudiantes en línea: la diferente naturaleza de la interacción entre el estudiante y su entorno de aprendizaje.
ISBN: 9781510848764 (print)
We present a spoken dialog-based framework for the computer-assisted language learning (CALL) of conversational English. In particular, we leveraged the open-source HALEF dialog framework to develop a job interview conversational application. We then used crowdsourcing to collect multiple interactions with the system from non-native English speakers. We analyzed human-rated scores of the recorded dialog data on three scoring dimensions critical to the delivery of conversational English - fluency, pronunciation and intonation/stress - and further examined the efficacy of automatically extracted, hand-curated speech features in predicting each of these sub-scores. Machine learning experiments showed that trained scoring models generally perform on par with the human inter-rater agreement baseline in predicting human-rated scores of conversational proficiency.