ISBN: (print) 9783319943619; 9783319943602
Computer-assisted language learning (CALL) has attracted increasing interest in language teaching and learning. In a computer-supported learning environment, both pronunciation correction and expression modulation are recognized as essential for contemporary learners. However, while mispronunciation detection and diagnosis (MDD) technologies have achieved significant success, speech expression evaluation still relies on expensive and resource-consuming manual assessment. In this paper, we propose a novel multi-modal, multi-scale neural-network-based approach for automatic speech expression evaluation in CALL. In particular, a multi-modal sparse autoencoder (MSAE) is first employed to make full use of both lexical and acoustic features; a recurrent autoencoder (RAE) is then employed to produce features at different time scales; and an attention-based multi-scale bidirectional long short-term memory (BLSTM) model is finally employed to score the speech expression. Experimental results on data collected from a realistic airline broadcast evaluation demonstrate the effectiveness of the proposed approach, which achieves human-level predictive ability with an acceptable rate of 70.4%.
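The multi-scale, attention-based scoring step can be illustrated with a minimal sketch. This is not the paper's model: plain frame averaging stands in for the RAE, a fixed query vector stands in for learned attention parameters, and no BLSTM is involved. The function names (`downsample`, `attention_pool`, `score_expression`), the scale factors, and the output weights are all hypothetical, chosen only to show how features pooled at several time scales could be combined into one score.

```python
import math
from typing import List, Sequence

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def downsample(states: Sequence[Vector], factor: int) -> List[Vector]:
    """Average every `factor` consecutive frames (assumed stand-in for the RAE)."""
    coarse = []
    for start in range(0, len(states), factor):
        chunk = states[start:start + factor]
        coarse.append([sum(column) / len(chunk) for column in zip(*chunk)])
    return coarse


def attention_pool(states: Sequence[Vector], query: Vector) -> Vector:
    """Weight each frame by softmax(dot(frame, query)) and sum the frames."""
    scores = [sum(h * q for h, q in zip(state, query)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    return [sum(w * state[d] for w, state in zip(weights, states))
            for d in range(dim)]


def score_expression(states: Sequence[Vector], query: Vector,
                     out_weights: Sequence[float],
                     scales: Sequence[int] = (1, 2, 4)) -> float:
    """Concatenate attention-pooled features from each scale, then apply a linear scorer."""
    pooled: Vector = []
    for factor in scales:
        pooled.extend(attention_pool(downsample(states, factor), query))
    return sum(w * x for w, x in zip(out_weights, pooled))
```

With a zero query vector the attention weights are uniform, so each scale reduces to a plain mean over its frames; a non-zero query shifts weight toward frames that align with it.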