ISBN: 9781479942190 (print)
Articulatory features (AFs) are viewed as universal speech attributes for cross-language speech recognition. They are usually detected using a bank of multi-layer perceptrons (MLPs) in a supervised manner. In this paper, we propose applying a deep learning method to detect AF-based speech attributes in a semi-supervised manner for cross-language speech recognition. Experimental results on Tibetan phone recognition show that the deep learning method detects AF-based speech attributes more accurately and achieves higher phone recognition rates than MLPs.
ISBN: 9781618392701 (print)
Cross-language speech recognition often assumes a certain amount of knowledge about the target language. However, there are hundreds of languages for which not even the phoneme inventory is known. In the work reported here, phone recognisers are evaluated on a cross-language task with minimal knowledge of the target language. A phonetic distance measure is introduced for the evaluation, allowing a distance to be calculated between utterances of any language. This has a number of spin-off applications, such as allophone detection, a phone-based ROVER approach to recognition, and cross-language forced alignment. Results show that some of these novel approaches will be of immediate use in characterising languages for which there is little phonological knowledge.
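The abstract does not say how the phonetic distance is computed. Purely as an illustration, one plausible form is an edit distance over phone sequences in which the substitution cost depends on how many articulatory features two phones share. The feature table, the function names `phone_cost` and `phonetic_distance`, and the insertion/deletion cost below are all assumptions made for this sketch, not the authors' definitions.

```python
# Hypothetical articulatory feature table (illustrative only, not from the paper).
FEATURES = {
    "p": {"bilabial", "stop", "voiceless"},
    "b": {"bilabial", "stop", "voiced"},
    "t": {"alveolar", "stop", "voiceless"},
    "d": {"alveolar", "stop", "voiced"},
    "s": {"alveolar", "fricative", "voiceless"},
    "a": {"open", "vowel", "voiced"},
    "i": {"close", "vowel", "voiced"},
}


def phone_cost(a, b):
    """Substitution cost in [0, 1]: Jaccard distance between feature sets."""
    if a == b:
        return 0.0
    fa, fb = FEATURES[a], FEATURES[b]
    return 1.0 - len(fa & fb) / len(fa | fb)


def phonetic_distance(ref, hyp, indel=1.0):
    """Feature-weighted Levenshtein distance between two phone sequences.

    `indel` is the assumed cost of inserting or deleting one phone.
    """
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = [j * indel for j in range(len(hyp) + 1)]
    for i, r in enumerate(ref, start=1):
        curr = [i * indel]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + indel,                 # delete r from ref
                curr[j - 1] + indel,             # insert h into ref
                prev[j - 1] + phone_cost(r, h),  # substitute r with h (or match)
            ))
        prev = curr
    return prev[-1]
```

Because the cost is defined on features rather than on a language-specific phoneme inventory, a measure of this kind could in principle compare transcriptions from different languages, which is the property the abstract emphasises. Under this toy table, `phonetic_distance(["p", "a"], ["b", "a"])` is smaller than the distance between `["p", "a"]` and `["s", "a"]`, since /p/ and /b/ differ only in voicing.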