Word recognition testing may be defined as a procedure that assesses a listener’s ability to identify one-syllable words (such as phonetically balanced, or PB, words) presented at a given suprathreshold level, yielding a word recognition score. For Thai, the Thammasat University and Ramathibodi Hospital Phonetically Balanced Word Lists 2015 (TU-RAMA PB’15) were created as five lists of 25 monosyllabic words each. Their phoneme distributions are based on large-scale Thai spoken corpora and, in line with TU PB’14, they emphasize phonetic balance, symmetrical phoneme occurrence, and word familiarity. To evaluate the lists' homogeneity in terms of intelligibility, they were recorded and presented to 10 normal-hearing participants at levels from 0 to 50 dB HL, in ascending 2 dB steps, until the participants repeated the words correctly. Using logistic regression, regression slopes and intercepts were calculated to estimate the percentage of correct responses at any given intensity and to construct a psychometric function for every list. The derived psychometric function slopes ranged from 0.2015 to 0.2262, while the intensities required for 50% intelligibility ranged from 17.0876 to 20.8856 dB HL. A two-way chi-square analysis of both parameters indicated no significant difference among the five lists.
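As a rough illustration of how a fitted slope and intercept give both the percent correct at any level and the 50% point, the sketch below assumes a standard two-parameter logistic function. The coefficients are hypothetical, chosen only for illustration, and the abstract does not say whether its reported slopes are logistic coefficients or slopes in percent per dB.

```python
import math

def percent_correct(level_db, slope, intercept):
    """Predicted percent correct at a presentation level (dB HL),
    using a two-parameter logistic psychometric function."""
    return 100.0 / (1.0 + math.exp(-(intercept + slope * level_db)))

def level_for_percent(target_percent, slope, intercept):
    """Invert the logistic function: the level (dB HL) at which
    target_percent correct is predicted."""
    p = target_percent / 100.0
    return (math.log(p / (1.0 - p)) - intercept) / slope

# Hypothetical coefficients (not taken from the study).
slope = 0.21        # logistic coefficient per dB
intercept = -3.99   # chosen so that the 50% point falls at 19 dB HL

threshold_50 = level_for_percent(50.0, slope, intercept)
print(f"50% intelligibility at about {threshold_50:.2f} dB HL")
print(f"Predicted score at 30 dB HL: {percent_correct(30.0, slope, intercept):.1f}%")
```

At the 50% point the log-odds term is zero, so the threshold reduces to `-intercept / slope`. Comparing lists then amounts to comparing these two fitted parameters.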
We propose a simple and effective method to build a meta-level Statistical Machine Translation (SMT), called meta-SMT, for system combination. Our approach is based on the framework of Stacked Generalization, also kno...
Parallel text is the fuel that drives modern machine translation systems. The Web is a comprehensive source of preexisting parallel text, but crawling the entire web is impossible for all but the largest companies. We...
ISBN: 9781479903573 (print)
In this paper, we present a unified search strategy for open-vocabulary handwriting recognition using weighted finite state transducers. In addition to a standard word-level language model, we introduce a separate n-gram character-level language model for out-of-vocabulary word detection and recognition. The probabilities assigned by these two models are combined into one Bayes decision rule. We evaluate the proposed method on the IAM database of English handwriting. Compared with the closed-vocabulary scenario and the best published result, the word error rate improves from 22.2% to 17.3%.
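One common way to combine a word-level and a character-level language model is to let the word model emit an unknown-word token and have the character model spell out the actual string. The sketch below assumes that scheme, a character bigram with add-one smoothing, and a toy vocabulary. It is not the paper's transducer-based search, and every name and probability in it is hypothetical.

```python
import math
from collections import Counter

ALPHABET_SIZE = 27  # assumption: 26 lowercase letters plus the end marker

def train_char_bigrams(words):
    """Count character bigrams, with start and end markers around each word."""
    bigrams, contexts = Counter(), Counter()
    for word in words:
        seq = ["<s>"] + list(word) + ["</s>"]
        for prev, cur in zip(seq, seq[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts

def char_logprob(word, bigrams, contexts):
    """Add-one smoothed character bigram log-probability of a word."""
    seq = ["<s>"] + list(word) + ["</s>"]
    logp = 0.0
    for prev, cur in zip(seq, seq[1:]):
        logp += math.log((bigrams[(prev, cur)] + 1) /
                         (contexts[prev] + ALPHABET_SIZE))
    return logp

def word_logprob(word, word_probs, bigrams, contexts):
    """In-vocabulary words use the word model alone; other words use
    the unknown-word probability times the character model probability."""
    if word in word_probs:
        return math.log(word_probs[word])
    return math.log(word_probs["<unk>"]) + char_logprob(word, bigrams, contexts)

# Toy unigram word model (hypothetical probabilities that sum to 1).
word_probs = {"the": 0.5, "cat": 0.3, "<unk>": 0.2}
bigrams, contexts = train_char_bigrams(["the", "cat", "cab", "hat"])

in_vocab_score = word_logprob("cat", word_probs, bigrams, contexts)
oov_score = word_logprob("cab", word_probs, bigrams, contexts)
print(in_vocab_score, oov_score)
```

Because both branches return log-probabilities on the same scale, in-vocabulary and out-of-vocabulary hypotheses can compete in a single decision rule.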
ISBN: 9781479903573 (print)
This paper investigates the combination of different short-term features and the combination of recurrent and non-recurrent neural networks (NNs) on a Spanish speech recognition task. Several methods exist for combining different feature sets, such as concatenation or linear discriminant analysis (LDA). Even though all these techniques achieve reasonable improvements, feature combination by multi-layer perceptrons (MLPs) outperforms all known approaches. We develop the concept of MLP-based feature combination further using recurrent neural networks (RNNs). The phoneme posterior estimates derived from an RNN lead to a significant improvement over the result of the MLPs and achieve a 5% relative improvement in word error rate (WER) with far fewer parameters. Moreover, we improve the system performance further by combining an MLP and an RNN in a hierarchical framework, in which the MLP benefits from the preprocessing of the RNN. All NNs are trained on phonemes; nevertheless, the same concepts could be applied using context-dependent states. In addition to the improvements in recognition performance w.r.t. WER, NN-based feature combination methods reduce both the training and the testing complexity. Overall, the systems are based on a single set of acoustic models, together with the training of different NNs.
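For readers unfamiliar with the term, a relative WER improvement is the absolute reduction divided by the baseline WER. The sketch below uses hypothetical WER values, since the abstract reports only the relative figure.

```python
def relative_wer_reduction(baseline_wer, new_wer):
    """Relative WER reduction in percent: the absolute reduction
    divided by the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer

# Hypothetical values: a drop from 20.0% to 19.0% WER is 1.0 point
# absolute, which corresponds to a 5% relative reduction.
example_reduction = relative_wer_reduction(20.0, 19.0)
print(f"{example_reduction:.1f}% relative")
```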
GMM-UBM-based speaker verification relies heavily on a well-trained UBM. In practice, it is often not easy to obtain a UBM that fully matches the acoustic channels in operation. To solve this problem, we propose a novel ...
Educators and researchers have long recognized the importance of formative feedback for learning. Formative feedback helps learners understand where they are in a learning process, what the goal is, and how to reach t...
We proposed a method for constructing constant-weight and multi-valued sequences from cyclic difference sets by generalizing the method for the binary case proposed by N. Li, X. Zeng and L. Hu in 2008. In this pap...
Egyptian Arabic (EA) is a colloquial version of Arabic. It is a low-resource morphologically rich language that causes problems in Large Vocabulary Continuous Speech Recognition (LVCSR). Building LMs on morpheme level...
Many past studies have shown that Internet addiction negatively affects interpersonal relationships. However, new functions on the Internet provide more online interactions, especially some social websites such as Fac...