作者:
Bojar, OndřejWu, DekaiCharles University in Prague
Faculty of Mathematics and Physics Institute of Formal and Applied Linguistics Czech Republic HKUST
Human Language Technology Center Department of Computer Science and Engineering Hong Kong University of Science and Technology Hong Kong
HMEANT (Lo and Wu, 2011a) is a manual MT evaluation technique that focuses on predicate-argument structure of the sentence. We relate HMEANT to an established linguistic theory, highlighting the possibilities of reusi...
Mobile phone users in rural parts of the developing world, especially Africa, adapt to lack of electricity, poverty, remote locations, unpredictable services, and second-hand technology. Meanwhile, the technology deve...
详细信息
In this work we present two extensions to the well-known dynamic programming beam search in phrase-based statistical machine translation (SMT), aiming at increased efficiency of decoding by minimizing the number of la...
详细信息
We present an unsupervised approach to estimate the appropriate degree of contribution of each semantic role type for semantic translation evaluation, yielding a semantic MT evaluation metric whose correlation with hu...
详细信息
ISBN:
(纸本)9781627483452
We present an unsupervised approach to estimate the appropriate degree of contribution of each semantic role type for semantic translation evaluation, yielding a semantic MT evaluation metric whose correlation with human adequacy judgments is comparable to that of recent supervised approaches but without the high cost of a human-ranked training corpus. Our new unsupervised estimation approach is motivated by an analysis showing that the weights learned from supervised training are distributed in a similar fashion to the relative frequencies of the semantic roles. Empirical results show that even without a training corpus of human adequacy rankings against which to optimize correlation, using instead our relative frequency weighting scheme to approximate the importance of each semantic role type leads to a semantic MT evaluation metric that correlates comparable with human adequacy judgments to previous metrics that require far more expensive human rankings of adequacy over a training corpus. As a result, the cost of semantic MT evaluation is greatly reduced.
A major challenge for Arabic Large Vocabulary Continuous Speech Recognition (LVCSR) is the rich morphology of Arabic, which leads to high Out-of-vocabulary (OOV) rates, and poor language Model (LM) probabilities. In s...
详细信息
A major challenge for Arabic Large Vocabulary Continuous Speech Recognition (LVCSR) is the rich morphology of Arabic, which leads to high Out-of-vocabulary (OOV) rates, and poor language Model (LM) probabilities. In such cases, the use of morphemes rather than full-words is considered a better choice for LMs. Thereby, higher lexical coverage and less LM perplexities are achieved. On the other side, an effective way to increase the robustness of LMs is to incorporate features of words into LMs. In this paper, we investigate the use of features derived for morphemes rather than words. Thus, we combine the benefits of both morpheme level and feature rich modeling. We compare the performance of stream-based, class-based and Factored LMs (FLMs) estimated over sequences of morphemes and their features for performing Arabic LVCSR. A relative reduction of 3.9% in Word Error Rate (WER) is achieved compared to a word-based system.
A part-tone decomposition of voiced sections of speech is introduced, which is adapted with high accuracy to the frequency of the glottal oscillator of the speaker. The iterative replacement of the center filter frequ...
详细信息
A part-tone decomposition of voiced sections of speech is introduced, which is adapted with high accuracy to the frequency of the glottal oscillator of the speaker. The iterative replacement of the center filter frequency contours (chosen locally as linear chirp) of the non-stationary bandpass filters converges extremely fast and leads to the extraction of filter-stable part-tones with uncorrupted phases. In contrast to phases of frequency decomposition with a priori defined, constant filter frequencies, the phase differences of filter-stable part-tones promise to become a useful supplement of the amplitude based acoustic features used for conventional automatic speech recognition. The derived phase features are tested in vowel classification experiments based on the phonetically rich TIMIT database.
Multi Layer Perceptron (MLP) features extracted from different types of critical band energies (CRBE) - derived from MFCC, GT, and PLP pipeline - are compared on French broadcast news and conversational speech recogni...
详细信息
Multi Layer Perceptron (MLP) features extracted from different types of critical band energies (CRBE) - derived from MFCC, GT, and PLP pipeline - are compared on French broadcast news and conversational speech recognition task. Though the MLP structure is kept fixed, ROVER combination of different CRBE based systems leads to 4% relative improvement. Furthermore, aiming at the combination of state-of-the-art features based on various signal analysis methods into one single stream, posterior feature space based combination technique is proposed. The speaker normalized features originated from different CRBEs are merged after additional MLP training by Dempster-Shafer rule. The performance of these posterior features unifying the different CRBE based features is superior to the best single CRBE based posterior features by 6% relative. Further results reveal that the concatenated cepstral and unified posterior features perform nearly as well as the ROVER combination of the different CRBE based systems.
Models for silence are a fundamental part of continuous speech recognition systems. Depending on application requirements, audio data segmentation, and availability of detailed training data annotations, it may be nec...
详细信息
Models for silence are a fundamental part of continuous speech recognition systems. Depending on application requirements, audio data segmentation, and availability of detailed training data annotations, it may be necessary or beneficial to differentiate between other non-speech events, for example breath and background noise. The integration of multiple non-speech models in a WFST-based dynamic network decoder is not straightforward, because these models do not perfectly fit in the transducer framework. This paper describes several options for the transducer construction with multiple non-speech models, shows their considerable different characteristics in memory and runtime efficiency, and analyzes the impact on the recognition performance.
We investigate whether the small-world topology of a functional brain network means high information processing efficiency by calculating the correlation between the small-world measures of a functional brain network ...
详细信息
We investigate whether the small-world topology of a functional brain network means high information processing efficiency by calculating the correlation between the small-world measures of a functional brain network and behavioral reaction during an imagery *** brain networks are constructed by multichannel eventrelated potential data,in which the electrodes are the nodes and the functional connectivities between them are the *** results show that the correlation between small-world measures and reaction time is task-specific,such that in global imagery,there is a positive correlation between the clustering coefficient and reaction time,while in local imagery the average path length is positively correlated with the reaction *** suggests that the efficiency of a functional brain network is task-dependent.
暂无评论