Phoneme-based acoustic modeling of large vocabulary automatic speech recognition takes advantage of phoneme context. The large number of context-dependent (CD) phonemes and their highly varying statistics require tyin...
We address for the first time unsupervised training for a translation task with hundreds of thousands of vocabulary words. We scale up the expectation-maximization (EM) algorithm to learn a large translation table wit...
This paper describes the system used by the ValenTo team in the Task 11, Sentiment Analysis of Figurative language in Twitter, at SemEval 2015. Our system used a regression model and additional external resources to a...
In this paper, we empirically investigate the impact of critical configuration parameters in the popular cube pruning algorithm for decoding in hierarchical statistical machine translation. Specifically, we study how ...
In this work, we present novel warping algorithms for full 2D pixel-grid deformations for face recognition. Due to high variation in face appearance, face recognition is considered a very difficult task, especially if...
Recently, state-of-the-art recognition accuracies for pose-invariant face recognition have been achieved by using 2D-Warping methods in a nearest-neighbor framework. However, the main drawback of these methods is their high computational complexity. In this paper, we address this issue. We use a simple and fast method to get a rough estimate of a 2D-Warping. This estimate can then be used to apply an image-dependent warp range to the 2D-Warping algorithm, to limit the possible poses, or to preselect the most likely classes. With this method, we are able to significantly reduce the runtime of a recently proposed 2D-Warping algorithm without sacrificing recognition accuracy.
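The class-preselection idea mentioned in the abstract above can be illustrated with a generic two-stage nearest-neighbor search. The following is a minimal sketch under stated assumptions, not the authors' algorithm: `cheap_distance` is a hypothetical stand-in for the fast rough estimate, `expensive_distance` is a caller-supplied placeholder for a costly matcher such as 2D-Warping, and images are assumed to be flattened lists of floats.

```python
from typing import Callable, List, Sequence, Tuple

Image = Sequence[float]  # assumed representation: flattened grayscale image


def cheap_distance(a: Image, b: Image) -> float:
    """Hypothetical fast estimate: squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def preselect_then_rerank(
    query: Image,
    gallery: List[Tuple[str, Image]],
    expensive_distance: Callable[[Image, Image], float],
    keep: int = 2,
) -> str:
    """Return the label of the best gallery match.

    Stage 1 ranks every gallery image with the cheap distance and keeps
    the `keep` best candidates. Stage 2 evaluates the expensive distance
    on that shortlist only, so its cost no longer grows with gallery size.
    """
    shortlist = sorted(
        gallery, key=lambda item: cheap_distance(query, item[1])
    )[:keep]
    best_label, _ = min(
        shortlist, key=lambda item: expensive_distance(query, item[1])
    )
    return best_label
```

With a gallery of N images, the expensive matcher runs at most `keep` times instead of N times. The trade-off is that a correct match discarded in stage 1 cannot be recovered, so `keep` has to be chosen large enough for the data at hand.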
This paper describes the RWTH system for large vocabulary Arabic handwriting recognition. The recognizer is based on Hidden Markov Models (HMMs) with state-of-the-art methods for visual/language modeling and decoding. The feature extraction is based on Recurrent Neural Networks (RNNs), which estimate the posterior distribution over the character labels for each observation. Discriminative training using the Minimum Phone Error (MPE) criterion is used to train the HMMs. Recognition is performed with n-gram language models (LMs) trained on in-domain text data. Unsupervised writer adaptation is also performed using Constrained Maximum Likelihood Linear Regression (CMLLR) feature adaptation. The RWTH Arabic handwriting recognition system gave competitive results in previous handwriting recognition competitions. The techniques used improve the performance of the system participating in the OpenHaRT 2013 evaluation.
In this paper, we propose a novel semantic cohesion model. Our model utilizes the predicate-argument structures as soft constraints and plays the role of a reordering model in the phrase-based statistical machine transl...
ASR can be improved by multi-task learning (MTL) with domain enhancing or domain adversarial training, which are two opposite objectives with the aim to increase/decrease domain variance towards domain-aware/agnostic ...
As one of the most popular sequence-to-sequence modeling approaches for speech recognition, the RNN-Transducer has achieved evolving performance with more and more sophisticated neural network models of growing size a...