This paper investigates the application of vector space models (VSMs) to the standard phrase-based machine translation pipeline. VSMs are models based on continuous word representations embedded in a vector space. We ...
This work presents two different translation models using recurrent neural networks. The first one is a word-based approach using word alignments. Second, we present phrase-based translation models that are more consi...
Manual analysis and decryption of enciphered documents is tedious and error-prone work. Often, even after spending large amounts of time on a particular cipher, no decipherment can be found. Automating the decryption ...
This paper describes the RWTH system for large vocabulary Arabic handwriting recognition. The recognizer is based on Hidden Markov Models (HMMs) with state-of-the-art methods for visual/language modeling and decoding. The feature extraction is based on Recurrent Neural Networks (RNNs), which estimate the posterior distribution over the character labels for each observation. The HMMs are trained discriminatively using the Minimum Phone Error (MPE) criterion. Recognition uses n-gram language models (LMs) trained on in-domain text data. Unsupervised writer adaptation is also performed using Constrained Maximum Likelihood Linear Regression (CMLLR) feature adaptation. The RWTH Arabic handwriting recognition system gave competitive results in previous handwriting recognition competitions. The techniques used improve the performance of the system participating in the OpenHaRT 2013 evaluation.
We present a method for training an off-line handwriting recognition system in an unsupervised manner. For an isolated word recognition task, we are able to bootstrap the system without any annotated data. We then iteratively retrain the system using the best hypothesis from the previous recognition pass. Our approach relies only on a prior language model and does not depend on an explicit segmentation of words into characters. The resulting system shows promising performance on a standard dataset compared with a system trained in a supervised fashion on the same amount of training data.
The task of fine-grained visual categorization is related to both general object recognition and specialized tasks such as face recognition. Hence, we propose to combine two methods popular for general object recognition and face recognition to build a new model-free system for fine-grained visual categorization. Specifically, we use Local Naive-Bayes Nearest Neighbor as a pre-selection method and 2D-Warping as a refinement step. For the latter, we explore different ways to use the alignments computed by a 2D-Warping algorithm for classification. We demonstrate the performance of our approach on the CUB200-2011 database and show that it achieves higher recognition accuracy than current state-of-the-art methods.
We propose a state-of-the-art system for recognizing real-world handwritten images exhibiting a high degree of noise and a high out-of-vocabulary rate. We describe methods for successful image denoising, line removal, deskewing, deslanting, and text line segmentation. We demonstrate how to use an HMM-based recognition system to obtain competitive results, and how to further improve it using LSTM neural networks in the tandem approach. The final system outperforms other approaches on a new dataset for English and French handwriting. The presented framework scales well across other standard datasets.
In mathematical expression recognition, symbol classification is a crucial step. Numerous approaches for recognizing handwritten math symbols have been published, but most of them are either online or hybrid approaches. No study has focused on offline features for handwritten math symbol recognition. Furthermore, many papers report results that are difficult to compare. In this paper we assess the performance of several well-known offline features for this task. We also test a novel set of features based on polar histograms and the vertical repositioning method for feature extraction. Finally, we report and analyze the results of several experiments using recurrent neural networks on a large public database of online handwritten math expressions. The combination of online and offline features significantly improved the recognition rate.
This paper proposes an improvement of context-dependent modeling for Arabic handwriting recognition. Since the number of parameters in context-dependent models is huge, CART trees are used for state tying. This work introduces a new set of questions for the CART tree construction, based on a "lossy mapping" categorization of the Arabic shapes. The system used is a combination of Hidden Markov Models and Recurrent Neural Networks using the hybrid approach. A neural network trained using the baseline labels is compared with another one trained on the CART tree labels. The experimental results show that using the CART labels for the neural network training is beneficial. The lossy-mapping-based CART tree performed better than the baseline system. An absolute improvement of 2.9% in terms of Word Error Rate is achieved on the test set of the OpenHaRT database.
ISBN: 9781479928941 (print)
This paper describes the new release of RASR, the open source version of the well-proven speech recognition toolkit developed and used at RWTH Aachen University. The focus is on the implementation of the NN module for training neural network acoustic models. We describe the code design, configuration, and features of the NN module. Its key feature is high flexibility regarding the network topology, choice of activation functions, training criteria, and optimization algorithm, as well as built-in support for efficient GPU computing. Run-time performance and recognition accuracy are evaluated, by way of example, with a deep neural network as the acoustic model in a hybrid NN/HMM system. The results show that RASR achieves state-of-the-art performance on a real-world large vocabulary task, while offering a complete pipeline for building and applying large scale speech recognition systems.