In recent years, Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) trained with the Connectionist Temporal Classification (CTC) objective have won many international handwriting recognition evaluations. The CTC algorithm is based on a forward-backward procedure, which avoids the need to segment the input before training. The network outputs are character labels and a special non-character label. In the hybrid Neural Network / Hidden Markov Model (NN/HMM) framework, by contrast, networks are trained with framewise criteria to predict state labels. In this paper, we show that CTC training is close to forward-backward training of NN/HMMs, and can be extended to more standard HMM topologies. We apply this method to Multi-Layer Perceptrons (MLPs), and investigate the properties of CTC, namely the modeling of characters by single labels and the role of the special label.
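The forward half of the forward-backward procedure mentioned in this abstract can be sketched as follows. This is a minimal illustration of the standard CTC forward recursion, not the paper's implementation: the function name `ctc_forward_probability`, the `probs` layout (one list of per-label posteriors per frame), and the choice of index 0 for the special non-character label are all assumptions made for the example.

```python
def ctc_forward_probability(probs, labels, blank=0):
    """Sum the probability of every frame-level alignment that collapses to `labels`.

    probs[t][k] is the (assumed) network posterior of label k at frame t.
    labels is the target character sequence, without the special label.
    """
    # Extended target: the special label is interleaved around every character,
    # e.g. [a, b] becomes [blank, a, blank, b, blank].
    ext = [blank]
    for label in labels:
        ext += [label, blank]
    S, T = len(ext), len(probs)

    # Initialisation: an alignment may start on the first blank or the first character.
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, T):
        new_alpha = [0.0] * S
        for s in range(S):
            total = alpha[s]                  # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]         # advance by one symbol
            # Skipping the blank between two characters is allowed only
            # when those two characters differ.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * probs[t][ext[s]]
        alpha = new_alpha

    # A valid alignment ends on the last character or the trailing blank.
    result = alpha[S - 1]
    if S > 1:
        result += alpha[S - 2]
    return result


# Two frames, two labels (0 = special label, 1 = a character), uniform posteriors.
uniform = [[0.5, 0.5], [0.5, 0.5]]
p_single = ctc_forward_probability(uniform, [1])
p_double = ctc_forward_probability(uniform, [1, 1])
```

No segmentation of the input is needed because the recursion sums over all alignments. The repeated-character target `[1, 1]` cannot be produced from two frames, since it needs a special label between the two characters; this is one concrete role of that label.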
In this paper we present the handwriting recognition systems submitted by LIMSI to the HTRtS 2014 contest. The systems for both the restricted and unrestricted tracks consisted of a combination of several optical models. We extracted handcrafted features as well as pixel values with a sliding window. We trained Deep Neural Networks (DNNs) and Bidirectional Long Short-Term Memory Recurrent Neural Networks (BLSTM-RNNs), which were plugged in as the optical model in Hidden Markov Models (HMMs). We propose a novel method to build language models that can cope with hyphenation in the text. The combination was performed from lattices generated by the different systems. We were the only team participating in both tracks and ranked second in each. The final Word Error Rates were 15.0% for the restricted track and 11.0% for the unrestricted track. We studied the impact of adding data for optical and language modeling. After the evaluation, we also used the same corpus for the language model as the winning team and obtained comparable results.
ISBN (print): 9782951740884
This paper introduces RWTH-PHOENIX-Weather 2014, a video-based, large-vocabulary German sign language corpus which has been extended over the last two years, tripling the size of the original corpus. The corpus contains weather forecasts simultaneously interpreted into sign language, which were recorded from German public TV and manually annotated using glosses on the sentence level, together with semi-automatically transcribed spoken German extracted from the videos using the open-source speech recognition system RASR. Spatial annotations of the signers' hands as well as shape and orientation annotations of the dominant hand have been added for more than 40k and 10k video frames, respectively, creating one of the largest corpora allowing for quantitative evaluation of object tracking algorithms. Further, over 2k signs have been annotated using the SignWriting annotation system, focusing on the shape, orientation, and movement as well as spatial contacts of both hands. Finally, extended recognition and translation setups are defined, and baseline results are presented.
Domain adaptation for statistical machine translation is the task of altering general models to improve performance on the test domain. In this work, we suggest several novel weighting schemes based on translation mod...
In this paper, we present two improvements to the beam search approach for solving homophonic substitution ciphers presented in Nuhn et al. (2013): An improved rest cost estimation together with an optimized strategy ...
In this work, we tackle the problem of language and translation model domain adaptation without explicit bilingual in-domain training data. In such a scenario, the only information about the domain can be induced from ...
This paper describes the statistical machine translation (SMT) systems developed at RWTH Aachen University for the German→English translation task of the ACL 2014 Eighth Workshop on Statistical Machine Translation (WM...
This paper deals with robust modelling of mouth shapes in the context of sign language recognition using deep convolutional neural networks. Sign language mouth shapes are difficult to annotate, and thus hardly any publicly available annotations exist. As such, this work exploits related information sources as weak supervision. Humans mainly look at the face during sign language communication, where mouth shapes play an important role and constitute natural patterns with large variability. However, most scientific research on sign language recognition still disregards the face, and hardly any works explicitly focus on mouth shapes. This paper presents our advances in the field of sign language recognition. We contribute in the following areas: we present a scheme to learn a convolutional neural network in a weakly supervised fashion without explicit frame labels; we propose a way to incorporate neural network classifier outputs into an HMM approach; finally, we achieve a significant improvement in classification performance of mouth shapes over the current state of the art.
We present a novel toolkit that implements the long short-term memory (LSTM) neural network concept for language modeling. The main goal is to provide a software which is easy to use, and which allows fast training of...
This paper investigates the application of vector space models (VSMs) to the standard phrase-based machine translation pipeline. VSMs are models based on continuous word representations embedded in a vector space. We ...