We present a novel approach for translation model (TM) adaptation using phrase training. The proposed adaptation procedure is initialized with a standard general-domain TM, which is then used to perform phrase trainin...
In this paper, we empirically investigate the impact of critical configuration parameters in the popular cube pruning algorithm for decoding in hierarchical statistical machine translation. Specifically, we study how ...
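For readers unfamiliar with the algorithm whose parameters are being tuned: the core of cube pruning is a best-first k-best extraction over a grid of cost-sorted hypothesis lists, where the pop limit k is exactly the kind of configuration parameter studied here. Below is a minimal sketch of that grid extraction, assuming two cost-sorted input lists and a hypothetical combine cost function; a real hierarchical decoder combines rules with chart items and adds language model scores.

```python
import heapq

def cube_prune(rules, items, k, combine):
    """Extract the k best combinations of two cost-sorted lists.

    rules, items: lists sorted by ascending cost (lower is better).
    combine(r, i): hypothetical cost of combining rule r with item i.
    k: the pop limit -- the main beam parameter of cube pruning.
    """
    if not rules or not items:
        return []
    heap = [(combine(rules[0], items[0]), 0, 0)]
    seen = {(0, 0)}
    results = []
    while heap and len(results) < k:
        cost, ri, ii = heapq.heappop(heap)
        results.append((cost, rules[ri], items[ii]))
        # Lazily push the two unseen neighbours of the popped grid cell.
        for nri, nii in ((ri + 1, ii), (ri, ii + 1)):
            if nri < len(rules) and nii < len(items) and (nri, nii) not in seen:
                seen.add((nri, nii))
                heapq.heappush(heap, (combine(rules[nri], items[nii]), nri, nii))
    return results
```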
ISBN:
(Print) 9781479927579
We show that most search errors can be identified by aligning the results of symmetric forward and backward decoding passes. Based on this observation, we introduce an efficient high-level decoding architecture that yields virtually no search errors and requires virtually no manual tuning. We perform an initial forward and backward decoding with tight initial beams, identify search errors, and then recursively increase the beam sizes and perform new forward and backward decodings on the erroneous intervals until no more search errors are detected. Consequently, each utterance, and even each single word, is decoded with the smallest beam size required to decode it correctly. On all tested systems we achieve an error rate equal to or very close to that of classical decoding with an ideally tuned beam size, but without supervision or system-specific tuning, and at roughly twice the decoding speed. An additional speedup by a factor of 2 can be achieved by running the forward and backward passes in separate threads.
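A minimal sketch of the control loop described above, simplified to whole utterances: decode is a hypothetical stand-in for one Viterbi pass, and the paper's full architecture additionally aligns the two hypotheses and recurses only on the disagreeing intervals rather than re-decoding the entire utterance.

```python
from typing import Callable, List

def fb_decode(frames: List[float],
              decode: Callable[[List[float], int, bool], List[str]],
              beam: int = 100, max_beam: int = 100_000) -> List[str]:
    """Forward/backward decoding with automatic beam widening.

    decode(frames, beam, backward) is a hypothetical stand-in for one
    beam-pruned Viterbi pass, run forward or backward in time, returning
    the recognized word sequence.
    """
    while True:
        fwd = decode(frames, beam, False)   # forward pass
        bwd = decode(frames, beam, True)    # backward pass
        # Agreement of both passes indicates the absence of search errors.
        if fwd == bwd or beam >= max_beam:
            return fwd
        beam *= 2                           # search error detected: widen beam
```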
Context-dependent deep neural network HMMs have been shown to achieve recognition accuracy superior to Gaussian mixture models in a number of recent works. Typically, neural networks are optimized with stochastic grad...
ISBN:
(Print) 9781479903573
In this paper, we present a unified search strategy for open-vocabulary handwriting recognition using weighted finite-state transducers. In addition to a standard word-level language model, we introduce a separate n-gram character-level language model for out-of-vocabulary word detection and recognition. The probabilities assigned by these two models are combined into one Bayes decision rule. We evaluate the proposed method on the IAM database of English handwriting. An improvement from 22.2% word error rate to 17.3% is achieved compared to the closed-vocabulary scenario and the best published result.
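As an illustration of how two such models can enter one decision rule, the sketch below scores a hypothesized word either with the word-level LM (in-vocabulary) or with an unknown-word class plus the character-level LM (out-of-vocabulary); word_lm, char_lm, and oov_log_prob are hypothetical stand-ins, not the paper's actual interfaces.

```python
from typing import Callable, List, Sequence, Set

def lm_logprob(word: str, history: Sequence[str],
               word_lm: Callable[[str, Sequence[str]], float],
               char_lm: Callable[[str, Sequence[str]], float],
               vocab: Set[str], oov_log_prob: float = -5.0) -> float:
    """Combined word/character LM log-probability for one hypothesized word.

    word_lm / char_lm return log-probabilities; oov_log_prob stands in
    for the probability mass the word LM reserves for the unknown class.
    """
    if word in vocab:
        return word_lm(word, history)        # regular in-vocabulary word
    # OOV: enter the unknown-word class, then spell the word character by
    # character with the n-gram character LM, closing with an end marker.
    chars: List[str] = list(word) + ["</w>"]
    return oov_log_prob + sum(char_lm(c, chars[:i]) for i, c in enumerate(chars))
```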
ISBN:
(Print) 9780769549993
In this paper we describe a novel HMM-based system for off-line handwriting recognition. We adapt successful techniques from the domains of large vocabulary speech recognition and image object recognition: moment-based image normalization, writer adaptation, discriminative feature extraction and training, and open-vocabulary recognition. We evaluate those methods and examine their cumulative effect on the recognition performance. The final system outperforms current state-of-the-art approaches on two standard evaluation corpora for English and French handwriting.
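Of the techniques listed, moment-based image normalization is the most self-contained; here is a minimal sketch that uses first- and second-order ink moments to center and size-normalize a text-line image. The target height, margin, and the choice of mapping ±2 standard deviations onto the output height are illustrative, and slant correction via mixed moments is omitted.

```python
import numpy as np
from scipy import ndimage

def moment_normalize(img: np.ndarray, out_h: int = 32) -> np.ndarray:
    """Center and size-normalize a grayscale text-line image (ink > 0)."""
    ys, xs = np.nonzero(img > 0)
    if len(xs) == 0:                       # blank image: nothing to normalize
        return np.zeros((out_h, out_h))
    cy, cx = ys.mean(), xs.mean()          # first-order moments: ink centroid
    sy = ys.std() + 1e-6                   # second-order moment: vertical extent
    scale = 4.0 * sy / out_h               # map +-2 std. devs. onto out_h rows
    out_w = int(np.ceil((xs.max() - xs.min() + 1) / scale)) + 4
    # affine_transform maps output to input coordinates: in = matrix@out + offset
    matrix = np.diag([scale, scale])
    offset = np.array([cy - scale * out_h / 2.0, cx - scale * out_w / 2.0])
    return ndimage.affine_transform(img.astype(float), matrix, offset=offset,
                                    output_shape=(out_h, out_w), order=1)
```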
ISBN:
(Print) 9781479903573
This paper investigates the combination of different short-term features and the combination of recurrent and non-recurrent neural networks (NNs) on a Spanish speech recognition task. Several methods exist to combine different feature sets, such as concatenation or linear discriminant analysis (LDA). Even though all these techniques achieve reasonable improvements, feature combination by multi-layer perceptrons (MLPs) outperforms all known approaches. We develop the concept of MLP-based feature combination further using recurrent neural networks (RNNs). The phoneme posterior estimates derived from an RNN lead to a significant improvement over the result of the MLPs, achieving a 5% relative improvement in word error rate (WER) with far fewer parameters. Moreover, we improve the system performance further by combining an MLP and an RNN in a hierarchical framework, where the MLP benefits from the preprocessing of the RNN. All NNs are trained on phonemes; nevertheless, the same concepts could be applied using context-dependent states. In addition to the improvements in recognition performance w.r.t. WER, NN-based feature combination methods reduce both the training and the testing complexity. Overall, the systems are based on a single set of acoustic models, together with the training of different NNs.
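A minimal sketch of the hierarchical idea in PyTorch, assuming a single short-term feature stream: a recurrent first stage estimates phoneme posteriors, which a second-stage MLP consumes together with the raw features. All layer sizes and the bidirectional-LSTM choice are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class HierarchicalCombiner(nn.Module):
    """Sketch of hierarchical RNN-then-MLP feature combination."""

    def __init__(self, feat_dim: int = 39, n_phonemes: int = 33, hidden: int = 512):
        super().__init__()
        # First stage: a recurrent net estimating phoneme posteriors.
        self.rnn = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.rnn_out = nn.Linear(2 * hidden, n_phonemes)
        # Second stage: an MLP combining raw features with RNN posteriors.
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + n_phonemes, hidden), nn.Sigmoid(),
            nn.Linear(hidden, n_phonemes),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, time, feat_dim)
        h, _ = self.rnn(feats)
        post = torch.log_softmax(self.rnn_out(h), dim=-1)  # RNN log-posteriors
        return self.mlp(torch.cat([feats, post], dim=-1))  # MLP refinement
```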
Egyptian Arabic (EA) is a colloquial version of Arabic. It is a low-resource morphologically rich language that causes problems in Large Vocabulary Continuous Speech recognition (LVCSR). Building LMs on morpheme level...
ISBN:
(Print) 9781479903573
Log-linear models find a wide range of applications in pattern recognition. The training of log-linear models is a convex optimization problem. In this work, we compare the performance of stochastic and batch optimization algorithms. Stochastic algorithms are fast on large data sets but cannot be parallelized well. In our experiments on a broadcast-conversations recognition task, stochastic methods yield competitive results after only a short training period, but when enough computational resources are spent on parallelization, batch algorithms are competitive with stochastic algorithms. We obtained slight improvements by using a stochastic second-order algorithm. Our best log-linear model outperforms the maximum-likelihood trained Gaussian mixture model baseline while being ten times smaller.
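To make the stochastic-versus-batch contrast concrete, here is a minimal sketch training the same softmax log-linear model both ways, with plain per-example SGD and with batch L-BFGS on toy numpy arrays; the paper's features, model sizes, and its stochastic second-order method are not reproduced.

```python
import numpy as np
from scipy.optimize import minimize

def nll_and_grad(w_flat, X, y, n_classes):
    """Negative log-likelihood and gradient of a softmax log-linear model."""
    W = w_flat.reshape(n_classes, X.shape[1])
    scores = X @ W.T
    scores -= scores.max(axis=1, keepdims=True)      # numerical stability
    p = np.exp(scores)
    p /= p.sum(axis=1, keepdims=True)
    nll = -np.log(p[np.arange(len(y)), y]).mean()
    p[np.arange(len(y)), y] -= 1.0                   # p - one_hot(y)
    grad = (p.T @ X) / len(y)
    return nll, grad.ravel()

def train_batch(X, y, n_classes):
    """Batch training with L-BFGS: each step sees the full data set."""
    w0 = np.zeros(n_classes * X.shape[1])
    res = minimize(nll_and_grad, w0, args=(X, y, n_classes),
                   jac=True, method="L-BFGS-B")
    return res.x.reshape(n_classes, X.shape[1])

def train_sgd(X, y, n_classes, lr=0.1, epochs=5):
    """Stochastic training: one example at a time, fast but sequential."""
    W = np.zeros((n_classes, X.shape[1]))
    for _ in range(epochs):
        for i in np.random.permutation(len(y)):
            _, g = nll_and_grad(W.ravel(), X[i:i + 1], y[i:i + 1], n_classes)
            W -= lr * g.reshape(W.shape)
    return W
```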