Recently, RNN-Transducers have achieved remarkable results on various automatic speech recognition tasks. However, lattice-free sequence discriminative training methods, which obtain superior performance in hybrid mod...
详细信息
In hybrid HMM based speech recognition, LSTM language models have been widely applied and achieved large improvements. The theoretical capability of modeling any unlimited context suggests that no recombination should...
详细信息
Sequence-to-sequence models with an implicit alignment mechanism (e.g. attention) are closing the performance gap towards traditional hybrid hidden Markov models (HMM) for the task of automatic speech recognition. One...
详细信息
Attention-based encoder-decoder (AED) models learn an implicit internal language model (ILM) from the training transcriptions. The integration with an external LM trained on much more unpaired text usually leads to be...
详细信息
To improve the performance of state-of-the-art automatic speech recognition systems it is common practice to include external knowledge sources such as language models or prior corrections. This is usually done via lo...
详细信息
Speaker adaptation is important to build robust automatic speech recognition (ASR) systems. In this work, we investigate various methods for speaker adaptive training (SAT) based on feature-space approaches for a conf...
详细信息
Subword units are commonly used for end-to-end automatic speech recognition (ASR), while a fully acoustic-oriented subword modeling approach is somewhat missing. We propose an acoustic data-driven subword modeling (AD...
详细信息
We introduce a novel segmental-attention model for automatic speech recognition. We restrict the decoder attention to segments to avoid quadratic runtime of global attention, better generalize to long sequences, and e...
详细信息
We present our transducer model on Librispeech. We study variants to include an external language model (LM) with shallow fusion and subtract an estimated internal LM. This is justified by a Bayesian interpretation wh...
详细信息
Sequence discriminative training is a great tool to improve the performance of an automatic speech recognition system. It does, however, necessitate a sum over all possible word sequences, which is intractable to comp...
详细信息
暂无评论