Transfer learning or a multilingual model is essential for low-resource neural machine translation (NMT), but its applicability is limited to cognate languages that share vocabularies. This paper shows effective t...
The parametric Bayesian Feature Enhancement (BFE) and a data-driven Denoising Autoencoder (DA) both bring performance gains in severe single-channel speech recognition conditions. The first can be adjusted to different...
We present a method to classify images into different categories of pornographic content to create a system for filtering pornographic images from network traffic. Although different systems for this application were ...
The smoothing of n-gram models is a core technique in language modelling (LM). Modified Kneser-Ney (mKN) ranks among the best smoothing techniques. This technique discounts a fixed quantity from the observed counts in order to approximate the Turing-Good (TG) counts. Although the TG counts optimise the leaving-one-out (L1O) criterion, the discounting parameters introduced in mKN do not. Moreover, the approximation to the TG counts for large counts is heavily simplified. In this work, both issues are addressed: the discounting parameters are estimated by L1O, and better functional forms are proposed to approximate larger TG counts. The L1O performance is compared with cross-validation (CV) and the mKN baseline on two large-vocabulary tasks.
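As a rough illustration of the baseline this abstract starts from, the sketch below computes Turing-Good adjusted counts, r* = (r + 1) n_{r+1} / n_r, and the widely used closed-form mKN discounts from count-of-counts statistics. It does not reproduce the L1O estimation or the new functional forms proposed in the paper; the function names and the toy counts are hypothetical and chosen only for this example.

```python
from collections import Counter


def count_of_counts(ngram_counts):
    """Return n_r: the number of distinct n-grams observed exactly r times."""
    return Counter(ngram_counts.values())


def turing_good_count(r, n):
    """Turing-Good adjusted count r* = (r + 1) * n_{r+1} / n_r."""
    return (r + 1) * n[r + 1] / n[r]


def mkn_discounts(n):
    """Standard closed-form mKN discounts for counts 1, 2 and 3+.

    D_r = r - (r + 1) * Y * n_{r+1} / n_r, with Y = n_1 / (n_1 + 2 * n_2).
    This is the usual baseline estimate, not the L1O estimate from the paper.
    """
    y = n[1] / (n[1] + 2 * n[2])
    return tuple(r - (r + 1) * y * n[r + 1] / n[r] for r in (1, 2, 3))


# Toy data (hypothetical): 8 n-grams seen once, 4 twice, 2 three times, 1 four times.
toy_counts = {f"ngram{i}": r for i, r in enumerate([1] * 8 + [2] * 4 + [3] * 2 + [4])}
n = count_of_counts(toy_counts)

d1, d2, d3_plus = mkn_discounts(n)
tg_for_singletons = turing_good_count(1, n)
```

Each mKN discount is a single fixed value per count class (1, 2, 3 or more), which is the simplification for large counts that the abstract refers to.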
Pivot-based neural machine translation (NMT) is commonly used in low-resource setups, especially for translation between non-English language pairs. It benefits from using high-resource source→pivot and pivot→target...
Context-dependent deep neural network HMMs have been shown to achieve recognition accuracy superior to Gaussian mixture models in a number of recent works. Typically, neural networks are optimized with stochastic grad...
ISBN: (print) 9781424421749
We propose to explicitly model white-spaces for Arabic handwriting recognition within different writing variants. Position-dependent character shapes in Arabic handwriting allow for large white-spaces between characters even within words. Here, a separate character model for white-spaces, in combination with a lexicon using different writing variants and character model length adaptation, is proposed. Current handwriting recognition systems either model the white-spaces implicitly within the character models, which can degrade the models, or try to explicitly segment the Arabic words into pieces, which is prone to segmentation errors. Several white-space modeling approaches are analyzed on the well-known IFN/ENIT database and outperform the best reported error rates.
The transcription of digitalised documents is useful to ease digital access to their contents. Natural language technologies, such as Automatic Speech Recognition (ASR) for speech audio signals and Handwritten Tex...
State-of-the-art natural language recognition systems allow transcribers to speed up the transcription of audio, video or image documents. These systems provide transcribers with an initial draft transcription that can be ...
In this work, we present a model for document-grounded response generation in dialog that is decomposed into two components according to Bayes' theorem. One component is a traditional ungrounded response generatio...