Bilingual lexicons form a critical component of various natural language processing applications, including unsupervised and semi-supervised machine translation and cross-lingual information retrieval. We improve biling...
This paper tackles automatically discovering phone-like acoustic units (AUD) from unlabeled speech data. Past studies usually proposed single-step approaches. We propose a two-stage approach: the first stage learns a ...
This paper proposes a parallel computation strategy and a posterior-based lattice expansion algorithm for efficient lattice rescoring with neural language models (LMs) for automatic speech recognition. First, lattices...
Connectionist Temporal Classification (CTC) is a widely used approach for automatic speech recognition (ASR) that performs conditionally independent monotonic alignment. However, for translation, CTC exhibits clear lim...
Traditional multi-task learning architectures train a single model across multiple tasks through a shared encoder followed by task-specific decoders. Learning these models often requires specialized training algorithm...
Named entities are inherently multilingual, and annotations in any given language may be limited. This motivates us to consider polyglot named-entity recognition (NER), where one model is trained using annotated data ...
We introduce the asynchronous dynamic decoder, which adopts an efficient A* algorithm to incorporate big language models in one-pass decoding for large-vocabulary continuous speech recognition. Unlike standard one-pass decoding with an on-the-fly composition decoder, which can incur significant computational overhead, the asynchronous dynamic decoder has a novel design with two fronts, one performing "exploration" and the other "backfill". Computation alternates between the two fronts during decoding, resulting in more effective pruning than standard one-pass decoding with an on-the-fly composition decoder. Experiments show that the proposed decoder runs notably faster than standard one-pass decoding with an on-the-fly composition decoder, and that the speedup becomes more pronounced as data complexity increases.
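For readers unfamiliar with the A* algorithm mentioned in this abstract, the following is a minimal generic sketch of A* best-first search on a toy weighted graph. It is not the authors' decoder: the function name `a_star`, the toy `graph`, and the `heuristic` table are all hypothetical and chosen only for illustration, and the sketch says nothing about the "exploration" and "backfill" fronts.

```python
import heapq


def a_star(graph, heuristic, start, goal):
    """Return (cost, path) for the cheapest start-to-goal path, or None.

    graph:     dict mapping a node to a list of (neighbor, edge_cost) pairs.
    heuristic: dict mapping a node to an estimate of its remaining cost to goal.
               The estimate must never overestimate for the result to be optimal.
    """
    # Each frontier entry is (f, g, node, path) with f = g + heuristic[node].
    frontier = [(heuristic[start], 0.0, start, [start])]
    best_g = {start: 0.0}  # cheapest known cost from start to each node

    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node was found later
        for neighbor, edge_cost in graph.get(node, []):
            new_g = g + edge_cost
            if new_g < best_g.get(neighbor, float("inf")):
                best_g[neighbor] = new_g
                f = new_g + heuristic[neighbor]
                heapq.heappush(frontier, (f, new_g, neighbor, path + [neighbor]))
    return None


# Hypothetical toy graph; the heuristic values are admissible lower bounds.
graph = {
    "s": [("a", 1.0), ("b", 4.0)],
    "a": [("b", 1.0), ("g", 5.0)],
    "b": [("g", 1.0)],
}
heuristic = {"s": 2.0, "a": 2.0, "b": 1.0, "g": 0.0}

result = a_star(graph, heuristic, "s", "g")
print(result)
```

The heuristic lets the search expand promising states first and skip hypotheses that cannot beat the best known cost, which is the general sense in which A* supports pruning.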
ISBN: (print) 9781479981311
Recovering the drawing order from a Chinese handwriting image is a challenging problem. Most drawing order recovery (DOR) methods designed for English perform unsatisfactorily on Chinese. This paper proposes a novel image-to-sequence algorithm for the Chinese DOR problem. The proposed method uses two regression convolutional neural network (CNN) models to generate two corresponding pen-tip movement heat-maps. To estimate pen-tip movement for most of the normal states in the writing process, the algorithm analyzes these two heat-maps with a specifically designed framework. The drawing order is then restored through a simple iterative process based on the proposed framework. Experiments on a public online handwriting database show that our method achieves remarkable results on Chinese DOR tasks. In addition, on English tasks, our method performs favorably compared with state-of-the-art methods.
Unsupervised spoken term discovery consists of two tasks: finding the acoustic segment boundaries and labeling acoustically similar segments with the same labels. We perform segmentation based on the assumption that t...
Zero-shot multi-speaker Text-to-speech (TTS) generates target speaker voices given an input text and the corresponding speaker embedding. In this work, we investigate the effectiveness of the TTS reconstruction object...