Search Results - Inner Mongolia University Library

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: David Rybach, Ralf Schlüter, Hermann Ney (Human Language Technology and Pattern Recognition, Computer Science Department, RWTH Aachen University, Aachen, Germany)

Models for silence are a fundamental part of continuous speech recognition systems. Depending on application requirements, audio data segmentation, and availability of detailed training data annotations, it may be necessary or beneficial to differentiate between other non-speech events, for example breath and background noise. The integration of multiple non-speech models in a WFST-based dynamic network decoder is not straightforward, because these models do not perfectly fit in the transducer framework. This paper describes several options for the transducer construction with multiple non-speech models, shows their considerably different characteristics in memory and runtime efficiency, and analyzes the impact on the recognition performance.

Keywords: Hidden Markov models, Transducers, Decoding, Context, Speech, Noise, Speech recognition

The RWTH Aachen Machine Translation System for WMT 2012

Workshop on Statistical Machine Translation

Authors: Matthias Huck, Stephan Peitz, Markus Freitag, Malte Nuhn, Hermann Ney (Human Language Technology and Pattern Recognition Group, Computer Science Department, RWTH Aachen University, D-52056 Aachen, Germany)

ISBN: (print) 9781622765928

This paper describes the statistical machine translation (SMT) systems developed at RWTH Aachen University for the translation task of the NAACL 2012 Seventh Workshop on Statistical Machine Translation (WMT 2012). We participated in the evaluation campaign for the French-English and German-English language pairs in both translation directions. Both hierarchical and phrase-based SMT systems are applied. A number of different techniques are evaluated, including an insertion model, different lexical smoothing methods, a discriminative reordering extension for the hierarchical system, reverse translation, and system combination. By application of these methods we achieve considerable improvements over the respective baseline systems.

Keywords: machine translation system machine translation Surface mount technology Hierarchical application methods Translations Translation Translation Process smoothing methods Hierarchical systems

Phase difference of filter-stable part-tones as acoustic feature

IEEE/SP Workshop on Statistical Signal Processing (SSP)

Authors: Zoltán Tüske, Friedhelm R. Drepper, Ralf Schlüter (Human Language Technology and Pattern Recognition, Computer Science Department, RWTH Aachen University, Aachen, Germany)

A part-tone decomposition of voiced sections of speech is introduced, which is adapted with high accuracy to the frequency of the glottal oscillator of the speaker. The iterative replacement of the center filter frequency contours (chosen locally as a linear chirp) of the non-stationary bandpass filters converges extremely fast and leads to the extraction of filter-stable part-tones with uncorrupted phases. In contrast to phases of frequency decomposition with a priori defined, constant filter frequencies, the phase differences of filter-stable part-tones promise to become a useful supplement to the amplitude-based acoustic features used for conventional automatic speech recognition. The derived phase features are tested in vowel classification experiments based on the phonetically rich TIMIT database.

Keywords: Speech, Harmonic analysis, Equations, Time frequency analysis, Mel frequency cepstral coefficient, Speech processing

Conditional leaving-one-out and cross-validation for discount estimation in Kneser-Ney-like extensions

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: J. Andrés-Ferrer, M. Sundermeyer, H. Ney (Pattern Recognition and Human Language Technology, Universidad Politécnica de Valencia, Spain; Human Language Technology and Pattern Recognition, RWTH Aachen University, Germany)

The smoothing of n-gram models is a core technique in language modelling (LM). Modified Kneser-Ney (mKN) ranks among the best smoothing techniques. This technique discounts a fixed quantity from the observed counts in order to approximate the Turing-Good (TG) counts. Although the TG counts optimise the leaving-one-out (L1O) criterion, the discounting parameters introduced in mKN do not. Moreover, the approximation to the TG counts for large counts is heavily simplified. In this work, both ideas are addressed: the estimation of the discounting parameters by L1O and better functional forms to approximate larger TG counts. The L1O performance is compared with cross-validation (CV) and the mKN baseline on two large-vocabulary tasks.

Keywords: Smoothing methods, Approximation methods, Standards, Optimization, Estimation, Computational modeling, Training

Leave-One-Out Phrase Model Training for Large-Scale Deployment

Workshop on Statistical Machine Translation

Authors: Joern Wuebker, Mei-Yuh Hwang, Chris Quirk (Human Language Technology and Pattern Recognition Group, RWTH Aachen University, Germany; Microsoft Corporation, Redmond, WA, USA)

ISBN: (print) 9781622765928

Training the phrase table by force-aligning (FA) the training data with the reference translation has been shown to improve the phrasal translation quality while significantly reducing the phrase table size on medium-sized tasks. We apply this procedure to several large-scale tasks, with the primary goal of reducing model sizes without sacrificing translation quality. To deal with the noise in the automatically crawled parallel training data, we introduce on-demand word deletions, insertions, and backoffs to achieve a successful alignment rate of over 99%. We also add heuristics to avoid any increase in OOV rates. We are able to reduce already heavily pruned baseline phrase tables by more than 50% with little to no degradation in quality and occasionally a slight improvement, without any increase in OOVs. We further introduce two global scaling factors for re-estimation of the phrase table via posterior phrase alignment probabilities and a modified absolute discounting method that can be applied to fractional counts.

Keywords: reduced mass, Model trains, Heuristics, Tables

Performance analysis of Neural Networks in combination with n-gram language models

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Ilya Oparin, Martin Sundermeyer, Hermann Ney, Jean-Luc Gauvain (LIMSI CNRS, Spoken Language Processing Group, France; Computer Science Department, Human Language Technology and Pattern Recognition, RWTH Aachen University, Germany)

Neural Network language models (NNLMs) have recently become an important complement to conventional n-gram language models (LMs) in speech-to-text systems. However, little is known about the behavior of NNLMs. The analysis presented in this paper aims to understand which types of events are better modeled by NNLMs as compared to n-gram LMs, in what cases improvements are most substantial and why this is the case. Such an analysis is important to take further benefit from NNLMs used in combination with conventional n-gram models. The analysis is carried out for different types of neural network (feed-forward and recurrent) LMs. The results showing for which type of events NNLMs provide better probability estimates are validated on two setups that are different in their size and the degree of data homogeneity.

Keywords: Artificial neural networks, History, Analytical models, Training data, Vocabulary, Interpolation

Skin-color based videos categorization

International Journal of Computer Science Issues, 2012, Vol. 9, Issue 1 1-3, pp. 473-477

Authors: Khan, Rehanullah; Maqsood, Asad; Khan, Zeeshan; Ishaq, Muhammad; Arif, Arsalan (Sarhad University of Science and Information Technology, Peshawar, Pakistan; RWTH Aachen, Human Language Technology and Pattern Recognition, Peshawar, Pakistan; UET Mardan, Peshawar, Pakistan)

On dedicated websites, people can upload videos and share them with the rest of the world. Currently these videos are categorized manually with the help of the user community. In this paper, we propose a combination of color spaces with the Bayesian network approach for robust detection of skin color followed by an automated video categorization. Experimental results show that our method can achieve satisfactory performance for categorizing videos based on skin color. © 2012 International Journal of Computer Science Issues.

Keywords: Bayesian networks

Mobile music modeling, analysis and recognition

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Pavel Golik, Boulos Harb, Ananya Misra, Michael Riley, Alex Rudnick, Eugene Weinstein (Human Language Technology and Pattern Recognition, Computer Science Department, RWTH Aachen University, Aachen, Germany; Google Inc., New York, NY, USA; School of Informatics and Computing, Indiana University, Bloomington, IN, USA)

We present an analysis of music modeling and recognition techniques in the context of mobile music matching, substantially improving on the techniques presented in [1]. We accomplish this by adapting the features specifically to this task, and by introducing new modeling techniques that enable using a corpus of noisy and channel-distorted data to improve mobile music recognition quality. We report the results of an extensive empirical investigation of the system's robustness under realistic channel effects and distortions. We show an improvement of recognition accuracy by explicit duration modeling of music phonemes and by integrating the expected noise environment into the training process. Finally, we propose the use of frame-to-phoneme alignment for high-level structure analysis of polyphonic music.

Keywords: Training, Accuracy, Hidden Markov models, Music, Speech recognition, USA Councils

Lexicon Models for Hierarchical Phrase-Based Machine Translation

8th International Workshop on Spoken Language Translation, IWSLT 2011

Authors: Huck, Matthias; Mansour, Saab; Wiesler, Simon; Ney, Hermann (Human Language Technology and Pattern Recognition Group, RWTH Aachen University, Aachen, Germany)

In this paper, we investigate lexicon models for hierarchical phrase-based statistical machine translation. We study five types of lexicon models: a model which is extracted from word-aligned training data and, given the word alignment matrix, relies on pure relative frequencies [1]; the IBM model 1 lexicon [2]; a regularized version of IBM model 1; a triplet lexicon model variant [3]; and a discriminatively trained word lexicon model [4]. We explore source-to-target models with phrase-level as well as sentence-level scoring and target-to-source models with scoring on phrase level only. For the first two types of lexicon models, we compare several scoring variants. All models are used during search, i.e. they are incorporated directly into the log-linear model combination of the decoder. Phrase table smoothing with triplet lexicon models and with discriminative word lexicons are novel contributions. We also propose a new regularization technique for IBM model 1 by means of the Kullback-Leibler divergence with the empirical unigram distribution as regularization term. Experiments are carried out on the large-scale NIST Chinese→English translation task and on the English→French and Arabic→English IWSLT TED tasks. For Chinese→English and English→French, we obtain the best results by using the discriminative word lexicon to smooth our phrase tables. © IWSLT 2011. All rights reserved.

Keywords: Computer aided language translation

The RWTH Aachen Machine Translation System for IWSLT 2011

8th International Workshop on Spoken Language Translation, IWSLT 2011

Authors: Wuebker, Joern; Huck, Matthias; Mansour, Saab; Freitag, Markus; Feng, Minwei; Peitz, Stephan; Schmidt, Christoph; Ney, Hermann (Human Language Technology and Pattern Recognition Group, Computer Science Department, RWTH Aachen University, Aachen, Germany)

In this paper, the statistical machine translation (SMT) systems of RWTH Aachen University developed for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2011 are presented. We participated in the MT (English-French, Arabic-English, Chinese-English) and SLT (English-French) tracks. Both hierarchical and phrase-based SMT decoders are applied. A number of different techniques are evaluated, including domain adaptation via monolingual and bilingual data selection, phrase training, different lexical smoothing methods, additional reordering models for the hierarchical system, various Arabic and Chinese segmentation methods, punctuation prediction for speech recognition output, and system combination. By application of these methods we can show considerable improvements over the respective baseline systems. © IWSLT 2011. All rights reserved.

Keywords: Hierarchical systems
