Search results - Inner Mongolia University Library

Skin-color based videos categorization

International Journal of Computer Science Issues, 2012, Vol. 9, No. 1 1-3, pp. 473-477

Authors: Khan, Rehanullah; Maqsood, Asad; Khan, Zeeshan; Ishaq, Muhammad; Arif, Arsalan. Affiliations: Sarhad University of Science and Information Technology, Peshawar, Pakistan; RWTH Aachen, Human Language Technology and Pattern Recognition; Peshawar, Pakistan; UET Mardan, Peshawar, Pakistan

On dedicated websites, people can upload videos and share them with the rest of the world. Currently, these videos are categorized manually with the help of the user community. In this paper, we propose a combination of color spaces with the Bayesian network approach for robust detection of skin color, followed by automated video categorization. Experimental results show that our method can achieve satisfactory performance for categorizing videos based on skin color. © 2012 International Journal of Computer Science Issues.

Keywords: Bayesian networks

Mobile music modeling, analysis and recognition

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Pavel Golik; Boulos Harb; Ananya Misra; Michael Riley; Alex Rudnick; Eugene Weinstein. Affiliations: Human Language Technology and Pattern Recognition, Computer Science Department, RWTH Aachen University, Aachen, Germany; Google Inc., New York, NY, USA; School of Informatics and Computing, Indiana University, Bloomington, IN, USA

We present an analysis of music modeling and recognition techniques in the context of mobile music matching, substantially improving on the techniques presented in [1]. We accomplish this by adapting the features specifically to this task, and by introducing new modeling techniques that enable using a corpus of noisy and channel-distorted data to improve mobile music recognition quality. We report the results of an extensive empirical investigation of the system's robustness under realistic channel effects and distortions. We show an improvement of recognition accuracy by explicit duration modeling of music phonemes and by integrating the expected noise environment into the training process. Finally, we propose the use of frame-to-phoneme alignment for high-level structure analysis of polyphonic music.

Keywords: Training; Accuracy; Hidden Markov models; Music; Speech recognition; USA Councils

Lexicon Models for Hierarchical Phrase-Based Machine Translation

8th International Workshop on Spoken Language Translation, IWSLT 2011

Authors: Huck, Matthias; Mansour, Saab; Wiesler, Simon; Ney, Hermann. Affiliation: Human Language Technology and Pattern Recognition Group, RWTH Aachen University, Aachen, Germany

In this paper, we investigate lexicon models for hierarchical phrase-based statistical machine translation. We study five types of lexicon models: a model which is extracted from word-aligned training data and, given the word alignment matrix, relies on pure relative frequencies [1]; the IBM model 1 lexicon [2]; a regularized version of IBM model 1; a triplet lexicon model variant [3]; and a discriminatively trained word lexicon model [4]. We explore source-to-target models with phrase-level as well as sentence-level scoring and target-to-source models with scoring on phrase level only. For the first two types of lexicon models, we compare several scoring variants. All models are used during search, i.e. they are incorporated directly into the log-linear model combination of the decoder. Phrase table smoothing with triplet lexicon models and with discriminative word lexicons are novel contributions. We also propose a new regularization technique for IBM model 1 by means of the Kullback-Leibler divergence with the empirical unigram distribution as regularization term. Experiments are carried out on the large-scale NIST Chinese→English translation task and on the English→French and Arabic→English IWSLT TED tasks. For Chinese→English and English→French, we obtain the best results by using the discriminative word lexicon to smooth our phrase tables. © IWSLT 2011. All rights reserved.

Keywords: Computer aided language translation
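
As a rough illustration of the sentence-level, source-to-target IBM model 1 scoring mentioned in the abstract above, the sketch below computes the standard IBM model 1 log-probability of a target sentence given a source sentence, leaving out the constant length term. The function name `ibm1_log_score`, the toy `lexicon` table, and the `floor` probability for unseen word pairs are assumptions made for this example; they are not taken from the paper, and the paper's scoring variants may differ.

```python
import math

def ibm1_log_score(source, target, lexicon, floor=1e-9):
    """Log p(target | source) under IBM model 1, without the length constant.

    `lexicon` maps (target_word, source_word) to a translation probability.
    Unseen pairs get the small `floor` value (an assumption of this sketch).
    """
    # IBM model 1 adds an empty (NULL) source word that any target word may align to.
    src = ["<NULL>"] + list(source)
    log_p = 0.0
    for e in target:
        # Each target word averages its lexicon probability over all source positions.
        total = sum(lexicon.get((e, f), floor) for f in src)
        log_p += math.log(total / len(src))
    return log_p

# Hypothetical toy lexicon, for illustration only.
lexicon = {("house", "Haus"): 0.8, ("the", "das"): 0.7, ("the", "<NULL>"): 0.1}
good = ibm1_log_score(["das", "Haus"], ["the", "house"], lexicon)
bad = ibm1_log_score(["das", "Haus"], ["house", "cat"], lexicon)
```

With this toy table, `good` comes out higher than `bad`, because every word of the first target sentence has lexicon support on the source side. In a decoder, a score of this kind would enter the log-linear combination as one feature among others.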

The RWTH 2010 Quaero ASR evaluation system for English, French, and German

36th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011

Authors: Sundermeyer, M.; Nussbaum-Thom, M.; Wiesler, S.; Plahl, C.; El-Desoky Mousa, A.; Hahn, S.; Nolden, D.; Schlüter, R.; Ney, H. Affiliation: Human Language Technology and Pattern Recognition, Computer Science Department, RWTH Aachen University, Germany

ISBN: (print) 9781457705397

Keywords: Speech recognition

A convergence analysis of log-linear training and its application to speech recognition

2011 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2011

Authors: Wiesler, S.; Schlüter, R.; Ney, H. Affiliation: Human Language Technology and Pattern Recognition, RWTH Aachen University of Technology, 52056 Aachen, Germany

ISBN: (print) 9781467303675

Log-linear models are a promising approach for speech recognition. Typically, log-linear models are trained according to a strictly convex criterion. Optimization algorithms are guaranteed to converge to the unique global optimum of the objective function from any initialization. For large-scale applications, considerations in the limit of infinite iterations are not sufficient. We show that log-linear training can be a highly ill-conditioned optimization problem, resulting in extremely slow convergence. Conversely, the optimization problem can be preconditioned by feature transformations. Making use of our convergence analysis, we improve our log-linear speech recognition system and achieve a strong reduction of its training time. In addition, we validate our analysis on a continuous handwriting recognition task. © 2011 IEEE.

Keywords: Speech recognition
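
To make the ill-conditioning and preconditioning remarks in the abstract above more concrete, here is a small numerical sketch. It uses the condition number of the feature second-moment matrix as a stand-in for the conditioning of the training problem, and mean and variance normalization as one possible feature transformation. Both choices, the synthetic data, and the helper name `gram_condition_number` are assumptions of this example rather than the paper's actual analysis.

```python
import numpy as np

# Synthetic features with very different scales and a non-zero mean
# (purely illustrative data, not from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2)) * np.array([1.0, 100.0]) + np.array([0.0, 50.0])

def gram_condition_number(features):
    """Condition number of the second-moment matrix X^T X / n."""
    gram = features.T @ features / len(features)
    return np.linalg.cond(gram)

# Mean and variance normalization, used here as a simple preconditioner.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

cond_raw = gram_condition_number(X)
cond_std = gram_condition_number(X_std)
print(f"condition number before: {cond_raw:.1f}, after: {cond_std:.2f}")
```

A large condition number means gradient-based optimizers take many small steps along the poorly scaled direction, which is the slow convergence the abstract refers to. After normalization the condition number drops sharply in this toy setup.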

Modeling Punctuation Prediction as Machine Translation

8th International Workshop on Spoken Language Translation, IWSLT 2011

Authors: Peitz, Stephan; Freitag, Markus; Mauser, Arne; Ney, Hermann. Affiliation: Human Language Technology and Pattern Recognition, Computer Science Department, RWTH Aachen University, Aachen, Germany

Punctuation prediction is an important task in Spoken Language Translation. The output of speech recognition systems does not typically contain punctuation marks. In this paper we analyze different methods for punctuation prediction and show improvements in the quality of the final translation output. In our experiments we compare the different approaches and show improvements of up to 0.8 BLEU points on the IWSLT 2011 English-French Speech Translation of Talks task by using a translation system to translate from unpunctuated to punctuated text instead of a language-model-based punctuation prediction method. Furthermore, we do a system combination of the hypotheses of all our different approaches and get an additional improvement of 0.4 points in BLEU. © IWSLT 2011. All rights reserved.

Keywords: Forecasting

Combining Translation and Language Model Scoring for Domain-Specific Data Filtering

8th International Workshop on Spoken Language Translation, IWSLT 2011

Authors: Mansour, Saab; Wuebker, Joern; Ney, Hermann. Affiliation: Human Language Technology and Pattern Recognition, Computer Science Department, RWTH Aachen University, Aachen, Germany

The increasing popularity of statistical machine translation (SMT) systems is introducing new domains of translation that need to be tackled. As many resources are already available, domain adaptation methods can be applied to utilize these resources in the most beneficial way for the new domain. We explore adaptation via filtering, using the cross-entropy scores to discard irrelevant sentences. We focus on filtering for two important components of an SMT system, namely the language model (LM) and the translation model (TM). Previous work has already applied LM cross-entropy based scoring for filtering. We argue that LM cross-entropy might be appropriate for LM filtering, but not as much for TM filtering. We develop a novel filtering approach based on combined TM and LM cross-entropy scores. We experiment with two large-scale translation tasks, the Arabic-to-English and English-to-French IWSLT 2011 TED Talks MT tasks. For LM filtering, we achieve strong perplexity improvements which carry over to the translation quality with improvements up to +0.4% BLEU. For TM filtering, the combined method achieves small but consistent improvements over the standalone methods. As a side effect of adaptation via filtering, the fully fledged SMT system vocabulary size and phrase table size are reduced by a factor of at least 2 while up to +0.6% BLEU improvement is observed. © IWSLT 2011. All rights reserved.

Keywords: Computer aided language translation
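
The idea of filtering by cross-entropy, as described in the abstract above, can be sketched with a deliberately simple model. The example below ranks candidate sentences by their per-word cross-entropy under an add-one-smoothed unigram LM trained on a tiny in-domain sample and keeps the best-scoring ones. It covers only the LM side; the combined TM and LM score proposed in the paper is not reproduced here. The unigram model, the toy sentences, and the names `train_unigram` and `cross_entropy` are all assumptions of this sketch.

```python
import math
from collections import Counter

def train_unigram(sentences):
    """Return an add-one-smoothed unigram probability function."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot shared by all unseen words
    return lambda w: (counts[w] + 1) / (total + vocab)

def cross_entropy(sentence, prob):
    """Per-word cross-entropy in bits; lower means closer to the in-domain data."""
    words = sentence.split()
    if not words:
        return float("inf")  # empty sentences are ranked last
    return -sum(math.log2(prob(w)) for w in words) / len(words)

# Hypothetical in-domain sample and general-domain pool, for illustration only.
in_domain = ["thank you for coming to this talk", "today i want to talk about ideas"]
lm = train_unigram(in_domain)
pool = ["i want to thank you", "quarterly revenue exceeded analyst forecasts"]

# Keep the sentence(s) with the lowest cross-entropy under the in-domain LM.
ranked = sorted(pool, key=lambda s: cross_entropy(s, lm))
kept = ranked[:1]
```

In practice the cutoff (here, keeping a single sentence) would be tuned on held-out data, and a stronger n-gram LM would replace the unigram model.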

Using morpheme and syllable based sub-words for Polish LVCSR

36th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011

Authors: Shaik, M. Ali Basha; El-Desoky Mousa, Amr; Schlüter, Ralf; Ney, Hermann. Affiliation: Human Language Technology and Pattern Recognition, Computer Science Department, RWTH Aachen University, 52056 Aachen, Germany

ISBN: (print) 9781457705397

Polish is a synthetic language with a high morpheme-per-word ratio. It makes use of a high degree of inflection, leading to high out-of-vocabulary (OOV) rates and high Language Model (LM) perplexities. This poses a challenge for Large Vocabulary Continuous Speech Recognition (LVCSR) systems. Here, the use of morpheme and syllable based units is investigated for building sub-lexical LMs. A different type of sub-lexical units is proposed based on combining morphemic or syllabic units with corresponding pronunciations. Thereby, a set of grapheme-phoneme pairs called graphones are used for building LMs. A relative reduction of 3.5% in Word Error Rate (WER) is obtained with respect to a traditional system based on full-words. © 2011 IEEE.

Keywords: Continuous speech recognition
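
The link between inflection and OOV rate described in the abstract above can be shown with a toy calculation. The sketch below measures the OOV rate of a small test text against a full-word vocabulary and against a sub-word vocabulary. The word lists and the hand-made stem and suffix splits (marked with a leading `+`) are hypothetical examples invented for this illustration; they do not come from the paper or from any morphological analyzer.

```python
def oov_rate(tokens, vocabulary):
    """Fraction of tokens that are not covered by the vocabulary."""
    unknown = sum(1 for t in tokens if t not in vocabulary)
    return unknown / len(tokens)

# Full-word units: inflected forms unseen in training count as OOV.
word_vocab = {"dom", "domy", "kot", "kotami"}
test_words = ["domami", "koty"]

# Sub-word units: the same test text, split by hand into stems and suffixes.
subword_vocab = {"dom", "kot", "+y", "+ami"}
test_subwords = ["dom", "+ami", "kot", "+y"]

word_oov = oov_rate(test_words, word_vocab)
subword_oov = oov_rate(test_subwords, subword_vocab)
print(f"full-word OOV rate: {word_oov:.2f}, sub-word OOV rate: {subword_oov:.2f}")
```

Both test words are new inflected forms, so the full-word vocabulary misses them, while their stems and suffixes were already seen separately. Real systems show a far less extreme gap than this constructed case.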

Hybrid language models using mixed types of sub-lexical units for open vocabulary German LVCSR

12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011

Authors: Ali Basha Shaik, M.; El-Desoky Mousa, Amr; Schlüter, Ralf; Ney, Hermann. Affiliation: Human Language Technology and Pattern Recognition, Computer Science Department, RWTH Aachen University, 52056 Aachen, Germany

German is a highly inflected language with a large number of words derived from the same root. It makes use of a high degree of word compounding, leading to high out-of-vocabulary (OOV) rates and Language Model (LM) perplexities. For such languages, the use of sub-lexical units for Large Vocabulary Continuous Speech Recognition (LVCSR) becomes a natural choice. In this paper, we investigate the use of mixed types of sub-lexical units in the same recognition lexicon, namely morphemic or syllabic units combined with pronunciations (called graphones) and normal graphemic morphemes or syllables, along with full-words. This mixture of units is used for building hybrid LMs suitable for open vocabulary LVCSR, where the system operates over an open, constantly changing vocabulary as in broadcast news, political debates, etc. A relative reduction of around 5.0% in Word Error Rate (WER) is obtained compared to a traditional full-words system. Moreover, around 40% of the OOVs are recognized. Copyright © 2011 ISCA.

Keywords: Computational linguistics

Morpheme based Factored Language Models for German LVCSR

12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011

Authors: El-Desoky Mousa, Amr; Ali Basha Shaik, M.; Schlüter, Ralf; Ney, Hermann. Affiliation: Human Language Technology and Pattern Recognition, Computer Science Department, RWTH Aachen University, 52056 Aachen, Germany

German is a highly inflectional language, where a large number of words can be generated from the same root. It makes liberal use of compounding, leading to high out-of-vocabulary (OOV) rates and poor Language Model (LM) probability estimates. Therefore, the use of morphemes for language modeling is considered a better choice for Large Vocabulary Continuous Speech Recognition (LVCSR) than full-words. Thereby, better lexical coverage and lower LM perplexities are achieved. On the other hand, the use of Factored Language Models (FLMs) is considered a successful approach that allows the integration of many information sources to get better LM probability estimates. In this paper, we try a combined methodology for language modeling where both morphological decomposition and factored language modeling are used in one model called morpheme based FLM. Finally, we obtain around 2.5% relative reduction in Word Error Rate (WER) with respect to a traditional full-words system. Copyright © 2011 ISCA.

Keywords: Modeling languages
