Best language recognition performance is commonly obtained by fusing the scores of several heterogeneous systems. Regardless of the fusion approach, it is assumed that different systems may contribute complementary information, either because they are developed on different datasets, or because they use different features or different modeling approaches. Most authors apply fusion as a last resort for improving performance based on an existing set of systems. Though relative performance gains decrease as larger sets of systems are considered, best performance is usually attained by fusing all the available systems, which may lead to high computational costs. In this paper, we aim to discover which technologies combine best through fusion and to analyse the factors (data, features, modeling methodologies, etc.) that may explain such good performance. Results are presented and discussed for a number of systems provided by the participating sites and the organizing team of the Albayzin 2010 Language Recognition Evaluation. We hope the conclusions of this work help research groups make better decisions in developing language recognition technology.
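The evaluation's actual fusion back-end is not reproduced here; as a point of reference only, score-level fusion is commonly implemented as a weighted linear combination of per-system score vectors, with the weights (and an optional offset) trained by multiclass logistic regression on development data. A minimal sketch under that assumption, with hypothetical names and data:

```python
import numpy as np

def fuse_scores(system_scores, weights, offset=0.0):
    """Weighted linear score-level fusion.

    system_scores : list of (n_trials, n_languages) arrays, one per subsystem.
    weights       : one scalar weight per subsystem (e.g. trained by
                    multiclass logistic regression on a development set).
    offset        : optional bias added to the fused scores.
    """
    fused = np.zeros_like(system_scores[0], dtype=float)
    for w, scores in zip(weights, system_scores):
        fused += w * scores
    return fused + offset

# Hypothetical example: three subsystems, four trials, two target languages.
rng = np.random.default_rng(0)
subsystem_scores = [rng.normal(size=(4, 2)) for _ in range(3)]
print(fuse_scores(subsystem_scores, weights=[0.5, 0.3, 0.2]))
```

Fusing a smaller, well-chosen subset of subsystems with such a combiner is exactly the trade-off the paper examines: each additional subsystem adds computational cost while its marginal contribution to the fused score shrinks.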
Chinese word segmentation is an active area in Chinese language processing, though it suffers from ongoing disagreement about what precisely constitutes a word in Chinese. Based on a corpus-based segmentation standard, we launched t...
ISBN (print): 9783642003813
The Monge-Elkan method is a general text string comparison method based on an internal character-based similarity measure (e.g. edit distance) combined with a token-level (i.e. word-level) similarity measure. We propose a generalization of this method based on the notion of the generalized arithmetic mean instead of the simple average used in the original Monge-Elkan expression. The experiments carried out with 12 well-known name-matching data sets show that the proposed approach outperforms the original Monge-Elkan method when character-based measures are used to compare tokens.
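Concretely, the original Monge-Elkan score averages, for each token of the first string, its best character-level match in the second string; the proposed generalization replaces that simple average with a generalized (power) mean of exponent m. A minimal sketch, using difflib's ratio as a stand-in for the internal character-based measure (the measures used in the paper, such as edit distance, can be substituted):

```python
from difflib import SequenceMatcher

def char_sim(a, b):
    """Internal character-based similarity in [0, 1]
    (here: difflib's ratio, used only as an illustrative stand-in)."""
    return SequenceMatcher(None, a, b).ratio()

def monge_elkan(tokens_a, tokens_b, m=1.0, sim=char_sim):
    """Generalized Monge-Elkan similarity between two token sequences.

    For each token of tokens_a, take its best similarity against tokens_b,
    then combine these maxima with the generalized (power) mean of exponent m.
    """
    total = 0.0
    for a in tokens_a:
        best = max(sim(a, b) for b in tokens_b)
        total += best ** m
    return (total / len(tokens_a)) ** (1.0 / m)

# Hypothetical name-matching example.
print(monge_elkan("juan c lopez".split(), "lopez, juan carlos".split(), m=2.0))
```

With m = 1 this reduces to the original arithmetic-mean formulation; larger m gives more weight to the best-matching token pairs.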
For most English words, dictionaries give various senses: e.g., "bank" can stand for a financial institution, shore, set, etc. Automatic selection of the sense intended in a given text has crucial importance...
The paper presents a method of automatic enrichment of a very large dictionary of word combinations. The method is based on results of automatic syntactic analysis (parsing) of sentences. The dependency formalism is u...
This paper describes a system dedicated to online handwritten sentence recognition. The prototype is made up of two basic processors. The first controls data acquisition, pen-tip trace segmentation and letter identification. The second aims at identifying and correcting the word candidates by integrating syntactic and lexical information. Sentences are parsed to list the grammatical classes of each incorrect candidate; a lexical query then searches for words in a lexicon according to those grammatical classes. A final decision is made using a string comparison algorithm. Tests of the complete system are reported at the end for a typical writer-dependent application.
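The abstract does not spell out the final decision step; one plausible reading is that each misrecognized candidate is replaced by the lexicon word of a grammatically admissible class that minimizes a string-comparison cost such as Levenshtein distance. A hedged sketch of that step, with a hypothetical lexicon and helper names:

```python
def edit_distance(s, t):
    """Standard Levenshtein distance via dynamic programming (two rows)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def correct_word(candidate, lexicon, allowed_classes):
    """Pick the lexicon word of an allowed grammatical class that is
    closest (by edit distance) to the misrecognized candidate."""
    pool = [word for word, cls in lexicon if cls in allowed_classes]
    return min(pool, key=lambda word: edit_distance(candidate, word))

# Hypothetical lexicon of (word, grammatical class) pairs.
lexicon = [("write", "verb"), ("white", "adjective"), ("wrote", "verb")]
print(correct_word("writte", lexicon, allowed_classes={"verb"}))
```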
ISBN (digital): 9783031021787
ISBN (print): 9783031010507
Opportunity and Curiosity find similar rocks on Mars. One can generally understand this statement if one knows that Opportunity and Curiosity are instances of the class of Mars rovers, and recognizes that, as signalled by the word on, rocks are located on Mars. Two mental operations contribute to understanding: recognize how entities/concepts mentioned in a text interact and recall already known facts (which often themselves consist of relations between entities/concepts). Concept interactions one identifies in the text can be added to the repository of known facts, and aid the processing of future texts. The amassed knowledge can assist many advanced language-processing tasks, including summarization, question answering and machine translation. Semantic relations are the connections we perceive between things which interact. The book explores two, now intertwined, threads in semantic relations: how they are expressed in texts and what role they play in knowledge repositories. A historical perspective takes us back more than 2000 years to their beginnings, and then to developments much closer to our time: various attempts at producing lists of semantic relations, necessary and sufficient to express the interaction between entities/concepts. A look at relations outside context, then in general texts, and then in texts in specialized domains, has gradually brought new insights, and led to essential adjustments in how the relations are seen. At the same time, datasets which encompass these phenomena have become available. They started small, then grew somewhat, then became truly large. The large resources are inevitably noisy because they are constructed automatically. The available corpora—to be analyzed, or used to gather relational evidence—have also grown, and some systems now operate at the Web scale. The learning of semantic relations has proceeded in parallel, in adherence to supervised, unsupervised or distantly supervised paradigms. Detailed analyses of annota