Search results - Inner Mongolia University Library

A grammatical approach to understanding textual tables using two-dimensional SCFGs

21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, COLING/ACL 2006

Authors: Wu, Dekai; Lee, Ken Wing Kuen. Human Language Technology Center, HKUST, Department of Computer Science and Engineering, University of Science and Technology, Clear Water Bay, Hong Kong

We present an elegant and extensible model that is capable of providing semantic interpretations for an unusually wide range of textual tables in documents. Unlike the few existing table analysis models, which largely rely on relatively ad hoc heuristics, our linguistically-oriented approach is systematic and grammar based, which allows our model (1) to be concise and yet (2) recognize a wider range of data models than others, and (3) disambiguate to a significantly finer extent the underlying semantic interpretation of the table in terms of data models drawn from relation database theory. To accomplish this, the model introduces Viterbi parsing under two-dimensional stochastic CFGs. The cleaner grammatical approach facilitates not only greater coverage, but also grammar extension and maintenance, as well as a more direct and declarative link to semantic interpretation, for which we also introduce a new, cleaner data model. In disambiguation experiments on recognizing relevant data models of unseen web tables from different domains, a blind evaluation of the model showed 60% precision and 80% recall. © 2006 Association for Computational Linguistics

Keywords: Stochastic systems

Performing aggregation and ellipsis using discourse structures

Research on Language and Computation, 2006, vol. 4, no. 4, pp. 353-375

Authors: Theune, Mariët; Hielkema, Feikje; Hendriks, Petra. Human Media Interaction, Department of Computer Science, University of Twente, Enschede, Netherlands; Department of Computing Science, University of Aberdeen, Aberdeen, United Kingdom; Center for Language and Cognition Groningen, University of Groningen, Groningen, Netherlands

This article describes the generation of aggregated and elliptic sentences, using Dependency Trees connected by rhetorical relations as input. The system we have developed can generate both hypotactic and paratactic constructions with appropriate cue words, and various forms of ellipsis such as Gapping and Conjunction Reduction. We contend that Dependency Trees connected by rhetorical relations are excellent input for a generation system that has to generate ellipsis, and we propose a taxonomy of the most common Dutch cue words, grouped according to the kind of discourse relations they signal. Finally, we argue that syntactic aggregation should be performed in the Surface Realizer of a language generation system, because it requires access to language-specific syntactic information. © Springer Science+Business Media 2007.

Keywords: Aggregation; Dependency trees; Discourse structure; Ellipsis; Language generation

Boosting for Chinese named entity recognition

5th SIGHAN Workshop on Chinese Language Processing, co-located with COLING/ACL 2006

Authors: Yu, Xiaofeng; Carpuat, Marine; Wu, Dekai. Human Language Technology Center, HKUST, Department of Computer Science and Engineering, University of Science and Technology, Clear Water Bay, Hong Kong

ISBN: (print) 1932432701

We report an experiment in which a high-performance boosting based NER model originally designed for multiple European languages is instead applied to the Chinese named entity recognition task of the third SIGHAN Chinese language processing bakeoff. Using a simple character-based model along with a set of features that are easily obtained from the Chinese input strings, the system described employs boosting, a promising and theoretically well-founded machine learning method to combine a set of weak classifiers together into a final system. Even though we did no other Chinese-specific tuning, and used only one-third of the MSRA and CityU corpora to train the system, reasonable results are obtained. Our evaluation results show that 75.07 and 80.51 overall F-measures were obtained on MSRA and CityU test sets respectively. © 2006 Association for Computational Linguistics.

Keywords: Adaptive boosting

BESTCUT: A graph algorithm for coreference resolution

11th Conference on Empirical Methods in Natural Language Processing, EMNLP 2006, held in conjunction with COLING/ACL 2006

Authors: Nicolae, Cristina; Nicolae, Gabriel. Human Language Technology Research Institute, Department of Computer Science, University of Texas at Dallas, Richardson, TX 75083-0688, United States

ISBN: (print) 1932432736

In this paper we describe a coreference resolution method that employs a classification and a clusterization phase. In a novel way, the clusterization is produced as a graph cutting algorithm, in which nodes of the graph correspond to the mentions of the text, whereas the edges of the graph constitute the confidences derived from the coreference classification. In experiments, the graph cutting algorithm for coreference resolution, called BESTCUT, achieves state-of-the-art performance. © 2006 Association for Computational Linguistics.

Keywords: Graphic methods

A quality evaluation technique of RFID middleware in ubiquitous computing

2006 International Conference on Hybrid Information Technology, ICHIT 2006

Authors: Gioug, Oh; Dooyeon, Kim; Sangil, Kim; Sungyul, Rhew. Department of Computer Information, Kangwon Tourism College, Republic of Korea; Ministry of Education and Human Resource Development, Seoul, Republic of Korea; Department of Computer Science, Soongsil University, Republic of Korea

With ubiquitous computing systems, users can access information through a computer network at any time and in any place. The basic infrastructure of a ubiquitous computing system is a wireless network environment, and an RFID (Radio Frequency Identification) system is composed of tags, readers, middleware, application services, etc., and uses networks. RFID middleware is system software that collects a large volume of raw data generated in an RFID environment, filters the data, summarizes them into meaningful information, and delivers the information to application services. RFID middleware links hardware to conventional middleware. Previous research on RFID middleware has covered middleware from SUN, SAP, IBM, Microsoft, Oracle, etc. These products attach importance to different quality characteristics, and there has been little research on the quality properties of RFID middleware. The present study examined functionality, reliability, usability, efficiency and portability among the quality characteristics of software in the international standard ISO/IEC 9126, as well as the quality elements of the standard RFID middleware of EPC Global, and based on them we extracted and analyzed items for evaluating the quality of RFID middleware in ubiquitous computing systems. Using the AHP (Analytic Hierarchy Process), which enables rational decision making by simplifying complicated problems, we evaluated the subjective characteristics of stakeholders in an objective way and proposed a selection method that evaluates quality using quality evaluation criteria. The proposed evaluation and selection method is useful for developers who are going to develop RFID middleware in areas such as distribution and logistics to select RFID middleware suitable for their environment. © 2003 IEEE.

Keywords: Middleware

The RWTH Statistical Machine Translation System for the IWSLT 2006 Evaluation

3rd International Workshop on Spoken Language Translation, IWSLT 2006

Authors: Mauser, Arne; Zens, Richard; Matusov, Evgeny; Hasan, Saša; Ney, Hermann. Human Language Technology and Pattern Recognition, Lehrstuhl für Informatik 6, Computer Science Department, RWTH Aachen University, Aachen D-52056, Germany

We give an overview of the RWTH phrase-based statistical machine translation system that was used in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2006. The system was ranked first with respect to the BLEU measure in all language pairs it was used. Using a two-pass approach, we first generate the N best translation candidates. The second pass consists of rescoring and reranking these candidates. We will give a description of the search algorithm as well as of the models used in each pass. We will also describe our method for dealing with punctuation restoration, in order to overcome the difficulties of spoken language translation. This work also includes a brief description of the system combination done by the partners participating in the European TC-Star project. © 2006 International Workshop on Spoken Language Translation, IWSLT 2006. All rights reserved.

Keywords: Computer aided language translation

N-Gram posterior probabilities for statistical machine translation

2006 Workshop on Statistical Machine Translation, WMT 2006, collocated with the HLT-NAACL 2006

Authors: Zens, Richard; Ney, Hermann. Human Language Technology and Pattern Recognition, Lehrstuhl für Informatik 6, Computer Science Department, RWTH Aachen University, Aachen D-52056, Germany

Word posterior probabilities are a common approach for confidence estimation in automatic speech recognition and machine translation. We will generalize this idea and introduce n-gram posterior probabilities and show how these can be used to improve translation quality. Additionally, we will introduce a sentence length model based on posterior probabilities. We will show significant improvements on the Chinese-English NIST task. The absolute improvement in BLEU score is between 1.1% and 1.6%. © HLT-NAACL *** right reserved.

Keywords: Machine translation

Discriminative reordering models for statistical machine translation

2006 Workshop on Statistical Machine Translation, WMT 2006, collocated with the HLT-NAACL 2006

Authors: Zens, Richard; Ney, Hermann. Human Language Technology and Pattern Recognition, Lehrstuhl für Informatik 6, Computer Science Department, RWTH Aachen University, Aachen D-52056, Germany

We present discriminative reordering models for phrase-based statistical machine translation. The models are trained using the maximum entropy principle. We use several types of features: based on words, based on word classes, based on the local context. We evaluate the overall performance of the reordering models as well as the contribution of the individual feature types on a word-aligned corpus. Additionally, we show improved translation performance using these reordering models compared to a state-of-the-art baseline system. © HLT-NAACL *** right reserved.

Keywords: Computer aided language translation

Automatic learning of Chinese English semantic structure mapping

IEEE Spoken Language Technology Workshop

Authors: Pascale Fung; Wu Zhaojun; Yang Yongsheng; Dekai Wu. Human Language Technology Center, Department of Electronic and Computer Engineering; Department of Computer Science and Engineering, University of Science and Technology (HKUST), Hong Kong, China

We present twin results on Chinese semantic parsing, with application to English-Chinese cross-lingual verb frame acquisition. First, we describe two new state-of-the-art Chinese shallow semantic parsers leading to an F-score of 82.01 on simultaneous frame and argument boundary identification and labeling. Subsequently, we propose a model that applies the separate Chinese and English semantic parsers to learn cross-lingual semantic verb frame argument mappings with 89.3% accuracy. The only training data needed by this cross-lingual learning model is a pair of non-parallel monolingual Propbanks, plus an unannotated parallel corpus. We also present the first reported controlled comparison of maximum entropy and SVM approaches to shallow semantic parsing, using the Chinese data.

Keywords: Natural languages; Labeling; Training data; Error correction; US Government; Humans; Computer science; Application software; Entropy; Support vector machines

Inversion transduction grammar coverage of Arabic-English word alignment for tree-structured statistical machine translation

IEEE Spoken Language Technology Workshop

Authors: Dekai Wu; Marine Carpuat; Yihai Shen. HKUST, Department of Computer Science and Engineering, Human Language Technology Center, Hong Kong, China

We present the first known direct measurement of word alignment coverage on an Arabic-English parallel corpus using inversion transduction grammar constraints. While direct measurements have been reported for several European and Asian languages, to date no results have been available for Arabic or any Semitic language despite much recent activity on Arabic-English spoken language and text translation. Many recent syntax-based statistical MT models operate within the domain of ITG expressiveness, often for efficiency reasons, so it has become important to determine the extent to which the ITG constraint assumption holds. Our results on Arabic provide further evidence that ITG expressiveness appears largely sufficient for core MT models.

Keywords: Natural languages; Decoding; Context modeling; Hidden Markov models; Error analysis; Humans; Marine technology; Computer science; Oral communication; Formal languages
