Search Results - Inner Mongolia University Library

PACLIC 25 - Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation, 2011, pp. 11-19

Authors: Kwong, Oi Yee (Department of Chinese Translation and Linguistics, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong)

ISBN: (print) 9784905166023

In this paper, we propose a simple and intuitive yet linguistically and practically motivated method for English-Chinese name transliteration generation. Our system is essentially a syllable-based Maximum Matching system. It uses the Onset First Principle to syllabify English names and align them with Chinese names. The bilingual lexicon containing aligned segments of various syllable lengths subsequently allows direct transliteration by chunks. The proposed method was tested on the data from the shared task of the Named Entities workshop 2009. The results suggest that Forward Maximum Matching performed slightly better than Backward Maximum Matching, but when used together much better results comparable to those of state-of-the-art methods could be attained. © 2011 by Oi Yee Kwong.
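
The chunk-by-chunk lookup described in this abstract can be illustrated with a generic Forward Maximum Matching routine. This is only a minimal sketch, not the author's system: the function name, the `max_len` limit, the `"?"` fallback and the toy lexicon entries are all assumptions made for illustration, and the input is assumed to be already syllabified.

```python
def forward_maximum_matching(syllables, lexicon, max_len=3):
    """Greedy left-to-right segmentation over a list of source syllables.

    At each position, take the longest run of syllables (up to max_len)
    that has an entry in the lexicon and emit its target-side chunk.
    """
    output = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate chunk first, then shorter ones.
        for n in range(min(max_len, len(syllables) - i), 0, -1):
            chunk = tuple(syllables[i:i + n])
            if chunk in lexicon:
                output.append(lexicon[chunk])
                i += n
                break
        else:
            # No lexicon entry covers this syllable (assumed fallback).
            output.append("?")
            i += 1
    return output


# Toy lexicon, invented for illustration only.
TOY_LEXICON = {
    ("to",): "托",
    ("ny",): "尼",
    ("to", "ny"): "托尼",
}

print(forward_maximum_matching(["to", "ny"], TOY_LEXICON))
```

A Backward Maximum Matching variant would scan from the end of the syllable list instead. How the two directions are combined is not specified in the abstract, so it is not sketched here.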

Keywords: natural language processing systems

Graph-based clustering for computational linguistics: A survey

5th Workshop on Graph-based Methods for Natural Language Processing, TextGraphs 2010

Authors: Chen, Zheng; Ji, Heng (Graduate Center, City University of New York, United States; Graduate Center, Queens College, City University of New York, United States)

ISBN: (print) 1932432779

In this survey we overview graph-based clustering and its applications in computational linguistics. We summarize graph-based clustering as a five-part story: hypothesis, modeling, measure, algorithm and evaluation. We then survey three typical NLP problems in which graph-based clustering approaches have been successfully applied. Finally, we comment on the strengths and weaknesses of graph-based clustering and envision that graph-based clustering is a promising solution for some emerging NLP problems. © 2010 The Association for Computational Linguistics.

Keywords: natural language processing systems

Robust and efficient page rank for word sense disambiguation

5th Workshop on Graph-based Methods for Natural Language Processing, TextGraphs 2010

Authors: De Cao, Diego; Basili, Roberto; Luciani, Matteo; Mesiano, Francesco; Rossi, Riccardo (Dept. of Computer Science, University of Roma Tor Vergata, Rome, Italy)

ISBN: (print) 1932432779

Graph-based methods that are en vogue in the social network analysis area, such as centrality models, have recently been applied to linguistic knowledge bases, including for unsupervised Word Sense Disambiguation. Although the achievable accuracy is rather high, the main drawback of these methods is their high computational demand when applied to large-scale sense repositories. In this paper, an adaptation of the PageRank algorithm recently proposed for Word Sense Disambiguation is presented that preserves the reachable accuracy while significantly reducing the required processing time. Experimental analysis over well-known benchmarks is presented, and the results confirm our hypothesis. © 2010 The Association for Computational Linguistics.
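
For orientation, the standard PageRank power iteration that such methods build on can be sketched as follows. This is the generic textbook algorithm, not the adaptation proposed in the paper. The damping factor, the iteration count, the handling of dangling nodes and the toy graph are assumptions chosen for illustration.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank.

    graph maps each node to a list of its successors; every successor
    is assumed to appear as a key of graph as well.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every node starts each round with the teleportation mass.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            successors = graph[v]
            if successors:
                share = damping * rank[v] / len(successors)
                for w in successors:
                    new_rank[w] += share
            else:
                # Dangling node: spread its rank evenly (assumed convention).
                share = damping * rank[v] / n
                for w in nodes:
                    new_rank[w] += share
        rank = new_rank
    return rank


# Hypothetical three-node graph, e.g. word senses linked by relations.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(toy_graph)
print(max(ranks, key=ranks.get))
```

Each iteration touches every edge once, which is why the cost grows with the size of the sense repository.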

Keywords: graphic methods

Co-occurrence cluster features for lexical substitutions in context

5th Workshop on Graph-based Methods for Natural Language Processing, TextGraphs 2010

Authors: Biemann, Chris (475 Brannan St, Ste. 330, San Francisco, CA 94107, United States)

ISBN: (print) 1932432779

This paper examines the influence of features based on clusters of co-occurrences for supervised Word Sense Disambiguation and Lexical Substitution. Co-occurrence cluster features are derived from clustering the local neighborhood of a target word in a co-occurrence graph based on a corpus in a completely unsupervised fashion. Clusters can be assigned in context and are used as features in a supervised WSD system. Experiments fitting a strong baseline system with these additional features are conducted on two datasets, showing improvements. Co-occurrence features are a simple way to mimic Topic Signatures (Martínez et al., 2008) without needing to construct resources manually. Further, a system is described that produces lexical substitutions in context with very high precision. © 2010 The Association for Computational Linguistics.

Keywords: graphic methods

An investigation on the influence of frequency on the lexical organization of verbs

5th Workshop on Graph-based Methods for Natural Language Processing, TextGraphs 2010

Authors: Germann, Daniel Cerato; Villavicencio, Aline; Siqueira, Maity (Institute of Informatics, Federal University of Rio Grande do Sul, Brazil; Department of Computer Sciences, Bath University, United Kingdom; Institute of Language Studies, Federal University of Rio Grande do Sul, Brazil)

ISBN: (print) 1932432779

This work extends the study of Germann et al. (2010) in investigating the lexical organization of verbs. Particularly, we look at the influence of frequency on the process of lexical acquisition and use. We examine data obtained from psycholinguistic action naming tasks performed by children and adults (speakers of Brazilian Portuguese), and analyze some characteristics of the verbs used by each group in terms of similarity of content, using Jaccard's coefficient, and of topology, using graph theory. The experiments suggest that younger children tend to use more frequent verbs than adults to describe events in the world. © 2010 The Association for Computational Linguistics.
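
Jaccard's coefficient, mentioned above as the content-similarity measure, is the size of the intersection of two sets divided by the size of their union. A minimal sketch follows. The verb sets are invented placeholders, not data from the study, and returning 0.0 for two empty sets is an assumed convention.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Both sets empty: similarity is undefined, 0.0 is assumed here.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical verb vocabularies of two speaker groups.
group_one = {"cut", "break", "open"}
group_two = {"cut", "tear", "open", "split"}
print(jaccard(group_one, group_two))  # 2 shared verbs out of 5 distinct ones
```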

Keywords: graph theory

Finite-State Methods and Natural Language Processing - Post-proceedings of the 7th International Workshop FSMNLP 2008 - Volume 191, Frontiers in Artificial Intelligence and Applications

Series: Frontiers in Artificial Intelligence and Applications 191

2009

Authors: Jakub Piskorski; Bruce William Watson; Anssi Yli-Jyrä

ISBN: (print) 9781586039752

These proceedings contain the final versions of the papers presented at the 7th International Workshop on Finite-State Methods and Natural Language Processing (FSMNLP), held in Ispra, Italy, on September 11-12, 2008. The aim of the FSMNLP workshops is to bring together members of the research and industrial community working on finite-state based models in language technology, computational linguistics, web mining, linguistics and cognitive science on one hand, and on related theory and methods in fields such as computer science and mathematics on the other. Thus, the workshop series is a forum for researchers and practitioners working on applications as well as theoretical and implementation aspects. The special theme of FSMNLP 2008 was high performance finite-state devices in large-scale natural language text processing systems and applications. The papers in this publication cover a range of interesting NLP applications, including machine learning and translation, logic, computational phonology, morphology and semantics, data mining, information extraction and disambiguation, as well as programming, optimization and compression of finite-state networks. The applied methods include weighted algorithms, kernels and tree automata. In addition, relevant aspects of software engineering, standardization and European funding programs are ***

Keywords: natural language processing; Speech Recognition/Synthesis

Efficient graph-based semi-supervised learning of structured tagging models


Conference on Empirical Methods in Natural Language Processing, EMNLP 2010

Authors: Subramanya, Amarnag; Petrov, Slav; Pereira, Fernando (Google Research, Mountain View, CA 94043, United States; Google Research, New York, NY 10011, United States)

ISBN: (print) 1932432868

We describe a new scalable algorithm for semi-supervised training of conditional random fields (CRF) and its application to part-of-speech (POS) tagging. The algorithm uses a similarity graph to encourage similar n-grams to have similar POS tags. We demonstrate the efficacy of our approach on a domain adaptation task, where we assume that we have access to large amounts of unlabeled data from the target domain, but no additional labeled data. The similarity graph is used during training to smooth the state posteriors on the target domain. Standard inference can be used at test time. Our approach is able to scale to very large problems and yields significantly improved target domain accuracy. © 2010 Association for Computational Linguistics.

Keywords: graphic methods

PLN-E 2010 - Proceedings of the Workshop NLP in the Enterprise: Envisioning the Next 10 Years, held at the 26th Spanish Conference on Natural Language Processing, SEPLN 2010

Workshop NLP in the Enterprise: Envisioning the Next 10 Years, PLN-E 2010 - held at the 26th Spanish Conference on Natural Language Processing, SEPLN 2010

The proceedings contain 13 papers. The topics discussed include: lexisla: a legislative information retrieval system; mOCRA: mobile OCR application; enterprise 2.0: plagiarism detection and opinion analysis; NLP techniques & the Internet: searching for opinions and automatic sentiments analysis; naturalOpinions: NLP-based opinion extraction in user-generated content; Babxel: multilingual search; mobile augmented information system; trust based recommendations for social media; approximate retrieval of postal addresses; personalized health information system; natural language processing interactive multimodal systems; an approach on improving search engines through social content recommendation; and towards natural language interaction.

Spectral methods for thesaurus construction

IEICE Transactions on Information and Systems, 2010, vol. E93D, no. 6, pp. 1378-1385

Authors: Shimizu, Nobuyuki; Sugiyama, Masashi; Nakagawa, Hiroshi (Univ Tokyo, Ctr Informat Technol, Tokyo 1130033, Japan; Tokyo Inst Technol, Dept Comp Sci, Tokyo 1528550, Japan)

Traditionally, popular synonym acquisition methods are based on the distributional hypothesis, and a metric such as Jaccard coefficients is used to evaluate the similarity between the contexts of words to obtain synonyms for a query. On the other hand, when one tries to compile and clean a thesaurus, one often already has a modest number of synonym relations at hand. Could something be done with a half-built thesaurus alone? We propose the use of spectral methods and discuss their relation to other network-based algorithms in natural language processing (NLP), such as Page Rank and Bootstrapping. Since compiling a thesaurus is very laborious, we believe that adding the proposed method to the toolkit of thesaurus constructors would significantly ease the pain in accomplishing this task.

Keywords: synonym acquisition; synonym extraction; thesaurus; spectral clustering; graph Laplacian

Excavating grey literature: A case study on the rich indexing of archaeological documents via natural language-processing techniques and knowledge-based resources

Aslib Proceedings, 2010, vol. 62, no. 4-5, pp. 466-475

Authors: Vlachidis, Andreas; Binding, Ceri; Tudhope, Douglas; May, Keith (Univ Glamorgan, Fac Adv Technol, Hypermedia Res Unit, Pontypridd CF37 1DL, M Glam, Wales; English Heritage, Portsmouth, Hants, England)

Purpose - This paper sets out to discuss the use of information extraction (IE), a natural language-processing (NLP) technique to assist "rich" semantic indexing of diverse archaeological text resources. The focus of the research is to direct a semantic-aware "rich" indexing of diverse natural language resources with properties capable of satisfying information retrieval from online publications and datasets associated with the Semantic Technologies for Archaeological Resources (STAR) project. Design/methodology/approach - The paper proposes use of the English Heritage extension (CRM-EH) of the standard core ontology in cultural heritage, CIDOC CRM, and exploitation of domain thesauri resources for driving and enhancing an Ontology-Oriented Information Extraction process. The process of semantic indexing is based on a rule-based Information Extraction technique, which is facilitated by the General Architecture for Text Engineering (GATE) toolkit and expressed by Java Annotation Patterns Engine (JAPE) rules. Findings - Initial results suggest that the combination of information extraction with knowledge resources and standard conceptual models is capable of supporting semantic-aware term indexing. Additional efforts are required for further exploitation of the technique and adoption of formal evaluation methods for assessing the performance of the method in measurable terms. Originality/value - The value of the paper lies in the semantic indexing of 535 unpublished online documents often referred to as "Grey Literature", from the Archaeological Data Service OASIS corpus (Online AccesS to the Index of archaeological investigationS), with respect to the CRM ontological concepts *** Appellation and *** Object.

Keywords: Information management; Semantics; Data handling
