Search Results - Inner Mongolia University Library

1st Workshop on High-Performance Computing for the Semantic Web 2011, HPCSW 2011 - Co-located with the 8th Extended Semantic Web Conference, ESWC 2011

Authors: Assel, Matthias; Cheptsov, Alexey; Czink, Blasius; Damljanovic, Danica; Quesada, Jose. Affiliations: HLRS - High Performance Computing Center Stuttgart, University of Stuttgart, Nobelstrasse 19, 70569 Stuttgart, Germany; Department of Computer Science, Natural Language Processing Group, University of Sheffield, Regent Court, 211 Portobello, S1 4DP Sheffield, United Kingdom; Center for Adaptive Behavior and Cognition, Max Planck Institute for Human Development, Lentzeallee 94, 14195 Berlin, Germany

With billions of triples in the Linked Open Data cloud, which continues to grow exponentially, very challenging tasks begin to emerge related to the exploitation of large-scale reasoning. A considerable amount of work has been done in the area of using Information Retrieval methods to address these problems. However, although applied models work on Web scale, they downgrade the semantics contained in an RDF graph by observing each physical resource as a 'bag of words (URIs/literals)'. Distributional statistical methods can address this problem by capturing the structure of the graph more efficiently. However, these methods are continually confronted with efficiency and scalability problems on serial computing architectures due to their computational complexity. In this paper, we describe a parallelization algorithm for one such method (Random Indexing), based on the Message-Passing Interface (MPI), that enables efficient utilization of high-performance parallel computers. Our evaluation results show significant performance improvement.

Keywords: Message passing


The Potsdam NLG systems at the GIVE-2.5 Challenge


13th European Workshop on Natural Language Generation, ENLG 2011

Authors: Garoufi, Konstantina; Koller, Alexander. Affiliation: Area of Excellence Cognitive Sciences, University of Potsdam, Germany

We present the Potsdam natural language generation systems P1 and P2 of the GIVE-2.5 Challenge. The systems implement two different referring expression generation models from Garoufi and Koller (2011) while behaving identically in all other respects. In particular, P1 combines symbolic and corpus-based methods for the generation of successful referring expressions, while P2 is based on a purely symbolic model which serves as a qualified baseline for comparison. We describe how the systems operated in the challenge and discuss the results, which indicate that P1 outperforms P2 in terms of several measures of referring expression success. © 2011 Association for Computational Linguistics.

Keywords: natural language processing systems


Disambiguation of medline abstracts using topic models


ACM 5th International Workshop on Data and Text Mining in Biomedical Informatics, DTMBIO'11, in Conjunction with the 20th ACM International Conference on Information and Knowledge Management, CIKM'11

Author: Stevenson, Mark. Affiliation: Natural Language Processing Group, Department of Computer Science, Sheffield University, Regent Court, 211 Portobello, Sheffield, United Kingdom

ISBN: (print) 9781450309608

Topic models are an established technique for generating information about the subjects discussed in collections of documents. Latent Dirichlet Allocation (LDA) is a widely applied topic model. The topic models generated by LDA consist of sets of terms associated with each topic and these are used to provide context for a Word Sense Disambiguation (WSD) system. It is found that using this context leads to a statistically significant improvement in the performance of a graph-based WSD system when applied to a standard evaluation resource. © 2011 ACM.

Keywords: graphic methods


A knowledge discovery and reuse pipeline for information extraction in clinical notes


Journal of the American Medical Informatics Association, 2011, vol. 18, no. 5, pp. 574-579

Authors: Patrick, Jon D.; Nguyen, Dung H. M.; Wang, Yefeng; Li, Min. Affiliation: Univ Sydney, Sch Informat Technol, Fac Engn & IT, Sydney, NSW 2006, Australia

Objective: Information extraction and classification of clinical data are current challenges in natural language processing. This paper presents a cascaded method to deal with three different extractions and classifications in clinical data: concept annotation, assertion classification and relation classification. Materials and methods: A pipeline system was developed for clinical natural language processing that includes a proofreading process, with gold-standard reflexive validation and correction. The information extraction system is a combination of a machine learning approach and a rule-based approach. The outputs of this system are used for evaluation in all three tiers of the fourth i2b2/VA shared-task and workshop challenge. Results: Overall concept classification attained an F-score of 83.3% against a baseline of 77.0%, the optimal F-score for assertions about the concepts was 92.4%, and the relation classifier attained 72.6% for relationships between clinical concepts against a baseline of 71.0%. Micro-average results for the challenge test set were 81.79%, 91.90% and 70.18%, respectively. Discussion: The challenge in the multi-task test requires a distribution of time and work load for each individual task so that the overall performance evaluation on all three tasks would be more informative rather than treating each task assessment as independent. The simplicity of the model developed in this work should be contrasted with the very large feature space of other participants in the challenge who only achieved slightly better performance. There is a need to charge a penalty against the complexity of a model as defined in message minimalisation theory when comparing results. Conclusion: A complete pipeline system for constructing language processing models that can be used to process multiple practical detection tasks of language structures of clinical records is presented.

Keywords: agents automated learning classification clinical controlled terminologies and vocabularies designing usable (responsive) resources and systems discovery distributed systems information classification information extraction i2b2 challenge knowledge bases natural language processing ontologies software engineering: architecture text and data mining methods 2010 i2b2 challenge


2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text


Journal of the American Medical Informatics Association, 2011, vol. 18, no. 5, pp. 552-556

Authors: Uzuner, Oezlem; South, Brett R.; Shen, Shuying; DuVall, Scott L. Affiliations: SUNY Albany, Coll Comp & Informat, Dept Informat Studies, Albany, NY 12222, USA; VA Salt Lake City Hlth Care Syst, Salt Lake City, UT, USA; Univ Utah, Dept Internal Med, Salt Lake City, UT 84112, USA; Univ Utah, Dept Biomed Informat, Salt Lake City, UT, USA

The 2010 i2b2/VA Workshop on Natural Language Processing Challenges for Clinical Records presented three tasks: a concept extraction task focused on the extraction of medical concepts from patient reports; an assertion classification task focused on assigning assertion types for medical problem concepts; and a relation classification task focused on assigning relation types that hold between medical problems, tests, and treatments. i2b2 and the VA provided an annotated reference standard corpus for the three tasks. Using this reference standard, 22 systems were developed for concept extraction, 21 for assertion classification, and 16 for relation classification. These systems showed that machine learning approaches could be augmented with rule-based systems to determine concepts, assertions, and relations. Depending on the task, the rule-based systems can either provide input for machine learning or post-process the output of machine learning. Ensembles of classifiers, information from unlabeled data, and external knowledge sources can help when the training data are inadequate.

Keywords: Information storage and retrieval (text and images) discovery and text and data mining methods Other methods of information extraction natural-language processing Automated learning visualization of data and knowledge uncertain reasoning and decision theory languages and computational methods statistical analysis of large datasets advanced algorithms discovery other methods of information extraction automated learning human-computer interaction and human-centered computing NLP machine learning Informatics


Lightly-Supervised Training for Hierarchical Phrase-based Machine Translation


1st Workshop on Unsupervised Learning in NLP at the 2011 Conference on Empirical Methods in Natural Language Processing, EMNLP 2011

Authors: Huck, Matthias; Vilar, David; Stein, Daniel; Ney, Hermann. Affiliations: Human Language Technology and Pattern Recognition Group, RWTH Aachen University, Germany; DFKI GmbH, Berlin, Germany

ISBN: (print) 1937284131

In this paper we apply lightly-supervised training to a hierarchical phrase-based statistical machine translation system. We employ bitexts that have been built by automatically translating large amounts of monolingual data as additional parallel training corpora. We explore different ways of using this additional data to improve our system. Our results show that integrating a second translation model with only non-hierarchical phrases extracted from the automatically generated bitexts is a reasonable approach. The translation performance matches the result we achieve with a joint extraction on all training bitexts while the system is kept smaller due to a considerably lower overall number of phrases. © 2011 Association for Computational Linguistics

Keywords: Computer aided language translation


ACL-IJCNLP 2009 - TextGraphs 2009: 2009 Workshop on Graph-based Methods for Natural Language Processing, Proceedings of the Workshop


4th Workshop on Graph-based Methods for Natural Language Processing, TextGraphs 2009

ISBN: (print) 193243254X

The proceedings contain 12 papers. The topics discussed include: network analysis reveals structure indicative of syntax in the corpus of undeciphered Indus civilization inscriptions; bipartite spectral graph partitioning to co-cluster varieties and sound correspondences in dialectology; WikiWalk: random walks on Wikipedia for semantic relatedness; classifying Japanese polysemous verbs based on Fuzzy C-means clustering; measuring semantic relatedness with vector space models and random walks; ranking and semi-supervised classification on large scale graphs using Map-Reduce; opinion graphs for polarity and discourse classification; a cohesion graph based approach for unsupervised recognition of literal and non-literal use of multiword expressions; social (distributed) language modeling, clustering and dialectometry; and quantitative analysis of treebanks using frequent subtree mining methods.


Finite-State Methods and Natural Language Processing - 8th International Workshop, FSMNLP 2009, Revised Selected Papers


8th International Workshop on Finite-State Methods and Natural Language Processing, FSMNLP 2009

ISBN: (print) 364214683X

The proceedings contain 14 papers. The topics discussed include: learning finite state machines; developing computational morphology for low- and middle-density languages; selected operations and applications of n-tape weighted finite-state machines; OpenFst; morphological analysis of tone marked Kinyarwanda text; minimizing weighted tree grammars using simulation; reducing nondeterministic finite automata with SAT solvers; joining composition and trimming of finite-state transducers; porting Basque morphological grammars to foma, an open-source tool; describing Georgian morphology with a finite-state system; finite state morphology of the Nguni language cluster: modeling and implementation issues; a finite state approach to Setswana verb morphology; and Zulu: an interactive learning competition.


A study on dependency tree kernels for automatic extraction of protein-protein interaction


2011 Workshop on Biomedical Natural Language Processing, BioNLP 2011, at the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, ACL-HLT 2011

Authors: Chowdhury, Md. Faisal Mahbub; Lavelli, Alberto; Moschitti, Alessandro. Affiliations: Department of Information Engineering and Computer Science, University of Trento, Italy; Human Language Technology Research Unit, Fondazione Bruno Kessler, Trento, Italy

ISBN: (print) 9781932432916

Kernel methods are considered the most effective techniques for various relation extraction (RE) tasks as they provide higher accuracy than other approaches. In this paper, we introduce new dependency tree (DT) kernels for RE by improving on previously proposed dependency tree structures. These are further enhanced to design more effective approaches that we call mildly extended dependency tree (MEDT) kernels. The empirical results on the protein-protein interaction (PPI) extraction task on the AIMed corpus show that tree kernels based on our proposed DT structures achieve higher accuracy than previously proposed DT and phrase structure tree (PST) kernels. © 2011 Association for Computational Linguistics

Keywords: Extraction


Resources and methods for lexical substitution between Basque dialects


Workshop on Iberian Cross-Language Natural Language Processing Tasks, ICL 2011

Authors: Uria, Larraitz; Hulden, Mans; Etxeberria, Izaskun; Alegria, Iñaki. Affiliations: IKER IKERBASQUE UMR5478, Spain; Language Technology, University of Helsinki, Finland; IXA Taldea, UPV-EHU, Finland

The coexistence of five languages with official status in the Iberian Peninsula (Basque, Catalan, Galician, Portuguese, and Spanish) has prompted collaborative efforts to share and cross-develop resources and materials for these languages of the region. However, it is not the case that comprehension boundaries only exist between each of these five languages; dialectal variation is also present, and in the case of Basque, for example, many written resources are only available in dialectal (or pre-standardization) form. At the same time, all the computational tools developed for Basque are based on the standard language ("Batua"), and will not work correctly with other dialects, of which there are many. In this work we attempt to semiautomatically deduce relationships between the standard Basque and dialectal variants. Such an effort provides an opportunity to apply existing tools to texts issued before a unified standard Basque was developed, and so take advantage of a rich source of linguistic information.

Keywords: Inductive logic programming (ILP)
