Search Results - Inner Mongolia University Library

8th Workshop on Statistical Machine Translation, WMT 2013

Authors: Niehues, Jan; Waibel, Alex (Institute for Anthropomatics, Karlsruhe Institute of Technology, Germany)

ISBN: (print) 9781937284572

The Discriminative Word Lexicon (DWL) is a maximum-entropy model that predicts the target word probability given the source sentence words. We present two ways to extend a DWL to improve its ability to model the word translation probability in a phrase-based machine translation (PBMT) system. While DWLs are able to model the global source information, they ignore the structure of the source and target sentence. We propose to include this structure by modeling the source sentence as a bag-of-n-grams and features depending on the surrounding target words. Furthermore, as the standard DWL does not get any feedback from the MT system, we change the DWL training process to explicitly focus on addressing MT errors. By using these methods we are able to improve the translation performance by up to 0.8 BLEU points compared to a system that uses a standard DWL. © 2013 Association for Computational Linguistics

Keywords: natural language processing systems

Workshop on Unsupervised Learning in NLP at the 2011 Conference on Empirical Methods in Natural Language Processing, EMNLP 2011 - Proceedings

1st Workshop on Unsupervised Learning in NLP at the 2011 Conference on Empirical Methods in Natural Language Processing, EMNLP 2011

ISBN: (print) 1937284131

The proceedings contain 13 papers. The topics discussed include: structured databases of named entities from Bayesian nonparametrics; unsupervised cross-lingual lexical substitution; reducing the size of the representation for the uDOP-estimate; evaluating unsupervised learning for natural language processing tasks; unsupervised language-independent name translation mining from Wikipedia infoboxes; Twitter polarity classification with label propagation over lexical links and the follower graph; unsupervised concept annotation using latent Dirichlet allocation and segmental methods; and unsupervised alignment for segmental-based language understanding.

7th Workshop on Graph-based Methods for Natural Language Processing, TextGraphs 2012

Authors: Minkov, Einat; Cohen, William W. (Dep. of Information Systems, University of Haifa, Haifa 31905, Israel; School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, United States)

ISBN: (print) 9781937284374

We learn graph-based similarity measures for the task of extracting word synonyms from a corpus of parsed text. A constrained graph walk variant that has been successfully applied in similar settings in the past is shown to outperform a state-of-the-art syntactic vector-based approach on this task. Further, we show that learning specialized similarity measures for different word types is advantageous. © 2012 The Association for Computational Linguistics.

Keywords: graphic methods

Recognizing clinical entities in hospital discharge summaries using Structural Support Vector Machines with word representation features

BMC MEDICAL INFORMATICS AND DECISION MAKING, 2013, Vol. 13, Issue 1-Sup, pp. S1-S1

Authors: Tang, Buzhou; Cao, Hongxin; Wu, Yonghui; Jiang, Min; Xu, Hua (Univ Texas Hlth Sci Ctr Houston, Sch Biomed Informat, Houston, TX 77030, USA; Shenzhen Grad Sch, Harbin Inst Technol, Shenzhen, Peoples R China; Second Mil Med Univ, Shanghai, Peoples R China)

Background: Named entity recognition (NER) is an important task in clinical natural language processing (NLP) research. Machine learning (ML) based NER methods have shown good performance in recognizing entities in clinical text. Algorithms and features are two important factors that largely affect the performance of ML-based NER systems. Conditional Random Fields (CRFs), a sequential labelling algorithm, and Support Vector Machines (SVMs), which are based on large margin theory, are two typical machine learning algorithms that have been widely applied to clinical NER tasks. For features, syntactic and semantic information of context words has often been used in clinical NER systems. However, Structural Support Vector Machines (SSVMs), an algorithm that combines the advantages of both CRFs and SVMs, and word representation features, which capture word-level back-off information derived from a large unlabelled corpus by unsupervised algorithms, have not been extensively investigated for clinical text processing. Therefore, the primary goal of this study is to evaluate the use of SSVMs and word representation features in clinical NER tasks.

Methods: In this study, we developed SSVMs-based NER systems to recognize clinical entities in hospital discharge summaries, using the data set from the concept extraction task in the 2010 i2b2 NLP challenge. We compared the performance of CRFs and SSVMs-based NER classifiers with the same feature sets. Furthermore, we extracted two different types of word representation features (clustering-based representation features and distributional representation features) and integrated them with the SSVMs-based clinical NER system. We then reported the performance of SSVM-based NER systems with different types of word representation features.

Results and discussion: Using the same training (N = 27,837) and test (N = 45,009) sets in the challenge, our evaluation showed that the SSVMs-based NER systems achieved better performance than the CRFs-based sy

Keywords: natural language processing; Conditional Random Field; Unified Medical Language System; Named Entity Recognition; Entity Recognition

Cause-effect relation learning

7th Workshop on Graph-based Methods for Natural Language Processing, TextGraphs 2012

Authors: Kozareva, Zornitsa (USC Information Sciences Institute, 4676 Admiralty Way, Marina del Rey, CA, United States)

ISBN: (print) 9781937284374

To answer the question "What causes tumors to shrink?", one would require a large cause-effect relation repository. Much effort has been devoted to is-a and part-of relation learning; however, few studies have focused on cause-effect learning. This paper describes an automated bootstrapping procedure that can learn and produce a cause-effect term repository with minimal effort. To filter out erroneously extracted information, we incorporate graph-based methods. To evaluate the quality of the acquired cause-effect terms, we conduct three evaluations: (1) human-based, (2) comparison with existing knowledge bases, and (3) application-driven (SemEval-1 Task 4), in which the goal is to identify the relation between pairs of nominals. The results show that the extractions at rank 1500 are 89% accurate, that they comprise 61% of the terms used in the SemEval-1 Task 4 dataset, and that they can be used in the future to produce additional training examples for the same task. © 2012 The Association for Computational Linguistics.

Keywords: graphic methods

BioNLP@HLT-NAACL 2012 - Workshop on Biomedical Natural Language Processing, Proceedings

2012 Workshop on Biomedical Natural Language Processing, BioNLP@HLT-NAACL 2012

ISBN: (print) 9781937284206

The proceedings contain 30 papers. The topics discussed include: graph-based alignment of narratives for automated neurological assessment; bootstrapping biomedical ontologies for scientific text using NELL; semantic distance and terminology structuring methods for the detection of semantically close terms; temporal classification of medical events; analyzing patient records to establish if and when a patient suffered from a medical condition; alignment-HMM-based extraction of abbreviations from biomedical text; medical diagnosis lost in translation – analysis of uncertainty and negation expressions in English and Swedish clinical texts; and a hybrid stepwise approach for de-identifying person names in clinical documents.

TextInfer 2011 - Workshop on Textual Entailment at the Conference on Empirical Methods in Natural Language Processing, EMNLP 2011 - Proceedings

2011 Workshop on Textual Entailment, TextInfer 2011 at the Conference on Empirical Methods in Natural Language Processing, EMNLP 2011

ISBN: (print) 1937284158

The proceedings contain 8 papers. The topics discussed include: evaluating answers to reading comprehension questions in context: results for German and the role of information structure; towards a probabilistic model for lexical entailment; classification-based contextual preferences; is it worth submitting this run? assess your RTE system with a good sparring partner; diversity-aware evaluation for paraphrase patterns; representing and resolving ambiguities in ontology-based question answering; strings over intervals; and discovering commonsense entailment rules implicit in sentences.

FSMNLP 2011 - Proceedings of the 9th International Workshop on Finite State Methods and Natural Language Processing

9th International Workshop on Finite State Methods and Natural Language Processing, FSMNLP 2011

The proceedings contain 17 papers. The topics discussed include: intersection for weighted formalisms; modularization of regular growth automata; finite-state representations embodying temporal relation; supervised and semi-supervised sequence learning for recognition of requisite part and effectuation part in law sentences; compiling simple context restrictions with nondeterministic automata; constraint grammar parsing with left and right sequential finite transducers; e-dictionaries and finite-state automata for the recognition of named entities; a practical algorithm for intersecting weighted context-free grammars with finite-state automata; open source WFST tools for LVCSR cascade development; intersection of multitape transducers vs. cascade of binary transducers: the example of Egyptian hieroglyphs transliteration; and a note on sequential rule-based POS tagging.

Common data model for natural language processing based on two existing standard information models: CDA+GrAF

JOURNAL OF BIOMEDICAL INFORMATICS, 2012, Vol. 45, Issue 4, pp. 703-710

Authors: Meystre, Stephane M.; Lee, Sanghoon; Jung, Chai Young; Chevrier, Raphael D. (Univ Utah, Dept Biomed Informat, Sch Med, Salt Lake City, UT 84112, USA; VA Salt Lake City Hlth Care Syst, Salt Lake City, UT, USA; Univ Geneva, Sch Med, CH-1211 Geneva, Switzerland)

An increasing need for collaboration and resource sharing in the natural language processing (NLP) research and development community motivates efforts to create and share a common data model and a common terminology for all information annotated and extracted from clinical text. We have combined two existing standards, the HL7 Clinical Document Architecture (CDA) and the ISO Graph Annotation Format (GrAF; in development), to develop such a data model, entitled "CDA+GrAF". We experimented with several methods to combine these existing standards, and eventually selected a method that wraps separate CDA and GrAF parts in a common standoff annotation (i.e., separate from the annotated text) XML document. Two use cases, clinical document sections and the 2010 i2b2/VA NLP Challenge (i.e., problems, tests, and treatments, with their assertions and relations), were used to create examples of such standoff annotation documents, which were successfully validated with the XML schemata provided with both standards. We developed a tool to automatically translate annotation documents from the 2010 i2b2/VA NLP Challenge format to GrAF, and automatically generated 50 annotation documents using this tool, all successfully validated. Finally, we adapted the XSL stylesheet provided with HL7 CDA to allow viewing annotation XML documents in a web browser, and plan to adapt existing tools for translating annotation documents between CDA+GrAF and the UIMA and GATE frameworks. This common data model may ease directly comparing NLP tools and applications, combining their output, transforming and "translating" annotations between different NLP applications, and eventually "plug-and-play" of different modules in NLP applications. © 2011 Elsevier Inc. All rights reserved.

Keywords: natural language processing (MeSH L01.224.065.580); Medical informatics (L01.700); Data model; Information model; HL7 Clinical Document Architecture; ISO Graph Annotation Format

DAGGER: A toolkit for automata on directed acyclic graphs

10th International Workshop on Finite State Methods and Natural Language Processing, FSMNLP 2012

Authors: Quernheim, Daniel; Knight, Kevin (Institute for Natural Language Processing, Universität Stuttgart, Pfaffenwaldring 5b, Stuttgart 70569, Germany; University of Southern California, Information Sciences Institute, Marina del Rey, CA 90292, United States)

This paper presents DAGGER, a toolkit for finite-state automata that operate on directed acyclic graphs (dags). The work is based on a model introduced by Kamimura and Slutzki (1981, 1982), with a few changes to make the automata more applicable to natural language processing. Available algorithms include membership checking in bottom-up dag acceptors, transduction of dags to trees (bottom-up dag-to-tree transducers), k-best generation, and basic operations such as union and intersection. © 2012 Association for Computational Linguistics.

Keywords: Forestry
