In this paper, we present GrawlTCQ, a new bootstrapping algorithm for building specialized terminology, corpora and queries, based on a graph model. We model links between documents, terms and queries, and use a rando...
We propose the use of a nonparametric Bayesian model, the Hierarchical Dirichlet Process (HDP), for the task of Word Sense Induction. Results are shown through comparison against Latent Dirichlet Allocation (LDA), a p...
Usually unsupervised dependency parsing tries to optimize the probability of a corpus by modifying the dependency model that was presumably used to generate the corpus. In this article we explore a different view in w...
ISBN: 9781932432916 (print)
The proceedings contain 25 papers. The topics discussed include: not all links are equal: exploiting dependency types for the extraction of protein-protein interactions from text; unsupervised entailment detection between dependency graph fragments; learning phenotype mapping for integrating large genetic data; EVEX: a PubMed-scale resource for homology-based generalization of text mining predictions; fast and simple semantic class assignment for biomedical text; the role of information extraction in the design of a document triage application for biocuration; medical entity recognition: a comparison of semantic and statistical methods; automatic acquisition of huge training data for bio-medical named entity recognition; and building frame-based corpus on the basis of ontological domain knowledge.
There is high demand for automated tools that assign polarity to microblog content such as tweets (Twitter posts), but this is challenging due to the terseness and informality of tweets in addition to the wide variety...
The most efficient approaches to language understanding in recent years are statistical. These approaches benefit from a segmental semantic annotation of corpora. To reduce the production cost of such corpora, this paper ...
Medical Entity Recognition is a crucial step towards efficient medical text analysis. In this paper we present and compare three methods based on domain-knowledge and machine-learning techniques. We study two researc...
ISBN: 1932432779 (print)
The proceedings contain 17 papers. The topics discussed include: graph-based clustering for computational linguistics: a survey; towards the automatic creation of a wordnet from a term-based lexical network; an investigation on the influence of frequency on the lexical organization of verbs; robust and efficient page rank for word sense disambiguation; hierarchical spectral partitioning of bipartite graphs to cluster dialects and identify distinguishing features; a character-based intersection graph approach to linguistic phylogeny; spectral approaches to learning in the graph domain; cross-lingual comparison between distributionally determined word similarity networks; co-occurrence cluster features for lexical substitutions in context; contextually-mediated semantic similarity graphs for topic segmentation; and experiments with CST-based multidocument summarization.
Training efficient statistical approaches for natural language understanding generally requires data with segmental semantic annotations. Unfortunately, building such resources is costly. In this paper, we propose an ...
The proceedings contain 11 papers. The topics discussed include: the 1st DDIExtraction-2011 challenge task: extraction of drug-drug interactions from biomedical texts; relation extraction for drug-drug interactions using ensemble learning; two different machine learning techniques for drug-drug interaction extraction; drug-drug interaction extraction using composite kernels; drug-drug interaction extraction with RLS and SVM classifiers; feature selection for drug-drug interaction detection using machine-learning based approaches; automatic drug-drug interaction detection: a machine learning approach with maximal frequent sequence extraction; a machine learning approach to extract drug-drug interactions in an unbalanced dataset; drug-drug interactions discovery based on CRFs, SVMs and rule-based methods; an experimental exploration of drug-drug interaction extraction from biomedical texts; and extraction of drug-drug interactions using all paths graph kernel.