Automated essay scoring is one of the most important educational applications of natural language processing. Recently, researchers have begun exploring methods of scoring essays with respect to particular dimensions ...
ISBN: 1932432795 (print)
The proceedings contain 6 papers. The topics discussed include: preservation of recognizability for synchronous tree substitution grammars; a decoder for probabilistic synchronous tree insertion grammars; parsing and translation algorithms based on weighted extended tree transducers; millstream systems – a formal model for linking language modules by interfaces; transforming lexica as trees; and n-best parsing revisited.
ISBN: 9789544520106 (print)
The proceedings contain 6 papers. The topics discussed include: finding domain specific collocations and concordances on the web; unsupervised construction of a multilingual wordnet from parallel corpora; HMMs, GRs, and n-grams as lexical substitution techniques – are they portable to other languages?; evidence-based word alignment; and a discriminative approach to tree alignment.
ISBN: 9783642155727 (print)
This chapter introduces a class-based approach to ordering prenominal modifiers. Modifiers are grouped into broad classes based on where they tend to occur prenominally, and a framework is developed to order sets of modifiers based on their classes. This system is developed to generate several orderings for sets of modifiers with more flexible positional constraints, and lends itself to bootstrapping for the classification of previously unseen modifiers. The approach to modifier classification outlined here is useful for automated language generation tasks, and the proposed modifier classes may be useful within constraint-based grammars.
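The class-based ordering idea can be illustrated with a minimal sketch. The class names, their ranks, and the toy lexicon below are assumptions made for illustration only; the chapter's actual class inventory and ordering framework are not given in this abstract. Modifiers whose classes share a rank have flexible relative positions, so more than one ordering is produced.

```python
from itertools import permutations

# Hypothetical position classes: a lower rank means farther from the noun.
# Both the class inventory and the lexicon are illustrative assumptions.
CLASS_RANK = {"early": 0, "middle": 1, "late": 2}
LEXICON = {"big": "early", "old": "middle", "red": "middle", "wooden": "late"}


def orderings(modifiers, lexicon=LEXICON):
    """Return every ordering whose class ranks never decrease left to right.

    Raises KeyError for a modifier that is missing from the lexicon.
    """
    valid = []
    for perm in permutations(modifiers):
        ranks = [CLASS_RANK[lexicon[m]] for m in perm]
        # Keep the permutation only if no modifier precedes one of a lower rank.
        if all(a <= b for a, b in zip(ranks, ranks[1:])):
            valid.append(list(perm))
    return valid


if __name__ == "__main__":
    for order in orderings(["wooden", "red", "big", "old"]):
        print(" ".join(order), "table")
```

Enumerating all permutations is only practical for the short modifier sequences that occur before a noun; an unseen modifier would first need a class, which is where the bootstrapping mentioned above would come in.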
More and more people have access to the Internet, and the content they produce keeps growing. Knowing what people write in blogs, forums, and social media in general about specific brands and products has become strategically important for large corporations around the globe. We present a demo of an application, naturalOpinions, which follows a rule-based NLP approach to parsing opinions in Spanish-language Twitter posts. The application detects the topic and extracts the sentiment, either positive or negative, about particular product features or brand names. Social media intelligence solutions can thus be implemented rapidly with Bitext's language technologies.
ISBN: 9780769542638 (print)
Many applications call for methods to enable automatic extraction of structured information from unstructured natural language text. Due to inherent challenges of natural language processing, most of the existing methods for information extraction from text tend to be domain specific. We explore a modular ontology-based approach to information extraction that decouples domain-specific knowledge from the rules used for information extraction. We describe a framework for extraction of a subset of complex nested relationships (e.g., Joe reports that Jim is a reliable employee). The extracted relationships are output in the form of sets of RDF (resource description framework) triples, which can be queried using query languages for RDF and mined for knowledge acquisition.
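One way a nested relationship such as the example above could be flattened into triples is standard RDF reification (`rdf:Statement`, `rdf:subject`, `rdf:predicate`, `rdf:object`). The sketch below is an assumption about the encoding, not the framework's documented output; the `ex:` names and the helper functions are hypothetical, and plain tuples stand in for a real RDF library.

```python
from itertools import count


def _reify(statement, triples, ids):
    """Describe a statement with RDF reification and return its node label."""
    subj, pred, obj = statement
    if isinstance(obj, tuple):  # the object is itself a nested statement
        obj = _reify(obj, triples, ids)
    node = f"_:stmt{next(ids)}"  # blank-node label, unique per call to extract
    triples.extend([
        (node, "rdf:type", "rdf:Statement"),
        (node, "rdf:subject", subj),
        (node, "rdf:predicate", pred),
        (node, "rdf:object", obj),
    ])
    return node


def extract(statement):
    """Flatten a possibly nested (subject, predicate, object) tuple."""
    triples, ids = [], count(1)
    subj, pred, obj = statement
    if isinstance(obj, tuple):
        obj = _reify(obj, triples, ids)
    triples.append((subj, pred, obj))  # only the outermost statement is asserted
    return triples


if __name__ == "__main__":
    # Hypothetical encoding of "Joe reports that Jim is a reliable employee".
    nested = ("ex:Joe", "ex:reports", ("ex:Jim", "ex:isA", "ex:ReliableEmployee"))
    for triple in extract(nested):
        print(triple)
```

Reification describes the inner statement without asserting it, which suits reported claims: the output records that Joe reports something about Jim, not that the claim is true.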
We propose a Named Entities transliteration mining system using Finite State Automata (FSA). We compare the proposed approach with a baseline system that utilizes the Editex technique to measure the length-normalized ...
ISBN: 1932432736 (print)
The proceedings contain 21 papers. The topics discussed include: two strong baselines for the BIONLP 2009 event extraction task; recognizing biomedical named entities using skip-chain conditional random fields; event extraction for post-translational modifications; scaling up biomedical event extraction to the entire PubMed; a comparative study of syntactic parsers for event extraction; arguments of nominals in semantic interpretation of biomedical text; improving summarization of biomedical documents using word sense disambiguation; cancer stage prediction based on patient online discourse; and an exploration of mining gene expression mentions and their anatomical locations from biomedical text.
Tools for designing signal processing systems with their semantic foundation in dataflow modeling often use high-level graphical user interface (GUI) or text-based languages that allow specifying applications as direc...
Building a lattice-based taxonomy over a text corpus with formal concept analysis (FCA) methods requires preliminary text processing that enables construction of a context. We consider several natural language processing methods aimed at automatic attribute acquisition from texts. In particular, we derive attributes of three types: frequent words, latent topics, and named entities. We then construct a context for each type, taking the documents in the corpus as the set of objects. The corresponding concept lattices are built and pruned with the help of the stability index in order to improve the readability of the diagrams. The proposed technique is illustrated on a collection of 26 English texts dealing with the political domain. In this case, the technique serves as a tool for deeper understanding of the interests of the political actors who produce the texts, by clarifying the connections between the notions they use.
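The pipeline of context, concepts, and stability-based pruning can be sketched on a toy context. The three documents and their attributes below are invented for illustration, and the stability measure is assumed to be the usual intensional stability (the share of subsets of a concept's extent that have exactly the concept's intent); the abstract does not say which variant the authors use.

```python
from itertools import combinations


def formal_concepts(context):
    """Return all (extent, intent) pairs of a context {object: set of attributes}."""
    all_attrs = frozenset().union(*context.values())
    # Every intent is an intersection of object rows; the full attribute set
    # is the intent of the empty set of objects.
    intents = {all_attrs}
    for attrs in context.values():
        row = frozenset(attrs)
        intents |= {row & intent for intent in intents}
    concepts = []
    for intent in intents:
        extent = frozenset(o for o, attrs in context.items() if intent <= attrs)
        concepts.append((extent, intent))
    return concepts


def stability(extent, intent, context):
    """Fraction of subsets of the extent whose shared attributes equal the intent."""
    all_attrs = frozenset().union(*context.values())
    objs = sorted(extent)
    hits = 0
    for k in range(len(objs) + 1):  # brute force over 2**len(objs) subsets
        for subset in combinations(objs, k):
            common = set(all_attrs)
            for o in subset:
                common &= context[o]
            hits += common == intent
    return hits / 2 ** len(objs)


if __name__ == "__main__":
    # Toy context: documents as objects, extracted terms as attributes.
    docs = {
        "d1": {"tax", "budget"},
        "d2": {"tax", "election"},
        "d3": {"tax", "budget", "election"},
    }
    for extent, intent in formal_concepts(docs):
        score = stability(extent, intent, docs)
        print(sorted(extent), sorted(intent), score)
```

Pruning would then keep only the concepts whose stability exceeds a chosen threshold. The subset enumeration is exponential in the extent size, so this form is suitable only for small contexts such as a 26-document collection with modest extents.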