Until now, it is still unclear which set of features produces the best result in automatic genre classification on the web. Therefore, in the first set of experiments, we compared a wide range of content-based feature...
One research goal in Second Language Acquisition (SLA) is to formulate and test hypotheses about errors and the environments in which they are made, a process which often involves substantial effort; large amounts of d...
ISBN: 9781937284978 (print)
The proceedings contain 11 papers. The topics discussed include: event-centered information retrieval using kernels on event graphs; reconstructing big semantic similarity networks; graph-based unsupervised learning of word similarities using heterogeneous feature types; understanding seed selection in bootstrapping; graph-structures matching for review relevance identification; automatic extraction of reasoning chains from textual reports; graph-based approaches for organization entity resolution in MapReduce; and a graph-based approach to skill extraction from text.
Most of the recent work on machine learning-based temporal relation classification has been done by considering only a given pair of temporal entities (events or temporal expressions) at a time. Entities that have tem...
In this paper we present the first application of Native Language Identification (NLI) to Arabic learner data. NLI, the task of predicting a writer’s first language from their writing in other languages, has been most...
A multilingual person writing a sentence or a piece of text tends to switch between languages s/he is proficient in. This alteration between languages, commonly known as code-switching, presents us with the problem of...
ISBN: 9781937284961 (print)
The proceedings contain 17 papers. The topics discussed include: foreign words and the automatic processing of Arabic social media text written in Roman script; code mixing: a challenge for language identification in the language of social media; detecting code-switching in a multilingual alpine heritage corpus; exploration of the impact of maximum entropy in recurrent neural network language models for code-switching speech; predicting code-switching in multilingual communication for immigrant communities; overview for the first shared task on language identification in code-switched data; word-level language identification using CRF: code-switching shared task report of MSR India system; and the CMU submission for the shared task on language identification in code-switched data.
Background: Parsing, which generates a syntactic structure of a sentence (a parse tree), is a critical component of natural language processing (NLP) research in any domain including medicine. Although parsers developed in the general English domain, such as the Stanford parser, have been applied to clinical text, there are no formal evaluations and comparisons of their performance in the medical domain. Methods: In this study, we investigated the performance of three state-of-the-art parsers: the Stanford parser, the Bikel parser, and the Charniak parser, using the following two datasets: (1) a Treebank containing 1,100 sentences that were randomly selected from progress notes used in the 2010 i2b2 NLP challenge and manually annotated according to a Penn Treebank based guideline; and (2) the MiPACQ Treebank, which is developed based on pathology notes and clinical notes, containing 13,091 sentences. We conducted three experiments on both datasets. First, we measured the performance of the three state-of-the-art parsers on the clinical Treebanks with their default settings. Then we re-trained the parsers using the clinical Treebanks and evaluated their performance using the 10-fold cross validation method. Finally, we re-trained the parsers by combining the clinical Treebanks with the Penn Treebank. Results: Our results showed that the original parsers achieved lower performance in clinical text (Bracketing F-measure in the range of 66.6%-70.3%) compared to general English text. After retraining on the clinical Treebank, all parsers achieved better performance, with the best performance from the Stanford parser that reached the highest Bracketing F-measure of 73.68% on progress notes and 83.72% on the MiPACQ corpus using 10-fold cross validation. When the combined clinical Treebanks and Penn Treebank were used, of the three parsers, the Charniak parser achieved the highest Bracketing F-measure of 73.53% on progress notes and the Stanford parser reached the highest F-measur
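For readers unfamiliar with the Bracketing F-measure reported in the abstract above, the following is a minimal sketch of how a labeled-bracket precision, recall, and F-measure can be computed for a single sentence. The abstract does not say which scoring tool or settings the authors used, so the function name `bracket_f_measure`, the `(label, start, end)` span representation, and the toy spans are all illustrative assumptions and not the study's evaluation code.

```python
from collections import Counter


def bracket_f_measure(gold, predicted):
    """Return (precision, recall, F1) over labeled constituent spans.

    `gold` and `predicted` are lists of (label, start, end) tuples for one
    sentence. This representation is a hypothetical choice for illustration.
    """
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    # Multiset intersection: a predicted bracket counts as correct at most
    # as many times as it appears in the gold tree.
    matched = sum((gold_counts & pred_counts).values())
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example (invented spans): three of four brackets agree.
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
pred = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5)]
precision, recall, f1 = bracket_f_measure(gold, pred)
```

In a cross-validation setting such as the 10-fold setup the abstract mentions, a score of this kind would typically be computed on each held-out fold and then aggregated; how the authors aggregated is not stated in the text.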
Segmenting the human hand is important in computer vision applications, for example, sign language interpretation, human-computer interaction, and gesture recognition. However, some serious bottlenecks still exist in hand localization systems, such as fast hand motion capture, hand over face, and hand occlusions, on which we focus in this paper. We present a novel method for hand tracking and segmentation based on augmented graph cuts and a dynamic model. First, an effective dynamic model for state estimation is generated, which correctly predicts the location of hands that may have fast motion or shape deformations. Second, new energy terms are brought into the energy function to develop augmented graph cuts based on some cues, namely, spatial information, hand motion, and chamfer distance. The proposed method successfully achieves hand segmentation even when the hand passes over other skin-colored objects. Some challenging videos are provided for the cases of hand over face, hand occlusions, dynamic background, and fast motion. Experimental results demonstrate that the proposed method is much more accurate than other graph cuts-based methods for hand tracking and segmentation.