In this work, we describe a methodology based on the Stochastic Finite State Transducers paradigm for Spoken Language Understanding (SLU) for obtaining concept graphs from word graphs. In the edges of these concept gr...
Kleene is a high-level programming language, based on the OpenFst library, for constructing and manipulating finite-state acceptors and transducers. Users can program using regular expressions, alternation-rule syntax...
ISBN: 9789898565167 (print)
Inflection dictionaries are widely used in many natural language processing tasks, especially for inflecting languages. However, they lack semantic information, which could increase the accuracy of such processing. This paper describes a method to extract semantic labels from encyclopedic entries. Adding such labels to an inflection dictionary could eliminate the need for ontologies and similar complex semantic structures in many typical tasks. A semantic label is either a single word or a sequence of words that describes the meaning of a headword; hence it is similar to a semantic category. However, no taxonomy of such categories is known prior to the extraction. Encyclopedic articles consist of headwords and their definitions, so the definitions are used as sources for semantic labels. The described algorithm has been implemented for extracting data from the Polish Wikipedia. It is based on definition structure analysis, heuristic methods, and word form recognition and processing using the Polish Inflection Dictionary. This paper contains a description of the method and test results, as well as a discussion of possible further development.
Statistical natural language processing (NLP) builds models of language based on statistical features extracted from the input text. We investigate deep learning methods for unsupervised feature learning for NLP tasks...
Objectives: To study ontology modularization techniques applied to SNOMED CT in a scenario in which no previous corpus of information exists, and to examine whether frequency-based filtering using MEDLINE can reduce subset size without discarding relevant concepts. Materials and methods: Subsets were first extracted using four graph-traversal heuristics and one logic-based technique, and were subsequently filtered with frequency information from MEDLINE. Twenty manually coded discharge summaries from cardiology patients were used as signatures and test sets. The coverage, size, and precision of the extracted subsets were measured. Results: Graph-traversal heuristics provided high coverage (71-96% of terms in the test sets of discharge summaries) at the expense of subset size (17-51% of the size of SNOMED CT). Pre-computed subsets and logic-based techniques extracted small subsets (1%), but coverage was limited (24-55%). Filtering reduced the size of large subsets to 10% while still providing 80% coverage. Discussion: Extracting subsets to annotate discharge summaries is challenging when no previous corpus exists. Ontology modularization provides valuable techniques, but the resulting modules grow as signatures spread across subhierarchies, yielding very low precision. Conclusion: Graph-traversal strategies and frequency data from an authoritative source can prune large biomedical ontologies and produce useful subsets that still exhibit acceptable coverage. However, a clinical corpus closer to the specific use case is preferred when available.
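The three quantities measured above (coverage, size, and precision) can be illustrated with simple set arithmetic. The abstract does not state the exact formulas, so the definitions below are assumptions: coverage is taken as the share of test-set concepts found in the subset, precision as the share of subset concepts that occur in the test set, and size as the subset's share of the whole ontology. All function names and the toy concept identifiers are hypothetical.

```python
def coverage(subset, test_terms):
    """Assumed definition: fraction of test-set concepts present in the subset."""
    test_terms = set(test_terms)
    if not test_terms:
        return 0.0
    return len(test_terms & set(subset)) / len(test_terms)


def precision(subset, test_terms):
    """Assumed definition: fraction of subset concepts that occur in the test set."""
    subset = set(subset)
    if not subset:
        return 0.0
    return len(subset & set(test_terms)) / len(subset)


def relative_size(subset, ontology_size):
    """Subset size as a fraction of the full ontology (ontology_size > 0)."""
    return len(set(subset)) / ontology_size


# Toy data with made-up concept identifiers, for illustration only.
extracted = {"C1", "C2", "C3", "C4"}
test_set = {"C1", "C2", "C9"}

cov = coverage(extracted, test_set)        # 2 of 3 test concepts are covered
prec = precision(extracted, test_set)      # 2 of 4 subset concepts are relevant
size = relative_size(extracted, 40)        # 4 of 40 ontology concepts retained
```

Under these assumed definitions, a large subset tends to raise coverage while lowering precision, which matches the trade-off the abstract reports.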
ISBN: 9783642772047 (print)
This volume presents the proceedings of an international workshop held for a multidisciplinary group of researchers involved in intelligent tutoring systems research for language learning. The papers include work on: - Computational bases, tools, and environments for delivering language instruction, - Theoretical frameworks for developing language-based architectures and computational grammars, - Pedagogical practice, learner characteristics, and learner performance data, - Methods for representing tutoring and student modelling knowledge in the tutoring system, - Existing systems for language learning. The approach to developing intelligent tutoring systems that integrates natural language processing in a multimedia environment is new. This book presents readers with the state of the art in the field in a single volume, with contributors from computer science, linguistics, and psychology.
ISBN: 9781627483445 (print)
Relational clustering has received much attention from researchers in the last decade. In this paper we present a parametric method that employs a combination of both hard and soft clustering. Based on the corresponding Markov chain of an affinity matrix, we simulate a probability distribution on the states by defining a conditional probability for each subpopulation of states. This probabilistic model enables us to use expectation maximization for parameter estimation. The effectiveness of the proposed approach is demonstrated on several real datasets against spectral clustering methods.
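The abstract does not say how the Markov chain is derived from the affinity matrix. A common construction, assumed here and not confirmed as the authors' method, is row normalization: each affinity is divided by its row sum so that every row becomes a probability distribution over next states. The function name and the toy matrix below are hypothetical.

```python
def transition_matrix(affinity):
    """Row-normalize a non-negative affinity matrix into a stochastic matrix.

    Assumption: P[i][j] = W[i][j] / sum_k W[i][k], the usual random-walk
    construction. Rows that sum to zero have no outgoing affinity and are
    rejected rather than silently patched.
    """
    transitions = []
    for i, row in enumerate(affinity):
        total = sum(row)
        if total <= 0:
            raise ValueError(f"row {i} has no positive affinity")
        transitions.append([weight / total for weight in row])
    return transitions


# Toy symmetric affinity matrix over three items (illustrative values only).
W = [
    [0.0, 2.0, 2.0],
    [2.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],
]
P = transition_matrix(W)
```

Each row of `P` sums to one, so `P[i][j]` can be read as the probability of moving from state `i` to state `j` in a single step.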
ISBN: 9781627483445 (print)
Social tagging systems, which allow users to freely annotate online resources with tags, have become popular in the Web 2.0 era. To ease the annotation process, research on social tag recommendation has drawn much attention in recent years. Modeling social tagging behavior could better reflect the nature of this problem and improve recommendation results. In this paper, we propose a novel approach that uses associative ability to model social tagging behavior and thereby enhance the performance of automatic tag recommendation. To simulate the human tagging process, our approach ranks the candidate tags on a weighted digraph built from the semantic relationships among meaningful words in the summary and the corresponding tags for a given resource. The semantic relationships are learned via a word alignment model from statistical machine translation on large datasets. Experiments on real-world datasets demonstrate that our method is effective, robust, and language-independent compared with state-of-the-art methods.
ISBN: 9783642341816 (print)
The proceedings contain 24 papers. The topics discussed include: gestures in assisted living environments; choosing and modeling the hand gesture database for a natural user interface; user experience of gesture based interfaces: a comparison with traditional interaction methods on pragmatic and hedonic qualities; low cost force-feedback interaction with haptic digital audio effects; the role of spontaneous gestures in spatial problem solving; effects of spectral features of sound on gesture type and timing; human-motion saliency in complex scenes; what, why, where and how do children think? towards a dynamic model of spatial cognition as action; a Labanotation based ontology for representing dance movement; assessing agreement on segmentations by means of staccato, the segmentation agreement calculator according to Thomann; and how do iconic gestures convey visuo-spatial information? bringing together empirical, theoretical, and simulation studies.
ISBN: 9789573079255 (print)
This paper presents and evaluates a new approach to automatic word sense disambiguation (WSD) applied to French texts. First, we draw on possibility theory by taking advantage of a double relevance measure (possibility and necessity) between words and their contexts. Second, we propose, analyze, and compare two different training methods: judgment-based and dictionary-based training. Third, we summarize and discuss the overall performance of the various tests in a global analysis. To assess and compare our approach with similar WSD systems, we performed experiments on the standard ROMANSEVAL test collection.