ISBN: 1577352017 (print)
We introduce an interlingua-based approach to cross-language information retrieval, in which queries, as well as documents, are mapped onto a language-independent concept layer and retrieval operations are performed at the level of that interlingua. This approach is contrasted with one which operates without such an intermediary concept level. Non-English queries (German ones, in our experiments) are directly translated to English queries which, subsequently, are processed on English documents. We provide an empirical evaluation of both alternatives on a large medical document collection.
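The interlingua idea can be illustrated with a minimal sketch, which is not the authors' system. It assumes two tiny hypothetical lexicons (`CONCEPTS`) that map German and English surface terms onto invented concept identifiers such as `C_HEART`. Queries and documents are both reduced to concept sets, and documents are ranked by concept overlap, so no query translation step is needed.

```python
# Hypothetical toy lexicons: surface term -> language-independent concept ID.
# The entries and IDs are invented for illustration only.
CONCEPTS = {
    "de": {"herz": "C_HEART", "entzündung": "C_INFLAMMATION", "niere": "C_KIDNEY"},
    "en": {"heart": "C_HEART", "inflammation": "C_INFLAMMATION", "kidney": "C_KIDNEY"},
}


def to_concepts(text, lang):
    """Map whitespace-separated tokens to concept IDs, dropping unknown tokens."""
    lexicon = CONCEPTS[lang]
    return {lexicon[tok] for tok in text.lower().split() if tok in lexicon}


def rank(query, query_lang, docs, doc_lang):
    """Return indices of documents sharing at least one concept with the query."""
    q = to_concepts(query, query_lang)
    scored = [(len(q & to_concepts(d, doc_lang)), i) for i, d in enumerate(docs)]
    # Highest overlap first; ties keep the original document order.
    ordered = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [i for score, i in ordered if score > 0]


docs = ["inflammation of the heart", "kidney function", "heart and kidney"]
ranking = rank("Herz Entzündung", "de", docs, "en")
```

The contrasting approach described in the abstract would instead translate the German query into English terms first and then match those terms against the English documents.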
Document retrieval in languages with a rich and complex morphology - particularly in terms of derivation and (single-word) composition - suffers from serious performance degradation with the stemming-only query-term-t...
We describe an ontology engineering methodology by which conceptual knowledge is extracted from an informal medical thesaurus (UMLS) and automatically converted into a formal description logics system. Our approach consists of four steps: concept definitions are automatically generated from the UMLS source, integrity checking of taxonomic and partonomic hierarchies is performed by the terminological classifier, cycles and inconsistencies are eliminated, and incremental refinement of the evolving knowledge base is performed by a domain expert. We report on experiments with a knowledge base composed of 164,000 concepts and 76,000 relations.
ISBN: 1586032798 (print)
In biomedical documents, there is ample evidence for complex morphological structures in specialized terms. While inflection is relatively easy to deal with, productive morphological processes such as derivation and single-word composition constitute a major challenge. Considering the problem from an information retrieval perspective, we split morphologically complex words into biomedically significant, morpheme-like subwords and match the subwords that make up the query terms against those that make up the document terms. This way, morphologically motivated word form alterations can be eliminated from the retrieval procedure. Based on a series of retrieval experiments, we have gathered evidence that subword-based indexing and retrieval (for the German biomedical sublanguage, at least) outperforms conventional string matching approaches.
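As a rough illustration of subword-based matching, the sketch below segments a word into entries of a small hypothetical subword lexicon (`SUBWORDS`) by recursive longest-prefix search with backtracking, and drops linking elements (`LINKERS`) before indexing. The lexicon, the linker list, and the segmentation strategy are assumptions made for this example; the abstract does not specify how the authors segment words.

```python
# Hypothetical subword lexicon and linking elements, invented for illustration.
SUBWORDS = {"gastr", "enter", "itis", "hepat", "o"}
LINKERS = {"o"}


def segment(word, lexicon=SUBWORDS):
    """Return one segmentation of `word` into lexicon entries, or None."""
    if not word:
        return []
    # Try longer prefixes first; backtrack if the remainder cannot be segmented.
    for end in range(len(word), 0, -1):
        prefix = word[:end]
        if prefix in lexicon:
            rest = segment(word[end:], lexicon)
            if rest is not None:
                return [prefix] + rest
    return None


def index_terms(word):
    """Subwords used for indexing; fall back to the whole word if unsegmentable."""
    parts = segment(word.lower())
    if parts is None:
        return {word.lower()}
    return {p for p in parts if p not in LINKERS}


doc_terms = index_terms("Gastroenteritis")
query_terms = index_terms("Enteritis")
shared = doc_terms & query_terms
```

Under this scheme the query term and the document term share the subwords `enter` and `itis`, whereas an exact string comparison of the two full words would find no match.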
We describe an ontology engineering methodology by which conceptual knowledge is extracted from an informal medical thesaurus (UMLS) and automatically converted into a formal description logics system (LOOM). Our approach consists of four steps: concept definitions are automatically generated from the UMLS, integrity checking of taxonomic and partonomic hierarchies is performed by LOOM's terminological classifier, cycles and inconsistencies are eliminated, and incremental refinement of the evolving knowledge base is performed by a domain expert. We report on experiments with a very large knowledge base composed of 164,000 concepts and 76,000 relations.
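The cycle-elimination step presupposes a way to detect cycles in a taxonomic (is-a) or partonomic (part-of) hierarchy. The sketch below is a plain depth-first search over a hypothetical `parents` mapping from each concept to its direct parents. It does not reproduce LOOM's classifier, and the concept names are invented for the example.

```python
def find_cycle(parents):
    """Return a list of concepts forming a cycle, or None if the hierarchy is acyclic.

    `parents` maps each concept to an iterable of its direct parents.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for parent in parents.get(node, ()):
            state = color.get(parent, WHITE)
            if state == GRAY:
                # The parent is already on the current path: a cycle is closed.
                return path[path.index(parent):] + [parent]
            if state == WHITE:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(parents):
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


acyclic = {"left_ventricle": ["ventricle"], "ventricle": ["heart_part"]}
cyclic = {"a": ["b"], "b": ["c"], "c": ["a"]}
```

For a knowledge base of the size reported here, an iterative traversal would be preferable to recursion, since deep hierarchies can exceed Python's default recursion limit.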
We describe an ontology engineering methodology by which conceptual knowledge is extracted from an informal medical thesaurus (UMLS) and automatically converted into a formal description logics system (LOOM). Our approach consists of four steps: concept definitions are automatically generated from the UMLS, integrity checking of taxonomic and partonomic hierarchies is performed by LOOM's terminological classifier, cycles and inconsistencies are eliminated, and incremental refinement of the evolving knowledge base is performed by a domain expert. We report on experiments with a very large knowledge base composed of 164,000 concepts and 76,000 relations.
ISBN: 1581133804 (print)
We introduce a methodology for automating the maintenance and growth of domain-specific concept taxonomies and grammatical class hierarchies simultaneously, based on knowledge capture from natural language texts. The assimilation process is centered around the linguistic and conceptual 'quality' of various forms of evidence underlying the generation, assessment and on-going refinement of lexical and concept hypotheses. On the basis of the strength of evidence, hypotheses are ranked according to plausibility, and the most reasonable ones are selected for assimilation into the given lexical class hierarchy and domain ontology.
ISBN: 1586030639 (print)
In order to transform a major portion of the Unified Medical Language System (UMLS) into a formally sound description logics system, we have developed a four-step knowledge engineering approach. We report on experiments with a LOOM knowledge base which consists of 164,000 concepts and 76,000 relations covering human anatomy and pathology. We discuss the fully automated as well as the interactive, semiautomatic steps. The latter focus on the manual work necessary to render the formal target representation structures complete and adequate.
SynDiKATe is a system for automatically acquiring knowledge from real-world texts and transferring it to formal representation structures which constitute a text knowledge base. We present a system architecture which ...