Describes a tool for quantitatively discriminating between meningioma and astrocytoma tumors. One of the uses of magnetic resonance imaging (MRI) in clinical diagnosis is in-vivo discrimination between tumor and normal tissue and between tumor types in the brain. There is much interest in increasing the qualitative and quantitative information available from these images. This article presents a study that uses the inductive logic programming tool Progol on measurements of signal intensities in clinical scan images of 28 patients (18 with meningiomas and 10 with astrocytomas) to attempt to discover knowledge that quantitatively discriminates between the two types of tumors.
Inductive logic programming (ILP) is the study of machine learning systems that use clausal theories in first-order logic as a representation language. In this paper, we survey the theoretical foundations of ILP from the viewpoints of the Logic of Discovery and Machine Learning, and try to unify these two views with the support of the modern theory of logic programming. Firstly, we define several hypothesis construction methods in ILP and give their proof-theoretic foundations by treating them as procedures which complete incomplete proofs. Next, we discuss the design of individual learning algorithms using these hypothesis construction methods. We review known results on learning logic programs in computational learning theory, and show that these algorithms are instances of a generic learning strategy with proof completion methods.
Cross-validation is a useful and generally applicable technique often employed in machine learning, including decision tree induction. An important disadvantage of a straightforward implementation of the technique is its computational overhead. In this paper we show that, for decision trees, the computational overhead of cross-validation can be reduced significantly by integrating the cross-validation with the normal decision tree induction process. We discuss how existing decision tree algorithms can be adapted to this aim, and provide an analysis of the speedups these adaptations may yield. We identify a number of parameters that influence the obtainable speedups, and validate and refine our analysis with experiments on a variety of data sets with two different implementations. Besides cross-validation, we also briefly explore the usefulness of these techniques for bagging. We conclude with some guidelines concerning when these optimizations should be considered.
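One way such an integration could save work is sketched below. This is an illustrative assumption, not the algorithm from the paper: class counts are gathered per fold in a single pass over the data, and each fold's training-set counts are then derived as the total minus the held-out fold, so the data is not rescanned once per fold. The function names `fold_class_counts` and `training_counts` and the toy data are hypothetical.

```python
from collections import Counter

def fold_class_counts(labels, fold_of, n_folds):
    """Count class labels per fold in a single pass over the data."""
    per_fold = [Counter() for _ in range(n_folds)]
    for label, fold in zip(labels, fold_of):
        per_fold[fold][label] += 1
    return per_fold

def training_counts(per_fold):
    """Derive each fold's training-set counts as (total - held-out fold)."""
    total = Counter()
    for counts in per_fold:
        total.update(counts)
    result = []
    for counts in per_fold:
        train = Counter(total)   # copy the totals
        train.subtract(counts)   # remove the held-out fold's contribution
        result.append(train)
    return total, result

# Toy data (hypothetical): six examples, two classes, three folds.
labels = ["a", "a", "b", "b", "a", "b"]
fold_of = [0, 1, 2, 0, 1, 2]

per_fold = fold_class_counts(labels, fold_of, 3)
total, train = training_counts(per_fold)
for k, counts in enumerate(train):
    print(f"fold {k} held out, training counts: {dict(counts)}")
```

The same subtraction idea applies to any additive statistic used to score candidate splits; whether it pays off in practice depends on the data set and tree learner, as the abstract itself notes.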
Author: Yamamoto, A., Hokkaido Univ, Fac Technol & Meme Media Lab, Kita Ku, Sapporo, Hokkaido 0608628, Japan
For given logical formulae B and E such that B ⊭ E, hypothesis finding means the generation of a formula H such that B ∧ H ⊨ E. Hypothesis finding constitutes a basic technique for fields of inference, like inductive inference and knowledge discovery. In order to put various hypothesis finding methods proposed previously on one general ground, we use upward refinement and residue hypotheses. We show that their combination is a complete method for solving any hypothesis finding problem in clausal logic. We extend the relative subsumption relation, and show that some hypothesis finding methods previously presented can be regarded as finding hypotheses which subsume examples relative to a given background theory. Noting that the weakening rule may make hypothesis finding difficult to solve, we propose restricting this rule either to the inverse of resolution or to that of subsumption. We also note that this work is related to relevant logic. (C) 2002 Elsevier Science B.V. All rights reserved.
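The condition that defines a hypothesis finding problem can be checked by brute force in the propositional case. The sketch below is only an illustration of the definition (B alone does not entail E, but B together with H does); it is not the upward refinement or residue hypothesis method of the paper, which works in clausal logic. The helper `entails` and the atoms `bird` and `flies` are hypothetical.

```python
from itertools import product

def entails(premises, conclusion, atoms):
    """True if every truth assignment satisfying all premises also satisfies the conclusion."""
    for values in product([False, True], repeat=len(atoms)):
        env = dict(zip(atoms, values))
        if all(p(env) for p in premises) and not conclusion(env):
            return False
    return True

atoms = ["bird", "flies"]
B = lambda e: e["bird"]                      # background theory: bird
E = lambda e: e["flies"]                     # example to explain: flies
H = lambda e: (not e["bird"]) or e["flies"]  # candidate hypothesis: bird -> flies

b_alone = entails([B], E, atoms)      # B by itself does not entail E
b_with_h = entails([B, H], E, atoms)  # adding H makes E follow
print(b_alone, b_with_h)
```

Truth-table checking is exponential in the number of atoms and does not extend to first-order clauses, so it serves only to make the problem statement concrete.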
ISBN: 0780362624 (print)
Inductive logic programming (ILP) is a form of machine learning that induces rules from data using the language and syntax of logic programming. A rule construction algorithm forms rules that summarize data sets. These rules can be used in a large spectrum of data mining activities. In ILP, the rules are constructed with a target predicate as the consequent, or head, of the rule, and with high-ranking literals forming the antecedent, or body, of the rule. The predicate rankings are obtained by applying predicate ranking algorithms to a domain (background) knowledge base. In this work, we present three new predicate ranking algorithms for the inductive logic programming system INDED (pronounced "indeed"). The algorithms use a grouping technique employing basic set-theoretic operations to generate the rankings. We also present results of applying the ranking algorithms to several problem domains, some of which are universal, like the classical genealogy problem, and others less common. In particular, diagnosis is the main thread of many of our experiments. Here, although our experimentation relates to medical diagnosis in diabetes and Lyme disease, many of the same techniques and methodologies can be applied to other forms of diagnosis, including system failure, sensor detection, and troubleshooting.
In this paper, we propose an inductive logic programming learning method which aims at automatically extracting special Noun-Verb (N-V) pairs from a corpus in order to build up semantic lexicons based on Pustejovsky's Generative Lexicon (GL) principles (Pustejovsky, 1995). In one of the components of this lexical model, called the qualia structure, words are described in terms of semantic roles. For example, the telic role indicates the purpose or function of an item (cut for knife), the agentive role its creation mode (build for house), etc. The qualia structure of a noun is mainly made up of verbal associations, encoding relational information. The inductive logic programming learning method that we have developed enables us to automatically extract from a corpus N-V pairs whose elements are linked by one of the semantic relations defined in the qualia structure in GL, and to distinguish them, in terms of surrounding categorial context, from N-V pairs that are also present in sentences of the corpus but are not relevant. This method has been theoretically and empirically validated on a technical corpus. The N-V pairs that have been extracted will further be used in information retrieval applications for index expansion.
The study of protein structure has been driven largely by the careful inspection of experimental data by human experts. However, the rapid determination of protein structures from structural-genomics projects will make it increasingly difficult to analyse (and determine the principles responsible for) the distribution of proteins in fold space by inspection alone. Here, we demonstrate a machine-learning strategy that automatically determines the structural principles describing 45 folds. The rules learnt were shown to be both statistically significant and meaningful to protein experts. With the increasing emphasis on high-throughput experimental initiatives, machine-learning and other automated methods of analysis will become increasingly important for many biological problems. (C) 2003 Elsevier Ltd. All rights reserved.
ISBN: 3540200851 (print)
The interest of introducing fuzzy predicates when learning rules is twofold. When dealing with numerical data, it enables us to avoid arbitrary discretization. Moreover, it enlarges the expressive power of what is learned by considering different types of fuzzy rules, which may describe gradual behaviors of related attributes or uncertainty pervading conclusions. This paper describes different types of first-order fuzzy rules and a method for learning each type. Finally, we discuss the interest of each type of rule on a benchmark example.
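The point about avoiding arbitrary discretization can be made concrete with a membership function. The sketch below is an assumption for illustration and is not taken from the paper: it contrasts a crisp interval test, where a value just outside the boundary is rejected outright, with a trapezoidal fuzzy set that assigns it a degree of membership. The function names and the "warm" temperature thresholds are hypothetical.

```python
def trapezoid(x, a, b, c, d):
    """Membership degree in [0, 1] for a trapezoidal fuzzy set with a < b <= c < d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    if x <= c:
        return 1.0                 # plateau
    return (d - x) / (d - c)       # falling edge

def crisp(x, low, high):
    """Crisp discretization: 1.0 inside [low, high), 0.0 otherwise."""
    return 1.0 if low <= x < high else 0.0

# Hypothetical predicate "warm" over temperatures in degrees Celsius.
for t in (17.5, 19.9, 22.0, 27.5):
    print(t, crisp(t, 20, 25), trapezoid(t, 15, 20, 25, 30))
```

With the crisp test, 19.9 and 20.0 fall on opposite sides of the cut; with the fuzzy set, their membership degrees differ only slightly.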
ISBN: 3540403833 (print)
The paper presents an algorithm based on inductive logic programming for inducing first-order Horn clauses involving fuzzy predicates from a database. For this, a probabilistic processing of fuzzy function is used, in agreement with the handling of probabilities in first-order logic. This technique is illustrated on an experimental application. The interest of learning fuzzy first-order logic expressions is emphasized.
ISBN: 3540403000 (print)
A first-order Bayesian network (FOBN) is an extension of first-order logic designed to cope with uncertainty. Learning an FOBN is therefore a promising way to build an effective classifier. However, because of the complexity of the FOBN, learning it directly from relational data is difficult. This paper proposes another way to learn FOBN classifiers. We adapt inductive logic programming (ILP) and a Bayesian network learner to construct the FOBN. To do this, we propose a feature extraction algorithm to generate the significant parts (features) of ILP rules, and use these features as the main structure of the induced FOBN. Next, to learn the remaining parts of the FOBN structure and its conditional probability tables with a standard Bayesian network learner, we also propose an efficient propositionalisation algorithm for translating the original data into a single-table format. In this work, we provide a preliminary evaluation on the mutagenesis problem, a standard dataset for relational learning problems. The results are compared with the state-of-the-art ILP learner, the PROGOL system.