作者:
Divina, FTilburg Univ
Computat Linguist & AI Sect NL-5000 LE Tilburg Netherlands
This paper presents an overview of evolutionary approaches to inductive logic programming (ILP). After a short description of the two popular ILP systems FOIL and Progol, we focus on methods based on evolutionary algo...
详细信息
This paper presents an overview of evolutionary approaches to inductive logic programming (ILP). After a short description of the two popular ILP systems FOIL and Progol, we focus on methods based on evolutionary algorithms (EAs). Six systems are described and compared by means of the following aspects: search strategy, representation, hypothesis evaluation, search operators and biases adopted for limiting the hypothesis space. We discuss possible advantages and drawbacks related to the specific features of the systems along these aspects. Issues concerning the relative performance and efficiency of the systems are addressed.
The paper identifies several new properties of the lattice induced by the subsumption relation over first-order clauses and derives implications of these for learnability. In particular, it is shown that the length of...
详细信息
The paper identifies several new properties of the lattice induced by the subsumption relation over first-order clauses and derives implications of these for learnability. In particular, it is shown that the length of subsumption chains of function free clauses with bounded size can be exponential in the size. This suggests that simple algorithmic approaches that rely on repeating minimal subsumption-based refinements may require a long time to converge. It is also shown that with bounded size clauses the subsumption lattice has a large branching factor. This is used to show that the class of first-order length-bounded monotone clauses is not properly learnable from membership queries alone. Finally, the paper studies pairing, a generalization operation that takes two clauses and returns a number of possible generalizations. It is shown that there are clauses with an exponential number of pairing results which are not related to each other by subsumption. This is used to show that recent pairing-based algorithms can make exponentially many queries on some learning problems. (c) 2005 Elsevier Inc. All rights reserved.
The least general generalization (LGG) of strings may cause an over-generalization in the generalization process of the clauses of predicates with string arguments. We propose a specific generalization (SG) for string...
详细信息
The least general generalization (LGG) of strings may cause an over-generalization in the generalization process of the clauses of predicates with string arguments. We propose a specific generalization (SG) for strings to reduce over-generalization. SGs of strings are used in the generalization of a set of strings representing the arguments of a set of positive examples of a predicate with string arguments. In order to create a SG of two strings, first, a unique match sequence between these strings is found. A unique match sequence of two strings consists of similarities and differences to represent similar parts and differing parts between those strings. The differences in the unique match sequence are replaced to create a SG of those strings. In the generalization process, a coverage algorithm based on SGs of strings or learning heuristics based on match sequences are used.
A program-transformation system is determined by a repertoire of correctness-preserving rules, such as folding and unfolding. Normally, we would like the folding rule to be in some sense the inverse of the unfolding r...
详细信息
A program-transformation system is determined by a repertoire of correctness-preserving rules, such as folding and unfolding. Normally, we would like the folding rule to be in some sense the inverse of the unfolding rule. Typically, however, the folding rule of logic program transformation systems is an inverse of a limited kind of unfolding. In many cases this limited kind of folding suffices. We argue, nevertheless, that it is both important and possible to extend such a folding so as to be able to fold the clauses resulting from any unfolding of a positive literal. This extended folding rule allows us to derive some programs underivable by the existing version of this rule alone. In addition, our folding rule has applications to decompilation and reengineering, where we are interested in obtaining high-level program constructs from low-level program constructs. Moreover, we establish a connection between logic program transformation and inductive logic programming. This connection stems from viewing our folding rule as a common extension of the existing multiple-clause folding, rule, on the one hand, and an operator devised in inductive logic programming, called "intra-construction," on the other hand. Hence, our folding rule can be regarded as a step towards incorporating inductive inference into logic program transformation. We prove correctness with respect to Dung and Kanchanasut's semantic kernel.
We develop kernels for measuring the similarity between relational instances using background knowledge expressed in first-order logic. The method allows us to bridge the gap between traditional inductivelogic progra...
详细信息
We develop kernels for measuring the similarity between relational instances using background knowledge expressed in first-order logic. The method allows us to bridge the gap between traditional inductive logic programming (ILP) representations and statistical approaches to supervised learning. logic programs are first used to generate proofs of given visitor programs that use predicates declared in the available background knowledge. A kernel is then defined over pairs of proof trees. The method can be used for supervised learning tasks and is suitable for classification as well as regression. We report positive empirical results on Bongard-like and M-of-N problems that are difficult or impossible to solve with traditional ILP techniques, as well as on real bioinformatics and chemoinformatics data sets.
We present a trainable sequential-inference technique for processes with large state and observation spaces and relational structure. We apply our technique to the problem of force-dynamic state inference from video, ...
详细信息
We present a trainable sequential-inference technique for processes with large state and observation spaces and relational structure. We apply our technique to the problem of force-dynamic state inference from video, which is a critical component of the LEONARD [J.M. Siskind, Grounding lexical semantics of verbs in visual perception using force dynamics and event logic, Journal of Artificial Intelligence Research 15 (2001) 31-90] visual-event recognition system. LEONARD uses event definitions that are grounded in force-dynamic primitives-making robust and efficient force-dynamic inference critical to good performance. Our sequential-inference method assumes "reliable observations", i.e., that each process state (e.g., force-dynamic state) persists long enough to be reliably inferred from the observations (e.g., video frames) it generates. We introduce the idea of a "state-inference function" (from observation sequences to underlying hidden states) for representing knowledge about a process and develop an efficient sequential-inference algorithm, utilizing this function, that is correct for processes that generate reliable observations consistent with the state-inference function. We describe a representation for state-inference functions in relational domains and give a corresponding supervised learning algorithm. Our experiments in force-dynamic state inference show that our technique provides significantly improved accuracy and speed relative to a variety of recent, hand-coded, non-trainable systems, and a trainable system based on probabilistic modeling. (C) 2006 Published by Elsevier B.V.
Many domains in the field of inductive logic programming (ILP) involve highly unbalanced data. A common way to measure performance in these domains is to use precision and recall instead of simply using accuracy. The ...
详细信息
Many domains in the field of inductive logic programming (ILP) involve highly unbalanced data. A common way to measure performance in these domains is to use precision and recall instead of simply using accuracy. The goal of our research is to find new approaches within ILP particularly suited for large, highly-skewed domains. We propose Gleaner, a randomized search method that collects good clauses from a broad spectrum of points along the recall dimension in recall-precision curves and employs an "at least L of these K clauses" thresholding method to combine sets of selected clauses. Our research focuses on Multi-Slot Information Extraction (IE), a task that typically involves many more negative examples than positive examples. We formulate this problem into a relational domain, using two large testbeds involving the extraction of important relations from the abstracts of biomedical journal articles. We compare Gleaner to ensembles of standard theories learned by Aleph, finding that Gleaner produces comparable testset results in a fraction of the training time.
We propose a simple approach to combining first-order logic and probabilistic graphical models in a single representation. A Markov logic network (MLN) is a first-order knowledge base with a weight attached to each fo...
详细信息
We propose a simple approach to combining first-order logic and probabilistic graphical models in a single representation. A Markov logic network (MLN) is a first-order knowledge base with a weight attached to each formula (or clause). Together with a set of constants representing objects in the domain, it specifies a ground Markov network containing one feature for each possible grounding of a first-order formula in the KB, with the corresponding weight. Inference in MLNs is performed by MCMC over the minimal subset of the ground network required for answering the query. Weights are efficiently learned from relational databases by iteratively optimizing a pseudo-likelihood measure. Optionally, additional clauses are learned using inductive logic programming techniques. Experiments with a real-world database and knowledge base in a university domain illustrate the promise of this approach.
Recent statistical performance studies of search algorithms in difficult combinatorial problems have demonstrated the benefits of randomising and restarting the search procedure. Specifically, it has been found that i...
详细信息
Recent statistical performance studies of search algorithms in difficult combinatorial problems have demonstrated the benefits of randomising and restarting the search procedure. Specifically, it has been found that if the search cost distribution of the non-restarted randomised search exhibits a slower-than-exponential decay (that is, a "heavy tail"), restarts can reduce the search cost expectation. We report on an empirical study of randomised restarted search in ILP. Our experiments conducted on a high-performance distributed computing platform provide an extensive statistical performance sample of five search algorithms operating on two principally different classes of ILP problems, one represented by an artificially generated graph problem and the other by three traditional classification benchmarks (mutagenicity, carcinogenicity, finite element mesh design). The sample allows us to (1) estimate the conditional expected value of the search cost (measured by the total number of clauses explored) given the minimum clause score required and a "cutoff" value (the number of clauses examined before the search is restarted), (2) estimate the conditional expected clause score given the cutoff value and the invested search cost, and (3) compare the performance of randomised restarted search strategies to a deterministic non-restarted search. Our findings indicate striking similarities across the five search algorithms and the four domains, in terms of the basic trends of both the statistics (1) and (2). Also, we observe that the cutoff value is critical for the performance of the search algorithm, and using its optimal value in a randomised restarted search may decrease the mean search cost (by several orders of magnitude) or increase the mean achieved score significantly with respect to that obtained with a deterministic non-restarted search.
We study several complexity parameters for first order formulas and their suitability for first order learning models. We show that the standard notion of size is not captured by sets of parameters that are used in th...
详细信息
We study several complexity parameters for first order formulas and their suitability for first order learning models. We show that the standard notion of size is not captured by sets of parameters that are used in the literature and thus they cannot give a complete characterization in terms of learnability with polynomial resources. We then identify an alternative notion of size and a simple set of parameters that are useful for first order Horn Expressions. These parameters are the number of clauses in the expression, the maximum number of distinct terms in a clause, and the maximum number of literals in a clause. Matching lower bounds derived using the Vapnik Chervonenkis dimension complete the picture showing that these parameters are indeed crucial.
暂无评论