Inductive logic programming (ILP)-based concept discovery systems aim to find patterns that describe a target relation in terms of other relations provided as background knowledge. Such systems usually work within a first-order logic framework, build large search spaces, and have long running times. Memoization has been widely incorporated in concept discovery systems to improve their running times. One problem that memoization brings to such systems is memory overhead, which may become a bottleneck. In this work we propose policies that decide what types of concept descriptors to store in memo tables and for how long to keep them. The proposed policies have been implemented as extensions to a concept discovery system called Tabular CRIS wEF, and the resulting system is named Policy-based Tabular CRIS. The effects of the proposed policies are evaluated on several datasets. The experimental results show that the proposed policies greatly reduce memory consumption while preserving the benefits introduced by memoization.
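The idea of a policy that controls what enters a memo table and how long it stays can be illustrated with a minimal sketch. This is not the Policy-based Tabular CRIS implementation: the class name `PolicyMemoTable`, the `admit` predicate, the `max_age` parameter, and the notion of an "epoch" are all assumptions introduced here for illustration.

```python
from typing import Callable, Dict, Hashable, Optional, Tuple


class PolicyMemoTable:
    """Hypothetical memo table with an admission policy and an age limit."""

    def __init__(self, admit: Callable[[Hashable, float], bool], max_age: int):
        self.admit = admit        # policy: WHAT to store
        self.max_age = max_age    # policy: HOW LONG to keep it, in epochs
        self.epoch = 0
        # key -> (cached value, epoch in which the entry was last used)
        self._table: Dict[Hashable, Tuple[float, int]] = {}

    def lookup(self, key: Hashable) -> Optional[float]:
        entry = self._table.get(key)
        if entry is None:
            return None
        value, _ = entry
        self._table[key] = (value, self.epoch)  # a hit refreshes the entry
        return value

    def store(self, key: Hashable, value: float) -> None:
        # Entries rejected by the admission policy are never cached.
        if self.admit(key, value):
            self._table[key] = (value, self.epoch)

    def next_epoch(self) -> None:
        # Advance the clock and evict entries unused for more than max_age epochs.
        self.epoch += 1
        stale = [k for k, (_, used) in self._table.items()
                 if self.epoch - used > self.max_age]
        for k in stale:
            del self._table[k]

    def __len__(self) -> int:
        return len(self._table)
```

A table built with `admit=lambda clause, support: support >= 0.5` and `max_age=1` would, under these assumptions, cache only descriptors whose score is at least 0.5 and drop any entry that goes unused for more than one epoch, which trades some recomputation for a bounded memory footprint.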
Recently there has been an increasing amount of research on learning concepts expressed in subsets of Prolog; the term inductive logic programming (ILP) has been used to describe this growing body of research. This paper seeks to expand the theoretical foundations of ILP by investigating the pac-learnability of logic programs. We focus on programs consisting of a single function-free non-recursive clause, and on generalizations of a language known to be pac-learnable: namely, the language of determinate function-free clauses of constant depth. We demonstrate that a number of syntactic generalizations of this language are hard to learn, but that the language can be generalized to clauses of constant locality while still allowing pac-learnability. More specifically, we first show that determinate clauses of log depth are not pac-learnable, regardless of the language used to represent hypotheses. We then investigate the effect of allowing indeterminacy in a clause, and show that clauses with k indeterminate variables are as hard to learn as DNF. We next show that a more restricted language of clauses with bounded indeterminacy is learnable using k-CNF to represent hypotheses, and that restricting the "locality" of a clause to a constant allows pac-learnability even if an arbitrary amount of indeterminacy is allowed. This last result is also shown to be a strict generalization of the previous result for determinate function-free clauses of constant depth. Finally, we present some extensions of these results to logic programs with multiple clauses.
This paper contributes to the research on Learning in Databases in two ways. First, the concept of an inductive relation is introduced, as a natural development of other forms of intensional information, such as views and relations defined deductively. Second, a class of top-down methods for computing such inductive relations is analyzed, and major problems produced by recursive and interdependent relations are considered.
In this paper we investigate the learnability of relations in inductive logic programming, using equality theories as background knowledge. We assume that a hypothesis and an observation are, respectively, a definite program and a set of ground literals. The targets of our learning algorithm are relations. By using equality theories as background knowledge we introduce tree structure into definite programs. This structure enables us to narrow the search space of hypotheses. We give pairs of a hypothesis language and a knowledge language in order to discuss the learnability of relations from the viewpoint of inductive inference and PAC learning.
Data preprocessing is an important component of machine learning pipelines, which requires ample time and resources. An integral part of preprocessing is data transformation into the format required by a given learning algorithm. This paper outlines some of the modern data processing techniques used in relational learning that enable data fusion from different input data types and formats into a single table data representation, focusing on the propositionalization and embedding data transformation approaches. While both approaches aim at transforming data into tabular data format, they use different terminology and task definitions, are perceived to address different goals, and are used in different contexts. This paper contributes a unifying framework that allows for improved understanding of these two data transformation techniques by presenting their unified definitions, and by explaining the similarities and differences between the two approaches as variants of a unified complex data transformation task. In addition to the unifying framework, the novelty of this paper is a unifying methodology combining propositionalization and embeddings, which benefits from the advantages of both in solving complex data transformation and learning tasks. We present two efficient implementations of the unifying methodology: an instance-based PropDRM approach, and a feature-based PropStar approach to data transformation and learning, together with their empirical evaluation on several relational problems. The results show that the new algorithms can outperform existing relational learners and can solve much larger problems.
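As a rough illustration of what transforming relational data into a single table means, the sketch below flattens a one-to-many relation into one row per entity using aggregate features. It is not PropDRM or PropStar: the function name `propositionalize`, the customer/order schema, and the threshold of 100 are hypothetical choices made only for this example.

```python
from collections import defaultdict


def propositionalize(customers, orders):
    """Flatten a one-to-many relation into one feature row per customer.

    customers: iterable of (customer_id, age) tuples (hypothetical schema)
    orders:    iterable of (customer_id, amount) tuples (hypothetical schema)
    """
    # Group the "many" side of the relation by its foreign key.
    by_customer = defaultdict(list)
    for customer_id, amount in orders:
        by_customer[customer_id].append(amount)

    rows = []
    for customer_id, age in customers:
        amounts = by_customer.get(customer_id, [])
        rows.append({
            "customer_id": customer_id,
            "age": age,
            # Aggregate and existential features summarising related tuples.
            "n_orders": len(amounts),
            "total_amount": sum(amounts),
            "has_large_order": any(a > 100 for a in amounts),
        })
    return rows
```

Each resulting row can be fed to an ordinary attribute-value learner. An embedding-based transformation would instead map each entity to a dense numeric vector, which is the other family of approaches the abstract compares.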
The automated construction of dynamic system models is an important application area for ILP. We describe a method that learns qualitative models from time-varying physiological signals. The goal is to understand the complexity of the learning task when faced with numerical data, what signal processing techniques are required, and how this affects learning. The qualitative representation is based on Kuipers' QSIM. The learning algorithm for model construction is based on Coiera's GENMODEL. We show that QSIM models are efficiently PAC learnable from positive examples only, and that GENMODEL is an ILP algorithm for efficiently constructing a QSIM model. We describe both GENMODEL which performs RLGG on qualitative states to learn a QSIM model, and the front-end processing and segmenting stages that transform a signal into a set of qualitative states. Next we describe results of experiments on data from six cardiac bypass patients. Useful models were obtained, representing both normal and abnormal physiological states. Model variation across time and across different levels of temporal abstraction and fault tolerance is explored. The assumption made by many previous workers that the abstraction of examples from data can be separated from the learning task is not supported by this study. Firstly, the effects of noise in the numerical data manifest themselves in the qualitative examples. Secondly, the models learned are directly dependent on the initial qualitative abstraction chosen.
The inductive synthesis of recursive logic programs from incomplete information, such as input/output examples, is a challenging subfield both of inductive logic programming (ILP) and of the synthesis (in general) of logic programs from formal specifications. We first overview past and present achievements, focusing on the techniques that were designed specifically for the inductive synthesis of recursive logic programs but also discussing a few general ILP techniques that can also induce non-recursive hypotheses. Then we analyse the prospects of these techniques in this task, investigating their applicability to software engineering as well as to knowledge acquisition and discovery. (C) 1999 Elsevier Science Inc. All rights reserved.
In this paper we propose a new formalization of the inductive logic programming (ILP) problem for a better handling of exceptions. It is now encoded in first-order possibilistic logic. This allows us to handle exceptions by means of prioritized rules, thus taking lessons from non-monotonic reasoning. Indeed, in classical first-order logic, the exceptions of the rules that constitute a hypothesis accumulate, and classifying an example in two different classes, even if one is the right one, is not correct. The possibilistic formalization provides a sound encoding of non-monotonic reasoning that copes with rules with exceptions and prevents an example from being classified in more than one class. The benefits of our approach with respect to the use of first-order decision lists are pointed out. The possibilistic logic view of the ILP problem leads to an optimization problem at the algorithmic level. An algorithm based on simulated annealing that computes the set of rules together with their priority levels in a single pass is proposed. The reported experiments show that the algorithm is competitive with standard ILP approaches on benchmark examples. (c) 2007 Elsevier B.V. All rights reserved.
We develop a general theoretical framework for statistical logical learning with kernels based on dynamic propositionalization, where structure learning corresponds to inferring a suitable kernel on logical objects, and parameter learning corresponds to function learning in the resulting reproducing kernel Hilbert space. In particular, we study the case where structure learning is performed by a simple FOIL-like algorithm, and propose alternative scoring functions for guiding the search process. We present an empirical evaluation on several data sets in the single-task as well as in the multi-task setting.
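For context on what a scoring function in a FOIL-like search looks like, the sketch below computes the classical FOIL information gain for adding one literal to a clause. This is the standard textbook heuristic, not one of the alternative scoring functions the abstract proposes; the function name `foil_gain` and its parameter names are assumptions made for this example.

```python
import math


def foil_gain(p0: int, n0: int, p1: int, n1: int, t: int) -> float:
    """Classical FOIL gain for refining a clause with one extra literal.

    p0, n0: positive / negative bindings covered before adding the literal
    p1, n1: positive / negative bindings covered after adding the literal
    t:      positive bindings of the old clause still covered by the new one
    """
    if p0 == 0 or p1 == 0:
        # No positives covered: the refinement carries no useful information.
        return 0.0
    info_before = -math.log2(p0 / (p0 + n0))
    info_after = -math.log2(p1 / (p1 + n1))
    return t * (info_before - info_after)
```

A greedy FOIL-style learner would evaluate this score for every candidate literal and keep the one with the highest gain; the paper replaces this kind of score with kernel-based alternatives.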