In this article, we discuss the problem of learning a classifier for relational data with two application examples from inductive logic programming (ILP). We present important techniques and questions of graph-based and logical learning, especially their computational complexity. Based on this discussion, the learning system INDIGO is described, which relies on an efficient graph-based transformation of the relational learning task into a feature-based problem that can be solved with classical feature-based methods such as CAL3 or ID3. Using INDIGO as an example, it can be seen that graph-theoretical approaches can make an important contribution to relational machine learning that cannot be achieved in the same way with logical methods stemming from the field of ILP. Therefore, one concern of this work is the comparison of graph-theoretical and logical learning methods. This is done by describing the hybrid learning system TRITOP, which uses graph-theoretical methods together with a logical notation and ILP concepts such as subsumption of clauses. TRITOP uses the so-called α-subsumption, a restriction of the well-known θ-subsumption. As an extension of the two learning systems, we also discuss the efficient construction of class prototypes using a neural network.
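The θ-subsumption relation that TRITOP restricts can be sketched directly: clause C θ-subsumes clause D if some substitution θ maps the variables of C so that every literal of Cθ occurs in D. A minimal illustration, assuming a tuple encoding of literals (uppercase strings are variables, lowercase strings are constants); the function names are ours, not TRITOP's:

```python
# Minimal sketch of theta-subsumption: clause c subsumes clause d if some
# substitution theta makes every literal of c*theta a member of d.
# Literals are tuples: (predicate, arg1, arg2, ...).

def subsumes(c, d, theta=None):
    """Return True if clause c theta-subsumes clause d."""
    theta = theta or {}
    if not c:                      # every literal of c has been matched
        return True
    lit, rest = c[0], c[1:]
    for cand in d:
        binding = match(lit, cand, dict(theta))
        if binding is not None and subsumes(rest, d, binding):
            return True
    return False

def match(lit, cand, theta):
    """Try to extend theta so that lit*theta equals cand; None on failure."""
    if lit[0] != cand[0] or len(lit) != len(cand):
        return None
    for a, b in zip(lit[1:], cand[1:]):
        if a[0].isupper():         # a is a variable
            if theta.get(a, b) != b:
                return None        # already bound to a different term
            theta[a] = b
        elif a != b:               # constants must agree exactly
            return None
    return theta

# p(X,Y) subsumes {p(a,b), q(b)} via X->a, Y->b; p(X,X) does not subsume p(a,b)
print(subsumes([("p", "X", "Y")], [("p", "a", "b"), ("q", "b")]))  # True
print(subsumes([("p", "X", "X")], [("p", "a", "b")]))              # False
```

The backtracking over candidate literals is what makes the general test NP-complete, which is one motivation for restrictions such as α-subsumption.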
Learning recursive rules and inventing predicates are difficult tasks for inductive logic programming techniques. We propose an approach where, given a set of examples and counter-examples and background knowledge, a human expert must propose constructive rules in order to parse the examples. These rules are used to associate a tree with each example (or counter-example). Through type inference, each tree is transformed into a many-sorted term. These are then used as input for a grammatical inference algorithm that returns a deterministic tree automaton. The automaton is finally combined with the expert knowledge in order to obtain a logic program for the concept described by the examples. In this paper we describe the general construction of GIFT and its main algorithms, argue for the necessity of the human expert, and show how it performs on some benchmarks.
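The deterministic tree automaton returned by the grammatical-inference step can be illustrated with a small evaluator, assuming a tuple encoding of terms; the automaton below (recognizing lists of a's) is an invented example, not one from GIFT:

```python
# Sketch of a deterministic bottom-up tree automaton: terms are
# (symbol, child1, child2, ...) tuples, transitions map a symbol plus the
# tuple of its children's states to a state, and a term is accepted when
# its root evaluates to a final state.

def run(term, delta):
    """Evaluate a term bottom-up; return its state, or None if undefined."""
    sym, children = term[0], term[1:]
    states = tuple(run(c, delta) for c in children)
    if None in states:
        return None
    return delta.get((sym, states))

def accepts(term, delta, finals):
    return run(term, delta) in finals

# Toy automaton for the language of lists of a's:
#   nil -> q,  a -> qa,  cons(qa, q) -> q
delta = {
    ("nil", ()): "q",
    ("a", ()): "qa",
    ("cons", ("qa", "q")): "q",
}
t = ("cons", ("a",), ("cons", ("a",), ("nil",)))
print(accepts(t, delta, {"q"}))  # True
```

Determinism is what makes the bottom-up evaluation a single linear pass over the term.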
In Czech corpora, compound verb groups are usually tagged in a word-by-word manner. As a consequence, some of the morphological tags of particular components of the verb group lose their original meaning. We present a method for automatic recognition of compound verb groups in Czech. From an annotated corpus, 126 definite clause grammar rules were constructed. These rules describe all compound verb groups that are frequent in Czech. Using those rules, we can find compound verb groups in unannotated texts with 93% accuracy. We also describe how the verb rules are exploited to tag compound verb groups in an annotated corpus.
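The 126 rules themselves are not given in the abstract; as a rough illustration of the idea (rules over morphological tags matched against a tagged clause), here is a toy matcher with two invented tag patterns, which stand in for the real definite clause grammar rules:

```python
# Illustrative only: two invented patterns over hypothetical morphological
# tags, scanned against a tagged clause to locate compound verb groups.
# The tag names (AUX_FUT, INF, ...) and rules are placeholders, not the
# paper's actual rule set.

RULES = [
    ("future", ["AUX_FUT", "INF"]),        # auxiliary future + infinitive
    ("past",   ["AUX_PAST", "PART_PAST"]), # auxiliary past + past participle
]

def find_groups(tags):
    """Return (rule_name, start_index) for every rule matching a tag window."""
    hits = []
    for name, pattern in RULES:
        for i in range(len(tags) - len(pattern) + 1):
            if tags[i:i + len(pattern)] == pattern:
                hits.append((name, i))
    return hits

print(find_groups(["PRON", "AUX_FUT", "INF"]))  # [('future', 1)]
```

The real rules are definite clause grammars, so they can also handle discontinuous groups that a fixed-window matcher like this cannot.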
ISBN (print): 354062709X
In this paper, a learning system is presented which integrates an ECG waveform classifier (called PECG) with an interactive learner (called IMPUT). The PECG system is based on an attribute grammar specification of ECGs that has been transformed to Prolog. The IMPUT system combines the interactive debugging technique IDT with the unfolding algorithm introduced in SPECTRE. Using the IMPUT system, we can effectively assist in preparing the correct description of the basic structures of ECG waveforms.
A knowledge-based system uses its database (also known as its "theory") to produce answers to the queries it receives. Unfortunately, these answers may be incorrect if the underlying theory is faulty. Standard "theory revision" systems use a given set of "labeled queries" (each a query paired with its correct answer) to transform the given theory, by adding and/or deleting rules and/or antecedents, into a related theory that is as accurate as possible. After formally defining the theory revision task, this paper provides both sample and computational complexity bounds for this process. It first specifies the number of labeled queries necessary to identify, with high probability, a revised theory whose error is close to minimal. It then considers the computational complexity of finding this best theory, and proves that, unless P = NP, no polynomial-time algorithm can identify this optimal revision, even given the exact distribution of queries, except in certain simple situations. It also shows that, except in such simple situations, no polynomial-time algorithm can produce a theory whose error is even close to (i.e., within a particular polynomial factor of) optimal. The first (sample complexity) results suggest reasons why theory revision can be more effective than learning from scratch, while the second (computational complexity) results explain many aspects of the standard theory revision systems, including the practice of hill-climbing to a locally optimal theory, based on a given set of labeled queries. (C) 1999 Elsevier Science B.V. All rights reserved.
The multiscalar architecture advocates a distributed processor organization and task-level speculation to exploit high degrees of instruction level parallelism (ILP) in sequential programs without impeding improvements in clock speeds. The main goal of this paper is to understand the key implications of the architectural features of distributed processor organization and task-level speculation for compiler task selection from the point of view of performance. We identify the fundamental performance issues to be: control flow speculation, data communication, data dependence speculation, load imbalance, and task overhead. We show that these issues are intimately related to a few key characteristics of tasks: task size, intertask control flow, and intertask data dependence. We describe compiler heuristics to select tasks with favorable characteristics. We report experimental results to show that the heuristics are successful in boosting overall performance by establishing larger ILP windows. We also present a breakdown of execution times to show that register wait, load imbalance, control flow squash, and conventional pipeline losses are significant for almost all the SPEC95 benchmarks. (C) 1999 Academic Press.
This paper is a survey of inductive rule learning algorithms that use a separate-and-conquer strategy. This strategy can be traced back to the AQ learning system and still enjoys popularity, as can be seen from its frequent use in inductive logic programming systems. We will put this wide variety of algorithms into a single framework and analyze them along three different dimensions, namely their search, language, and overfitting-avoidance biases.
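The separate-and-conquer strategy the survey covers has a simple core: learn one rule that covers some positives and no negatives, remove the positives it covers, and repeat. A minimal sketch, assuming examples are attribute-value dicts, rules are conjunctions of attribute tests, the refinement step is the simplest greedy choice, and the training set is consistent (every negative can be excluded):

```python
# Separate-and-conquer rule learning, in its simplest greedy form.
# A rule is a list of (attribute, value) tests; an example is a dict.

def covers(rule, example):
    return all(example.get(a) == v for a, v in rule)

def learn_rule(pos, neg):
    """Greedily add tests until the rule covers no negative example."""
    rule = []
    while any(covers(rule, n) for n in neg):
        tests = {(a, v) for e in pos for a, v in e.items()}
        # pick the test keeping the most positives while pruning negatives
        best = max(
            (t for t in tests if t not in rule),
            key=lambda t: sum(covers(rule + [t], p) for p in pos)
                          - sum(covers(rule + [t], n) for n in neg),
        )
        rule.append(best)
        pos = [p for p in pos if covers(rule, p)]
    return rule

def separate_and_conquer(pos, neg):
    """The 'conquer one rule, separate its positives, repeat' loop."""
    rules = []
    while pos:
        rule = learn_rule(pos, neg)
        rules.append(rule)
        pos = [p for p in pos if not covers(rule, p)]
    return rules

pos = [{"size": "big", "color": "red"}, {"size": "big", "color": "blue"}]
neg = [{"size": "small", "color": "red"}]
print(separate_and_conquer(pos, neg))  # [[('size', 'big')]]
```

The survey's three dimensions map onto this skeleton: the search bias lives in `learn_rule`'s refinement and evaluation, the language bias in what a `rule` may contain, and overfitting avoidance in pre- or post-pruning steps this sketch omits.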
One of the most important issues in instruction-level parallelism (ILP) processors involves the boosting of instructions across conditional branches for speculative execution. A compiler scheduling technique named LESS with a renaming function is proposed for the elimination of hazards that incorrectly overwrite a value when the branch is incorrectly predicted during speculative execution. The hardware implementation for this method is relatively simple and rather efficient. Simulation results show that the speedups achieved by LESS are better than other existing methods. For example, under the superscalar execution model, with an issue rate of 8, the average performance improvement by LESS can be expected to be 13% better than that of the CRF scheme, a solution reported recently with a scheduling skeleton similar to LESS.
This paper is a study of the problem of relevance in inductive concept learning. It gives definitions of irrelevant literals and irrelevant examples and presents efficient algorithms that enable their elimination. The proposed approach is directly applicable in propositional learning and in relational learning tasks that can be solved using a LINUS transformation approach. A simple inductive logic programming (ILP) problem is used to illustrate the approach to irrelevant literal and example elimination. Results of utility studies show the usefulness of literal reduction applied in LINUS and in the search of refinement graphs. (C) 1999 Elsevier Science Inc. All rights reserved.
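The flavor of literal elimination can be sketched in coverage terms, under our simplified reading of the relevance definitions (the paper's exact criteria differ in detail): represent each literal by the sets of positive and negative examples on which it is true, and discard a literal when another literal is true on at least its positives and on at most its negatives.

```python
# Sketch of irrelevant-literal elimination via pairwise coverage comparison.
# literals: {name: (positives_true_on, negatives_true_on)} with sets of ids.
# Literal a is irrelevant if some other literal b covers at least a's
# positives while covering no more than a's negatives (and is not identical).

def irrelevant(literals):
    """Return the names of literals dominated by some better literal."""
    out = set()
    names = sorted(literals)
    for a in names:
        pa, na = literals[a]
        for b in names:
            if b == a:
                continue
            pb, nb = literals[b]
            if pa <= pb and nb <= na and (pa, na) != (pb, nb):
                out.add(a)     # b is at least as good on every example
                break
    return out

# l2 is true on more positives and fewer negatives than l1, so l1 drops out
print(irrelevant({"l1": ({1, 2}, {5}), "l2": ({1, 2, 3}, set())}))  # {'l1'}
```

Dropping dominated literals shrinks the hypothesis space without losing any complete and consistent hypothesis, which is what makes the reduction safe to apply before search.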
Author: Taylor, K. (CSIRO Math & Informat Sci, Canberra ACT 2601, Australia)
Absorption is one of the so-called inverse resolution operators of inductive logic programming. The paper studies the properties of absorption that make it suitable for incremental generalization of definite clauses using background knowledge represented by a definite program. The soundness and completeness of the operator are established according to Buntine's model of generalization called generalized subsumption. The completeness argument proceeds by viewing absorption as the inversion of SLD-resolution. In addition, some simplifying techniques are introduced for reducing the non-determinism inherent in usual presentations of absorption. The effect of these simplifications on completeness is discussed. (C) 1999 Elsevier Science Inc. All rights reserved.
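In its simplest propositional form (a simplification of the definite-clause setting the paper studies), absorption inverts one resolution step: given a background clause p :- Body, any clause whose body contains Body can be generalized by replacing Body with p. A minimal sketch, with our own clause encoding:

```python
# Propositional sketch of the absorption operator.
# A clause is (head, frozenset_of_body_atoms).

def absorb(clause, background):
    """Replace the background clause's body inside clause's body by its head;
    return the generalized clause, or None if absorption does not apply."""
    head, body = clause
    bhead, bbody = background
    if bbody <= body:
        return (head, (body - bbody) | {bhead})
    return None

# background: flies :- bird, normal
# example:    happy :- bird, normal, fed
# absorption: happy :- flies, fed
print(absorb(("happy", frozenset({"bird", "normal", "fed"})),
             ("flies", frozenset({"bird", "normal"}))))
```

Even in this toy form, the non-determinism the paper addresses is visible: when several background clauses (or several instances of one) fit inside the body, each choice yields a different generalization.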