This paper introduces a new logic-based method for optimising the selection of compiler flags on embedded architectures. In particular, we use inductive logic programming (ILP) to learn logical rules that relate effective compiler flags to specific program features. Unlike earlier work, we aim to infer human-readable rules, and we seek to develop a relational first-order approach which automatically discovers relevant features rather than relying on a vector of predetermined attributes. To this end we generated a data set by measuring execution times of 60 benchmarks on an embedded system development board, and we developed an ILP prototype which outperforms the current state-of-the-art learning approach in 34 of the 60 benchmarks. Finally, we combined the strengths of the current state of the art and our ILP method in a hybrid approach which reduced execution times by an average of 8% and up to 50% in some cases.
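To make the flavor of such human-readable rules concrete, here is a minimal hypothetical sketch in Python. The feature names, thresholds, and flag choices are invented for illustration and are not the rules learned in the paper.

```python
# Hypothetical sketch: human-readable rules mapping program features to
# compiler flags. Feature names, thresholds, and flags are invented for
# illustration, not the paper's learned theory.

def select_flags(features: dict) -> set:
    """Apply simple learned-style rules to pick flags for one benchmark."""
    flags = set()
    # Rule 1 (hypothetical): loop-heavy programs benefit from unrolling.
    if features.get("loop_count", 0) > 4:
        flags.add("-funroll-loops")
    # Rule 2 (hypothetical): many small call sites suggest inlining pays off.
    if features.get("call_sites", 0) > 10 and features.get("avg_callee_size", 0) < 30:
        flags.add("-finline-functions")
    return flags

print(select_flags({"loop_count": 6, "call_sites": 12, "avg_callee_size": 18}))
# both hypothetical rules fire, so both flags are selected
```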
A central problem in inductive logic programming is theory evaluation. Without some sort of preference criterion, any two theories that explain a set of examples are equally acceptable. This paper presents a scheme for evaluating alternative inductive theories based on an objective preference criterion. It strives to extract maximal redundancy from the examples, transforming structure into randomness. A major strength of the method is its applicability to learning problems where negative examples of concepts are scarce or unavailable. A new measure called model complexity is introduced, and its use is illustrated and compared with a proof-complexity measure on relational learning tasks. The complementarity of model and proof complexity parallels that of model-theoretic and proof-theoretic semantics. Model complexity, where applicable, appears to be an appropriate measure for evaluating inductive logic theories.
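The preference criterion described here is in the spirit of minimum-description-length reasoning; a generic two-part formulation (an assumed illustration, not the paper's exact definition of model complexity) can be written as:

```latex
% Generic two-part, MDL-style preference criterion (assumed illustration,
% not the paper's exact model-complexity measure):
\[
T^{*} \;=\; \operatorname*{arg\,min}_{T}\; \bigl[\, L(T) + L(E \mid T) \,\bigr]
\]
% L(T): encoding length of the theory; L(E | T): length of the examples
% encoded with the theory's help. Extracting maximal redundancy from the
% examples corresponds to minimizing the total code length.
```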
Cross-validation is a useful and generally applicable technique often employed in machine learning, including decision tree induction. An important disadvantage of a straightforward implementation of the technique is its computational overhead. In this paper we show that, for decision trees, the computational overhead of cross-validation can be reduced significantly by integrating cross-validation with the normal decision tree induction process. We discuss how existing decision tree algorithms can be adapted to this aim, and provide an analysis of the speedups these adaptations may yield. We identify a number of parameters that influence the obtainable speedups, and validate and refine our analysis with experiments on a variety of data sets with two different implementations. Besides cross-validation, we also briefly explore the usefulness of these techniques for bagging. We conclude with some guidelines concerning when these optimizations should be considered.
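For reference, a minimal Python sketch of the straightforward approach whose overhead the paper targets, using scikit-learn purely as a baseline illustration: each of the k folds re-induces a tree from scratch, whereas the paper's adaptations share work across folds within a single induction process.

```python
# Baseline sketch: straightforward 10-fold cross-validation that re-induces a
# decision tree per fold. The paper's optimizations integrate fold bookkeeping
# into one induction pass instead of repeating induction k times.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=10)
print(f"mean accuracy over 10 folds: {scores.mean():.3f}")
```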
Sentence similarity methods assess the degree of similarity between sentences and play an important role in areas such as summarization, search, text categorization, and machine translation. Current methods for assessing sentence similarity rely only on the similarity between the words in the sentences: they either represent sentences as bag-of-words vectors or are restricted to the syntactic information of the sentences. Two important problems in language understanding are left unaddressed by such strategies: word order and the meaning of the sentence as a whole. The sentence similarity measure presented here substantially improves and refines a recently published method that takes into account the lexical, syntactic, and semantic components of sentences. Benchmarked on the Li-McLean dataset, the new method outperforms state-of-the-art systems and achieves results comparable to human evaluation. The proposed method was also extensively tested on the SemEval 2012 sentence similarity test set and in evaluating the degree of similarity between summaries using the CNN-corpus. In both cases, the proposed measure proved effective and useful.
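To illustrate the word-order component that bag-of-words methods miss, here is a small Python sketch of a word-order similarity in the style of Li et al.-type measures; the exact formulation is an assumption for illustration, not the paper's method.

```python
# Sketch of a word-order similarity: each sentence is mapped to a word-order
# vector over the joint vocabulary, and similarity is 1 - |r1 - r2| / |r1 + r2|.
# (Assumed formulation for illustration only.)
import math

def order_vector(words, vocab):
    # 1-based position of each vocabulary word in the sentence, 0 if absent.
    return [words.index(w) + 1 if w in words else 0 for w in vocab]

def word_order_similarity(s1: str, s2: str) -> float:
    w1, w2 = s1.lower().split(), s2.lower().split()
    vocab = sorted(set(w1) | set(w2))
    r1, r2 = order_vector(w1, vocab), order_vector(w2, vocab)
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))
    summ = math.sqrt(sum((a + b) ** 2 for a, b in zip(r1, r2)))
    return 1.0 - diff / summ if summ else 1.0

# ~0.64 despite identical bags of words: word order is what differs here.
print(word_order_similarity("the dog bit the man", "the man bit the dog"))
```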
We propose a novel framework for learning normal logic programs from transitions of interpretations. Given a set of pairs of interpretations (I, J) such that J = T_P(I), where T_P is the immediate consequence operator, we infer the program P. The learning framework can be applied repeatedly to identify Boolean networks from basins of attraction. Two algorithms have been implemented for this learning task and are compared using examples from the biological literature. We also show how to incorporate background knowledge and inductive biases, and then apply the framework to learning transition rules of cellular automata.
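A minimal Python sketch of this setting, assuming a propositional program with rules given as (head, positive body, negative body) triples; the toy program below is invented, not taken from the paper.

```python
# Minimal sketch of the immediate consequence operator T_P for a propositional
# normal logic program. Rules are (head, positive_body, negative_body).

def t_p(rules, interpretation: frozenset) -> frozenset:
    """One application of T_P: heads of all rules whose bodies hold in I."""
    return frozenset(
        head
        for head, pos, neg in rules
        if pos <= interpretation and not (neg & interpretation)
    )

# Toy oscillating network:  p <- q.   q <- not p.
rules = [("p", frozenset({"q"}), frozenset()),
         ("q", frozenset(), frozenset({"p"}))]

I = frozenset()
for _ in range(4):          # iterate the transition a few steps
    I = t_p(rules, I)
    print(sorted(I))        # ['q'], ['p', 'q'], ['p'], []
```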
General game playing (GGP) is a framework for evaluating an agent's general intelligence across a wide range of tasks. In the GGP competition, an agent is given the rules of a game (described as a logic program) that it has never seen before. The task is for the agent to play the game, thus generating game traces. The winner of the GGP competition is the agent that gets the best total score over all the games. In this paper, we invert this task: a learner is given game traces and the task is to learn the rules that could produce the traces. This problem is central toinductive general game playing(IGGP). We introduce a technique that automatically generates IGGP tasks from GGP games. We introduce an IGGP dataset which contains traces from 50 diverse games, such asSudoku,Sokoban, andCheckers. We claim that IGGP is difficult for existing inductive logic programming (ILP) approaches. To support this claim, we evaluate existing ILP systems on our dataset. Our empirical results show that most of the games cannot be correctly learned by existing systems. The best performing system solves only 40% of the tasks perfectly. Our results suggest that IGGP poses many challenges to existing approaches. Furthermore, because we can automatically generate IGGP tasks from GGP games, our dataset will continue to grow with the GGP competition, as new games are added every year. We therefore think that the IGGP problem and dataset will be valuable for motivating and evaluating future research.
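For concreteness, here is a hedged illustration of what a trace-based learning instance might look like; the facts below are invented and are not drawn from the IGGP dataset.

```python
# Invented illustration of the IGGP setting: traces are (state, action,
# next state) observations, and the learner must recover transition rules.
trace = [
    # (current state, action performed, resulting state)
    ({"cell(1,1,blank)"}, "mark(1,1,x)", {"cell(1,1,x)"}),
    ({"cell(1,1,x)", "cell(2,2,blank)"}, "mark(2,2,o)",
     {"cell(1,1,x)", "cell(2,2,o)"}),
]
# Target (informally): cell(R,C,P) holds in the next state if mark(R,C,P)
# was played; learned rules must reproduce every transition in the trace.
```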
Many forms of inductive logic programming (ILP) use metarules, second-order Horn clauses, to define the structure of learnable programs and thus the hypothesis space. Deciding which metarules to use for a given learning task is a major open problem and involves a trade-off between efficiency and expressivity: the hypothesis space grows with the number of metarules, so we wish to use fewer of them, but using too few loses expressivity. In this paper, we study whether fragments of metarules can be logically reduced to minimal finite subsets. We consider two traditional forms of logical reduction, subsumption and entailment, as well as a new reduction technique called derivation reduction, which is based on SLD-resolution. We compute reduced sets of metarules for fragments relevant to ILP and show theoretically whether these reduced sets are reductions for more general infinite fragments. We experimentally compare learning with reduced sets of metarules on three domains: Michalski trains, string transformations, and game rules. In general, derivation-reduced sets of metarules outperform subsumption- and entailment-reduced sets, both in predictive accuracy and in learning time.
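For readers unfamiliar with metarules, three standard examples from the metarule literature are shown below for illustration; the paper's reduced sets are computed, not listed here.

```latex
% Commonly cited metarules (illustrative; not the paper's reduced sets):
\begin{align*}
\textit{ident}:   &\quad P(A,B) \leftarrow Q(A,B) \\
\textit{inverse}: &\quad P(A,B) \leftarrow Q(B,A) \\
\textit{chain}:   &\quad P(A,B) \leftarrow Q(A,C),\, R(C,B)
\end{align*}
% P, Q, R are existentially quantified second-order variables;
% A, B, C are universally quantified first-order variables.
```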
The least general generalization (LGG) of strings may cause over-generalization when generalizing the clauses of predicates with string arguments. We propose a specific generalization (SG) for strings to reduce this over-generalization. SGs of strings are used to generalize a set of strings representing the arguments of a set of positive examples of a predicate with string arguments. To create an SG of two strings, a unique match sequence between them is first found. A unique match sequence of two strings consists of similarities and differences, representing the parts that are similar and the parts that differ between those strings. The differences in the unique match sequence are then replaced to create an SG of the two strings. In the generalization process, a coverage algorithm based on SGs of strings or learning heuristics based on match sequences are used.
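A minimal Python sketch of the idea, using difflib to find matching blocks; the paper defines its own unique match sequence, so treat this as an assumed approximation for illustration.

```python
# Sketch: keep shared substrings (similarities) verbatim and abstract each
# differing part with a fresh variable. difflib stands in for the paper's
# unique-match-sequence procedure (an assumption for illustration).
from difflib import SequenceMatcher

def specific_generalization(s1: str, s2: str) -> str:
    """Generalize two strings: similarities kept, differences -> variables."""
    out, variables = [], iter("XYZUVW")   # enough variables for a small sketch
    i = j = 0
    for a, b, size in SequenceMatcher(None, s1, s2).get_matching_blocks():
        if i < a or j < b:                # a difference precedes this match
            out.append(next(variables))   # abstract it with a fresh variable
        out.append(s1[a:a + size])        # similarity: shared substring
        i, j = a + size, b + size
    return "".join(out)

print(specific_generalization("ab2cd", "ab34cd"))  # abXcd
```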
Author: Yamamoto, A.
Hokkaido Univ, Div Elect & Informat Engn, Kita-ku, Sapporo, Hokkaido 060-8628, Japan
Hokkaido Univ, Meme Media Lab, Kita-ku, Sapporo, Hokkaido 060-8628, Japan
In this paper we propose an inference method for inductive logic programming (ILP) called Bottom Generalization. We give an inference procedure based on it, and prove that a hypothesis clause H is derived by the procedure from an example E under a background theory B iff H subsumes E relative to B in Plotkin's sense. The theory B can be any clausal theory, and the example E can be any clause not implied by B. The derived hypothesis H is a clause, but not always a definite one. The result is proved by defining a declarative semantics for arbitrary consistent clausal theories and showing that SE-resolution, originally introduced by Plotkin, gives their complete procedural semantics. We also show that Bottom Generalization is more powerful than both Jung's method based on the V-operator and Rouveirol's Saturant Generalization, but not more powerful than Muggleton's Inverse Entailment. At the ILP '97 workshop we called our inference method "Inverse Entailment," but we have renamed it "Bottom Generalization" because it differs from the original definition of Inverse Entailment.
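The completeness result is stated in terms of Plotkin's relative subsumption, which in its standard textbook form reads:

```latex
% Plotkin's relative subsumption (standard form): H subsumes E relative
% to a background theory B iff
\[
\exists\,\theta \;\text{ such that }\; B \models \forall\,( H\theta \rightarrow E )
\]
```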
In this paper we study the problem of classifying textual web reports, focusing specifically on situations in which structured information extracted from the reports is used for classification. We present an experimental classification system based on third-party linguistic analyzers, our previous work on web information extraction, and fuzzy inductive logic programming (fuzzy ILP). A detailed study of the resulting 'Fuzzy ILP Classifier' is the main contribution of the paper. The study includes formal models, a prototype implementation, extensive evaluation experiments, and a comparison of the classifier with alternatives such as decision trees, support vector machines, and neural networks.