This is a review paper whose goal is to significantly improve our understanding of the crucial role of attribute interaction in data mining. The main contributions of this paper are as follows. Firstly, we show that the concept of attribute interaction has a crucial role across different kinds of problems in data mining, such as attribute construction, coping with small disjuncts, induction of first-order logic rules, detection of Simpson's paradox, and finding several types of interesting rules. Hence, a better understanding of attribute interaction can lead to a better understanding of the relationship between these kinds of problems, which are usually studied separately from each other. Secondly, we draw attention to the fact that most rule induction algorithms are based on a greedy search which does not cope well with the problem of attribute interaction, and point out some alternative kinds of rule discovery methods which tend to cope better with this problem. Thirdly, we discuss several algorithms and methods for discovering interesting knowledge that, implicitly or explicitly, are based on the concept of attribute interaction.
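To illustrate why greedy, one-attribute-at-a-time search can miss attribute interaction, the following minimal Python sketch uses an XOR target, which is an assumed toy example and not data from the paper. Each attribute alone yields zero information gain, while the pair taken jointly determines the class. The function names `entropy` and `information_gain` are illustrative.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attrs):
    """Entropy reduction obtained by splitting on the joint value of `attrs`."""
    groups = {}
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        groups.setdefault(key, []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Toy dataset (assumption): the class is the XOR of two binary attributes.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [a ^ b for a, b in rows]

gain_a = information_gain(rows, labels, [0])      # attribute 0 alone
gain_b = information_gain(rows, labels, [1])      # attribute 1 alone
gain_ab = information_gain(rows, labels, [0, 1])  # both attributes jointly
print(gain_a, gain_b, gain_ab)
```

A greedy learner that scores attributes individually sees no reason to pick either attribute here, even though together they predict the class perfectly.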
Relational reinforcement learning is presented, a learning technique that combines reinforcement learning with relational learning or inductive logic programming. Due to the use of a more expressive representation language to represent states, actions and Q-functions, relational reinforcement learning can be potentially applied to a new range of learning tasks. One such task that we investigate is planning in the blocks world, where it is assumed that the effects of the actions are unknown to the agent and the agent has to learn a policy. Within this simple domain we show that relational reinforcement learning solves some existing problems with reinforcement learning. In particular, relational reinforcement learning allows us to employ structural representations, to abstract from specific goals pursued and to exploit the results of previous learning phases when addressing new (more complex) situations.
Authors: Horváth, T; Turán, C
Affiliations: GMD AiS, Inst Autonomous Intelligent Syst, German Natl Res Ctr Informat Technol, D-53754 St Augustin, Germany; Univ Illinois, Dept Math Stat & Comp Sci, Chicago, IL 60607, USA; Hungarian Acad Sci, Res Grp Artificial Intelligence, Szeged, Hungary
The efficient learnability of restricted classes of logic programs is studied in the PAC framework of computational learning theory. We develop the product homomorphism method, which gives polynomial PAC learning algorithms for a nonrecursive Horn clause with function-free ground background knowledge, if the background knowledge satisfies some structural properties. The method is based on a characterization of the concept that corresponds to the relative least general generalization of a set of positive examples with respect to the background knowledge. The characterization is formulated in terms of products and homomorphisms. In the applications this characterization is turned into an explicit combinatorial description, which is then translated into the language of nonrecursive Horn clauses. We show that a nonrecursive Horn clause is polynomially PAC-learnable if there is a single binary background predicate and the ground atoms in the background knowledge form a forest. If the ground atoms in the background knowledge form a disjoint union of cycles then the situation is different, as the shortest consistent hypothesis may have exponential size. In this case polynomial PAC-learnability holds if a different representation language is used. We also consider the complexity of hypothesis finding for multiple clauses in some restricted cases. (C) 2001 Elsevier Science B.V. All rights reserved.
This paper presents a case study of a machine-aided knowledge discovery process within the general area of drug design. Within drug design, the particular problem of pharmacophore discovery is isolated, and the inductive logic programming (ILP) system PROGOL is applied to the problem of identifying potential pharmacophores for ACE inhibition. The case study reported in this paper supports four general lessons for machine learning and knowledge discovery, as well as more specific lessons for pharmacophore discovery, for inductive logic programming, and for ACE inhibition. The general lessons for machine learning and knowledge discovery are as follows. 1. An initial rediscovery step is a useful tool when approaching a new application domain. 2. General machine learning heuristics may fail to match the details of an application domain, but it may be possible to successfully apply a heuristic-based algorithm in spite of the mismatch. 3. A complete search for all plausible hypotheses can provide useful information to a user, although experimentation may be required to choose between competing hypotheses. 4. A declarative knowledge representation facilitates the development and debugging of background knowledge in collaboration with a domain expert, as well as the communication of final results.
ISBN: 3540423257 (print)
By analysing sequences of actions performed by a user, one can find frequent subsequences that can be suggested as macro (script) definitions. However, often these 'actions' have additional features. In this paper we combine an algorithm to detect frequent subsequences with an inductive logic programming system to automatically generate for each frequent subsequence the most specific 'template' for these additional features that is consistent with the observed frequent subsequences. The resulting system is implemented and used in an application where we automatically generate macros from logs of the use of a Unix command shell.
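As a rough illustration of the first stage described above, the sketch below counts contiguous command subsequences in a shell log and keeps those that recur. It is a minimal sketch under stated assumptions: the abstract does not specify the detection algorithm, so plain n-gram counting is substituted here, the log contents and the function name `frequent_subsequences` are hypothetical, and the inductive logic programming step that generalizes command arguments into templates is not shown.

```python
from collections import Counter

def frequent_subsequences(actions, min_len=2, max_len=4, min_count=2):
    """Return contiguous subsequences of `actions` seen at least `min_count` times.

    Assumption: simple n-gram counting stands in for the paper's
    (unspecified) frequent-subsequence algorithm.
    """
    counts = Counter()
    for n in range(min_len, max_len + 1):
        for i in range(len(actions) - n + 1):
            counts[tuple(actions[i:i + n])] += 1
    return {seq: c for seq, c in counts.items() if c >= min_count}

# Hypothetical log of shell command names (arguments omitted).
log = ["cd", "ls", "vi", "make", "cd", "ls", "vi", "make", "ls"]
macros = frequent_subsequences(log)

# List candidate macros, longest first.
for seq, count in sorted(macros.items(), key=lambda kv: -len(kv[0])):
    print(count, " ".join(seq))
```

Each recurring subsequence is a candidate macro; in the system described, the additional features of each action would then be generalized into the most specific consistent template.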
ISBN: 0780371011 (print)
In this paper, we present a learning simulator consisting of an interface, an inference engine and an inductive logic programming (ILP) system. Possible uses of the simulator include checking the behavior of CAI systems, serving as a novice agent in CAI systems, and letting a teacher check study content by observing the simulator's responses. The learning simulator learns interactively. First, a teacher (or a CAI system) gives background knowledge as basic rules and examples. Next, the teacher asks the simulator a question. Using the background knowledge, rules generated by ILP and examples stored in memory, the simulator answers the question. Then, after the teacher gives the correct answer, the simulator stores the examples and updates its rules for the next question. We implement the simulator using a Prolog interpreter and the ILP system FOIL. We show learning results obtained through computer simulations.
This paper demonstrates the capabilities of FOIDL, an inductive logic programming (ILP) system whose distinguishing characteristics are the ability to produce first-order decision lists, the use of an output completeness assumption as a substitute for negative examples, and the use of intensional background knowledge. The development of FOIDL was originally motivated by the problem of learning to generate the past tense of English verbs; however, this paper demonstrates its superior performance on two different sets of benchmark ILP problems. Tests on the finite element mesh design problem show that FOIDL's decision lists enable it to produce generally more accurate results than a range of methods previously applied to this problem. Tests with a selection of list-processing problems from Bratko's introductory Prolog text demonstrate that the combination of implicit negatives and intensionality allows FOIDL to learn correct programs from far fewer examples than FOIL.
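The idea of a decision list, an ordered set of rules in which the first applicable rule determines the output, can be sketched in a few lines of Python. This is only an analogy: FOIDL learns ordered first-order clauses, whereas the rules below are written by hand for illustration, and the names `RULES` and `past_tense` are hypothetical.

```python
# Ordered rules: specific exceptions come first, the general default last.
# Each rule is a (condition, transformation) pair; these are hand-written
# examples, not rules learned by FOIDL.
RULES = [
    (lambda v: v == "go", lambda v: "went"),         # irregular exception
    (lambda v: v.endswith("e"), lambda v: v + "d"),  # like -> liked
    (lambda v: True, lambda v: v + "ed"),            # default: walk -> walked
]

def past_tense(verb):
    """Apply the first rule whose condition holds (decision-list semantics)."""
    for applies, transform in RULES:
        if applies(verb):
            return transform(verb)
    raise ValueError(f"no rule applies to {verb!r}")

for verb in ["go", "like", "walk"]:
    print(verb, "->", past_tense(verb))
```

Because earlier rules shadow later ones, exceptions can be stated compactly without adding negated conditions to the default rule.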
This paper addresses an important application of machine learning (ML) in design. One of the major bottlenecks in engineering analysis with the finite-element method, the design of the finite-element mesh, was the subject of improvement. Defining an appropriate geometric mesh model that ensures low approximation errors and avoids unnecessary computational overhead is a very difficult and time-consuming task based mainly on the user's experience. A knowledge base for finite-element mesh design has been constructed using ML techniques. Ten mesh models have been used as a source of training examples. The mesh dataset was probably the first real-world relational dataset and became one of the most widely used training sets for experimenting with inductive logic programming (ILP) systems. After several experiments with different ML systems in the last few years, the ILP system CLAUDIEN was chosen to construct the rules for determining the appropriate mesh resolution values. ILP has been found to be an effective approach to the problem of mesh design. An evaluation of the resulting knowledge base shows that the mesh design patterns are captured well by the induced rules and represent a solid basis for practical application. The aim of this paper is not only to present this real-life ML application to design, but also to describe and discuss the relation of this work to the topic of this special issue: the proposed "dimensions" of ML in design.
We present a new method for discovering knowledge from structured data which are represented by graphs in the framework of inductive logic programming. A graph, or network, is widely used for representing relations between various data and expressing a small and easily understandable hypothesis. An analysis system that directly manipulates graphs is therefore useful for knowledge discovery. Our method uses Formal Graph System (FGS) as a knowledge representation language for graph structured data. FGS is a kind of logic programming system which directly deals with graphs just like first order terms. Our method employs a refutably inductive inference algorithm as a learning algorithm. A refutably inductive inference algorithm is a special type of inductive inference algorithm with refutability of hypothesis spaces, and is suitable for knowledge discovery. We give a sufficiently large hypothesis space, the set of weakly reducing FGS programs, and we show that this hypothesis space is refutably inferable from complete data. We have designed and implemented a prototype of a knowledge discovery system KD-FGS, which is based on our method and acquires knowledge directly from graph structured data. Finally we discuss the applicability of our method for graph structured data with experimental results on some graph theoretical notions.
Character recognition systems can contribute tremendously to the advancement of the automation process and can improve the interaction between man and machine in many applications, including office automation, check verification and a large variety of banking, business and data entry applications. The main theme of this paper is the automatic recognition of hand-printed Arabic characters using machine learning. Conventional methods have relied on hand-constructed dictionaries which are tedious to construct and difficult to make tolerant to variation in writing styles. The advantages of machine learning are that it can generalize over the large degree of variation between writing styles and that recognition rules can be constructed by example. The system was tested on a sample of handwritten characters from several individuals whose writing ranged from acceptable to poor in quality, and the average correct recognition rate obtained using cross-validation was 89.65%.