ISBN (print): 9783642253232; 9783642253249
Inductive logic programming is a subfield of machine learning that uses first-order logic as a uniform representation for examples and hypotheses. In its core form, it deals with the problem of finding a hypothesis that covers all positive examples and excludes all negative examples. The coverage test and the method for obtaining a hypothesis from a given template have been efficiently implemented using constraint satisfaction techniques. In this paper we suggest a method for efficiently generating the template by remembering a history of generated templates and using this history when adding predicates to a new candidate template. This method significantly outperforms the existing method based on brute-force incremental extension of the template.
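For illustration, a minimal Python sketch of the consistency test described above; the `covers` callback and the toy hypothesis are placeholders, since the actual systems decide coverage via theta-subsumption encoded as a constraint satisfaction problem:

```python
def is_consistent(covers, hypothesis, positives, negatives):
    """Core ILP acceptance test: the hypothesis must cover every
    positive example and exclude every negative one."""
    return (all(covers(hypothesis, e) for e in positives)
            and not any(covers(hypothesis, e) for e in negatives))

# Toy usage: the "hypothesis" is a predicate over integers and coverage
# is plain evaluation (a stand-in for the CSP-based coverage test).
covers = lambda h, e: h(e)
even = lambda n: n % 2 == 0
print(is_consistent(covers, even, positives=[0, 2, 4], negatives=[1, 3]))  # True
```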
Meta-Interpretive Learning (MIL) learns logic programs from examples by instantiating meta-rules; it is implemented by the Prolog-based Metagol system. Viewing MIL problems as combinatorial search problems, they can alternatively be solved with Answer Set Programming (ASP), which may yield performance gains thanks to efficient conflict propagation. However, a straightforward ASP encoding of MIL results in a huge search space due to the lack of procedural bias and the need for grounding. To address these challenges, we encode MIL in the HEX formalism, an extension of ASP that allows us to outsource the background knowledge, and we restrict the search space to compensate for the missing procedural bias of ASP. In this way, the import of constants from the background knowledge can, for a given type of meta-rule, be limited to the relevant ones. Moreover, by abstracting from term manipulations in the encoding and exploiting the HEX interface mechanism, the import of such constants can be avoided entirely, which mitigates the grounding bottleneck. An experimental evaluation shows promising results.
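For illustration, a hedged Python sketch of what instantiating a single chain meta-rule against ground background facts looks like; the predicate names and examples are invented, and this brute-force enumeration only stands in for the Metagol/ASP/HEX machinery discussed above:

```python
from itertools import product

# Background knowledge as ground facts of binary predicates.
background = {
    "parent": {("ann", "bob"), ("bob", "carl")},
}

def chain_instances(target, background, positives):
    """Enumerate instantiations of the chain meta-rule
       target(A,B) :- Q(A,C), R(C,B)
    over the background predicates and keep those that derive
    every positive example (a toy stand-in for the MIL search)."""
    preds = list(background)
    for q, r in product(preds, repeat=2):
        derived = {(a, b)
                   for (a, c1) in background[q]
                   for (c2, b) in background[r] if c1 == c2}
        if set(positives) <= derived:
            yield f"{target}(A,B) :- {q}(A,C), {r}(C,B)"

print(list(chain_instances("grandparent", background,
                           positives=[("ann", "carl")])))
```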
ISBN (print): 9781538637821
As we know, table data is a popular data form in industry and scientific research. However, the original table data sometimes cannot meet updating requirements in real applications, so it needs to be converted into the required form. In this paper we propose an approach to learn the transformation rules that convert original table data into a target form. Based on inductive logic programming (ILP), we design a learning system called Table Transformation Rule Learner (TTRL). It uses task-specific predicates and background knowledge to generate table transformation rules. We implement a dedicated heuristic function (HF) in TTRL to accelerate the search for rules, and we use semi-supervised learning (SSL) to obtain more information, especially from small sets of sample data. We also address problems such as over-generalization, which may occur when only positive training examples are available in the ILP learning process. We test our program on several kinds of table data, and the results show that the transformation rules can be learned correctly. Moreover, our search strategy greatly reduces the time cost of searching for rules.
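As a hedged illustration of the kind of rule TTRL is meant to learn, the Python sketch below applies one hand-written transformation (an unpivot) and checks it against a sample input/output pair; the column names and the `rule_fits` helper are hypothetical:

```python
def unpivot(rows, key_col, value_cols):
    """One concrete transformation rule: turn wide rows into
    (key, column, value) triples -- the kind of mapping a learned
    table transformation rule would express."""
    return [(row[key_col], col, row[col]) for row in rows for col in value_cols]

def rule_fits(rule, sample_in, sample_out):
    """Generate-and-test step: keep a candidate rule only if it
    reproduces the user's sample output exactly."""
    return rule(sample_in) == sample_out

wide = [{"name": "alloy-1", "2019": 3.1, "2020": 2.8}]
long = [("alloy-1", "2019", 3.1), ("alloy-1", "2020", 2.8)]
rule = lambda t: unpivot(t, "name", ["2019", "2020"])
print(rule_fits(rule, wide, long))   # True
```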
ISBN (digital): 9783319999609
ISBN (print): 9783319999609; 9783319999593
Learning from interpretation transition (LFIT) automatically constructs a model of the dynamics of a system from observations of its state transitions. So far, the systems that LFIT handles have been restricted to synchronous deterministic dynamics, i.e., all variables update their values at the same time and, for each state of the system, there is only one possible next state. However, other dynamics exist in the field of logical modeling, in particular the asynchronous semantics, which is widely used to model biological systems. In this paper, we focus on a method that learns the dynamics of a system independently of its semantics. For this purpose, we propose a modeling of multi-valued systems as logic programs in which a rule represents what can occur rather than what will occur. This modeling allows us to represent non-determinism and to propose an extension of LFIT in the form of a semantics-free algorithm that learns from discrete multi-valued transitions, regardless of their update schemes. We show through theoretical results that the synchronous, asynchronous and general semantics are all captured by this method. Practical evaluation is performed on randomly generated systems and on benchmarks from the biological literature to study the scalability of this new algorithm with respect to the three aforementioned semantics.
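A small Python sketch, under assumed toy rules, of the distinction the abstract draws: the rules describe what can occur next, and the synchronous and asynchronous semantics are applied on top of the same rule set (this only illustrates the representation, not the LFIT learning algorithm):

```python
from itertools import product

# Multi-valued rules read "var can take value val next if the body holds now".
rules = [
    ("a", 1, {"b": 1}), ("a", 0, {"b": 0}),
    ("b", 1, {"a": 0}), ("b", 0, {"a": 1}),
]

def candidates(state):
    """Per-variable sets of values that *can* occur next (semantics-free).
    Variables with no applicable rule keep their value (an assumption)."""
    out = {v: set() for v in state}
    for var, val, body in rules:
        if all(state[k] == x for k, x in body.items()):
            out[var].add(val)
    return {v: vals or {state[v]} for v, vals in out.items()}

def synchronous(state):
    cand = candidates(state)
    keys = list(cand)
    return [dict(zip(keys, vals)) for vals in product(*(cand[k] for k in keys))]

def asynchronous(state):
    cand = candidates(state)
    return [{**state, v: x} for v in cand for x in cand[v] if x != state[v]] or [state]

s = {"a": 0, "b": 0}
print(synchronous(s))    # [{'a': 0, 'b': 1}]
print(asynchronous(s))   # [{'a': 0, 'b': 1}]
```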
Preference learning (PL) plays an important role in machine learning research and practice. PL works with ordinal datasets, which are used frequently in areas such as behavioural science, medical science, education, psychology and social science. The aim of PL is to predict the preference for a new set of items based on the training data. In the application area of Recommender Systems (RSs), PL is an important element in producing good recommendations. Many ideas have been developed to build better recommendation techniques. One of the challenges in RSs is how to develop systems that are proactive and unobtrusive. To address this problem, we have studied the use of pairwise comparisons in preference elicitation as a very simple way of expressing preferences. Research in PL has also adopted this kind of representation and considers it to be learning from binary relations. This thesis makes three contributions. The first and most significant contribution is a new approach based on inductive logic programming (ILP) in a Description Logics (DL) representation to learn the order relation. The second contribution is a strategy based on Active Learning (AL) to support the inference process and make choices more informative for learning purposes. The third contribution is a recommender system algorithm based on the ILP-in-DL approach, implemented in a real-world recommender system with a large used-car dataset. The proposed approach has been evaluated using both offline and online experiments. The offline experiments were performed on two publicly available preference datasets, while the online experiment was conducted with 24 participants. In the offline experiments, the overall accuracy of our proposed approach outperformed three baseline algorithms: SVM, decision trees and Aleph. In the online experiment, the user study also showed satisfactory results, in which our proposed pairwise comparison interface in a recommender...
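For illustration only, the standard pairwise-comparison reduction to binary classification, sketched in Python with scikit-learn's SVC (one of the baselines mentioned above); this is not the thesis's ILP-in-DL approach, and the item features are invented:

```python
import numpy as np
from sklearn.svm import SVC

def pairwise_dataset(items, comparisons):
    """Standard reduction: each comparison "i is preferred to j" becomes
    the feature difference x_i - x_j with label +1, and the reversed
    difference with label -1."""
    X, y = [], []
    for i, j in comparisons:                    # (winner, loser) index pairs
        X += [items[i] - items[j], items[j] - items[i]]
        y += [1, -1]
    return np.array(X), np.array(y)

items = np.array([[5.0, 1.0], [3.0, 2.0], [1.0, 4.0]])
X, y = pairwise_dataset(items, comparisons=[(0, 1), (1, 2)])
clf = SVC(kernel="linear").fit(X, y)
print(clf.predict([items[0] - items[2]]))       # expect [1]: item 0 preferred
```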
ISBN (print): 9780769545967
Inductive logic programming (ILP) deals with the problem of finding a hypothesis that covers given positive examples and excludes negative examples. It is a subfield of machine learning that uses first-order logic as a uniform representation for examples and hypotheses. In this paper we propose a method to boost a given ILP learning algorithm by first decomposing the set of examples into subsets and applying the learning algorithm to each subset separately, then merging the hypotheses obtained for the subsets into a single hypothesis for the complete set of examples, and finally refining this single hypothesis to make it shorter.
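A minimal Python sketch of the decompose/merge/refine scheme described above, assuming an abstract base learner and treating a clause extensionally as the set of examples it covers (a simplification of the paper's first-order setting):

```python
def boost_ilp(learn, examples, k=3):
    """Sketch of the decompose/merge/refine scheme: `learn` stands in
    for any base ILP learner; a "clause" here is just the set of
    examples it covers."""
    subsets = [examples[i::k] for i in range(k)]               # 1) decompose
    merged = set().union(*(learn(s) for s in subsets))         # 2) merge hypotheses
    refined = set(merged)                                      # 3) refine: drop clauses
    for clause in list(merged):                                #    that are redundant
        rest = refined - {clause}
        if all(any(e in c for c in rest) for e in examples):
            refined = rest
    return refined

# Toy learner: one clause covering its subset plus one overly general clause.
learn = lambda subset: {frozenset(subset), frozenset(range(1, 7))}
print(boost_ilp(learn, examples=[1, 2, 3, 4, 5, 6]))   # redundant clauses removed
```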
Inductive logic programming is a discipline investigating the invention of clausal theories from observed examples: given evidence and background knowledge, we seek a hypothesis covering all positive examples and excluding all negative ones. In this thesis we extend existing work on template consistency to general consistency. We present a three-phase algorithm, DeMeR, which decomposes the original problem into smaller subtasks, merges all subsolutions into a complete solution, and finally refines the result to obtain a compact final hypothesis. Furthermore, we focus on how each individual subtask is solved and introduce a generate-and-test method based on a probabilistic history-driven approach for this purpose. We analyze each stage of the proposed algorithms and demonstrate its impact on runtime and hypothesis structure. In particular, we show that the first phase of the algorithm concentrates on solving the problem quickly at the cost of longer solutions, whereas the other phases refine these solutions into an admissible form. Finally, we show that our technique outperforms other algorithms by comparing its results for identifying common structures in random graphs to existing systems.
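A hedged Python sketch of a probabilistic history-driven generate-and-test loop of the kind described above; the predicate names and the `consistent` callback are placeholders for the thesis's template language and its CSP-based consistency check:

```python
import random
from collections import Counter

def history_driven_search(predicates, consistent, tries=1000, seed=0):
    """Generate-and-test with a probabilistic history bias: predicates
    that appeared in previously successful templates are sampled more
    often when the next candidate template is built."""
    rng = random.Random(seed)
    history = Counter({p: 1 for p in predicates})      # Laplace-style prior
    for _ in range(tries):
        size = rng.randint(1, len(predicates))
        weights = [history[p] for p in predicates]
        template = rng.choices(predicates, weights=weights, k=size)
        if consistent(template):
            history.update(template)                   # reinforce useful predicates
            yield template

# Toy usage: a "template" is consistent if it mentions both edge/2 and path/2.
ok = lambda t: {"edge/2", "path/2"} <= set(t)
print(next(history_driven_search(["edge/2", "path/2", "colour/2"], ok)))
```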
The theta-subsumption test is known to be a bottleneck in inductive logic programming, and the state-of-the-art learning systems in this field are hardly scalable. Last year, we created a distributed theta-subsumption process based on an Actor Model, with the aim of being able to decide subsumption on very large clauses. That model was correct and complete, but also very slow. We therefore introduce ANTS (Actor Network based Theta-Subsumption), a new model, also based on an actor network, that is significantly faster than the previous one.
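For reference, a plain sequential backtracking theta-subsumption test in Python; it only illustrates the decision problem that ANTS distributes over an actor network, and the literal encoding is an assumption:

```python
def subsumes(c, d, theta=None):
    """C theta-subsumes D iff some substitution theta makes every
    literal of C a literal of D.  Literals are tuples like
    ("edge", "X", "Y"); terms starting with an uppercase letter
    are variables."""
    theta = theta or {}
    if not c:
        return True
    (pred, *args), rest = c[0], c[1:]
    for lit in d:
        if lit[0] != pred or len(lit) != len(c[0]):
            continue
        trial = dict(theta)
        if all(_bind(a, b, trial) for a, b in zip(args, lit[1:])):
            if subsumes(rest, d, trial):
                return True
    return False

def _bind(term, const, theta):
    if term[0].isupper():                        # variable: bind or check binding
        return theta.setdefault(term, const) == const
    return term == const                         # constant: must match exactly

c = [("edge", "X", "Y"), ("edge", "Y", "X")]
d = [("edge", "a", "b"), ("edge", "b", "a"), ("red", "a")]
print(subsumes(c, d))   # True: theta = {X: a, Y: b}
```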
Theory Revision from Examples is the process of repairing incorrect theories and/or improving incomplete theories from a set of examples. This process usually results in more accurate and comprehensible theories than purely inductive learning. So far, however, progress on the use of theory revision techniques has been limited by the large search space they yield. In this article, we argue that it is possible to reduce the search space of a theory revision system by introducing stochastic local search. More precisely, we introduce a number of stochastic local search (SLS) components at the key steps of the revision process and implement them in a state-of-the-art revision system that uses the most specific clause to constrain the search space. We show that with these SLS techniques the revision system can be executed in feasible time while still improving the initial theory, and in a number of cases it even reaches better accuracies than the deterministic revision process. Moreover, in some cases the revision process can be faster and still achieve better accuracies than an ILP system learning from an empty initial hypothesis or assuming the initial theory to be correct.
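A generic Python sketch of a stochastic-local-search revision loop (not the article's system): with some probability it takes a random revision to escape local optima, otherwise it follows the best-scoring one; `revisions` and `score` are user-supplied placeholders:

```python
import random

def sls_revise(theory, revisions, score, steps=100, p_random=0.3, seed=1):
    """At each step either take a random revision of the current theory
    (random walk) or the best-scoring one (greedy), keeping the best
    theory seen so far."""
    rng = random.Random(seed)
    best = current = theory
    for _ in range(steps):
        candidates = revisions(current)           # e.g. add/delete a clause or literal
        if not candidates:
            break
        if rng.random() < p_random:
            current = rng.choice(candidates)      # escape local optima
        else:
            current = max(candidates, key=score)  # follow training accuracy
        if score(current) > score(best):
            best = current
    return best

# Toy usage: a "theory" is a set of clause ids; the target theory is {1, 2, 3}.
target = {1, 2, 3}
revs = lambda t: [t | {i} for i in range(5) if i not in t] + [t - {i} for i in t]
acc = lambda t: -len(t ^ target)
print(sls_revise(frozenset({0, 4}), revs, acc))   # typically frozenset({1, 2, 3})
```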
Background: Remote homology detection is a hard computational problem. Most approaches have trained computational models using either full protein sequences or multiple sequence alignments (MSA), including all positions. However, when dealing with proteins in the "twilight zone" we can observe that only some segments of the sequences (motifs) are conserved. We introduce a novel logical representation that allows us to represent physicochemical properties of sequences, conserved amino acid positions and conserved physicochemical positions in the MSA. From this, inductive logic programming (ILP) finds the most frequent patterns (motifs) and uses them to train propositional models such as decision trees and support vector machines (SVM). Results: We use the SCOP database to perform our experiments, evaluating protein recognition within the same superfamily. Our results show that our methodology, when using SVM, performs significantly better than some of the state-of-the-art methods and comparably to others. Moreover, our method provides a comprehensible set of logical rules that can help to understand what determines a protein's function. Conclusions: The strategy of selecting only the most frequent patterns is effective for remote homology detection. This is possible through a suitable first-order logical representation of homologous properties and through a set of frequent patterns, found by an ILP system, that summarizes the essential features of protein functions.
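For illustration, a Python sketch of the propositionalization step described in the conclusions: each motif becomes a binary feature and a scikit-learn SVM is trained on the resulting vectors; the motifs, sequences and labels are invented toy data, not the paper's first-order patterns:

```python
import re
import numpy as np
from sklearn.svm import SVC

def propositionalise(sequences, motifs):
    """Turn each sequence into a binary feature vector: feature j is 1
    iff motif j matches the sequence (the regex motifs stand in for the
    frequent first-order patterns an ILP system would discover)."""
    return np.array([[1 if re.search(m, s) else 0 for m in motifs]
                     for s in sequences])

motifs = [r"C..C", r"H.H", r"G[ST]"]                  # made-up regex-style motifs
train  = ["ACWWCGHAH", "MGSKLPW", "ACTTCGHKH", "MGTPLW"]
labels = [1, 0, 1, 0]                                  # 1 = same superfamily (toy)
clf = SVC(kernel="linear").fit(propositionalise(train, motifs), labels)
print(clf.predict(propositionalise(["QCAACGHWH"], motifs)))   # [1]
```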