In this paper, we present a modular methodology that combines state-of-the-art methods in (stochastic) machine learning with well-established methods in inductive logic programming (ILP) and rule induction to provide efficient and scalable algorithms for the classification of vast data sets. By construction, these classifications are based on the synthesis of simple rules, thus providing direct explanations of the obtained classifications. Apart from evaluating our approach on the common large-scale data sets MNIST, Fashion-MNIST and IMDB, we present novel results on explainable classifications of dental bills. The latter case study stems from an industrial collaboration with Allianz Private Krankenversicherung, an insurance company offering diverse services in Germany.
This article presents MP-SPILDL, a massively parallel inductive logic learner in description logic (DL). MP-SPILDL is a scalable inductive logic programming (ILP) algorithm that exploits existing Big Data infrastructure to perform large-scale inductive logic learning in DL (the ALCQI(D) DL language in particular). MP-SPILDL targets accelerating both hypothesis search and hypothesis evaluation by aggregating the computing power of multi-core CPUs, with their vector/SIMD instructions, and multiple GPUs in a Hadoop cluster. In terms of hypothesis search, MP-SPILDL employs a novel MapReduce-based algorithm that performs distributed parallel hypothesis search. MP-SPILDL also employs a novel MapReduce-based procedure that eliminates all redundant hypotheses generated after each learning iteration. Moreover, MP-SPILDL utilizes deterministic ordering of hypotheses' operands to avoid exploring redundant areas of the search space, similar to the DL-Learner, the state of the art in DL-based ILP literature. In terms of hypothesis evaluation, MP-SPILDL performs parallel hypothesis evaluation, which uses all CPU cores combined with their vector instructions and all GPUs of all machines in the Hadoop cluster. According to the experimental results using an Apache Spark implementation on a Hadoop cluster of three worker machines (36 total CPU cores, 7 total GPUs), MP-SPILDL achieved speedups of up to 13.3-fold using parallel beam search with beamWidth = 32 and CPU-based vectorized hypothesis evaluation, which was the best-case scenario. On small datasets such as Michalski's trains, MP-SPILDL was slower than the baseline, which was the worst-case scenario.
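The idea of using a deterministic ordering of operands to discard redundant hypotheses can be illustrated with a minimal single-machine sketch. It is not MP-SPILDL's MapReduce procedure: the encoding of a hypothesis as a tuple of concept names and the function names `canonical` and `eliminate_redundant` are assumptions made for illustration only.

```python
from typing import Iterable, List, Tuple

# Hypothetical encoding: a hypothesis is a conjunction of atomic concept names.
Hypothesis = Tuple[str, ...]


def canonical(operands: Iterable[str]) -> Hypothesis:
    """Sort and deduplicate operands so that equivalent conjunctions compare equal."""
    return tuple(sorted(set(operands)))


def eliminate_redundant(hypotheses: Iterable[Iterable[str]]) -> List[Hypothesis]:
    """Keep one canonical representative per conjunction, in first-seen order."""
    seen = set()
    unique: List[Hypothesis] = []
    for h in hypotheses:
        key = canonical(h)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


candidates = [("Train", "HasShortCar"), ("HasShortCar", "Train"), ("Train",)]
unique_candidates = eliminate_redundant(candidates)
```

Because conjunction is commutative, the first two candidates describe the same hypothesis, so only one of them survives. In a distributed setting the canonical form could serve as the grouping key, although the abstract does not say how the actual procedure is keyed.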
Visual reasoning is essential for building intelligent agents that understand the world and perform problem-solving beyond perception. Differentiable forward reasoning has been developed to integrate reasoning with gradient-based machine learning paradigms. However, because of their memory intensity, most existing approaches do not exploit the full expressivity of first-order logic and therefore lack a crucial ability: solving abstract visual reasoning tasks, where agents need to reason by using analogies on abstract concepts in different scenarios. To overcome this problem, we propose NEUro-symbolic Message-pAssiNg reasoNer (NEUMANN), which is a graph-based differentiable forward reasoner, passing messages in a memory-efficient manner and handling structured programs with functors. Moreover, we propose a computationally efficient structure learning algorithm to perform explanatory program induction on complex visual scenes. For evaluation, in addition to conventional visual reasoning tasks, we propose a new task, visual reasoning behind-the-scenes, where agents need to learn abstract programs and then answer queries by imagining scenes that are not observed. We empirically demonstrate that NEUMANN solves visual reasoning tasks efficiently, outperforming neural, symbolic, and neuro-symbolic baselines.
Understanding the effects of genetic variation on the phenotype of an individual is a major goal of biomedical research, especially for the development of diagnostics and effective therapeutic solutions. In this work, we describe the use of a recent knowledge discovery in databases (KDD) approach using inductive logic programming (ILP) to automatically extract knowledge about human monogenic diseases. We extracted background knowledge from MSV3d, a database of all human missense variants mapped to 3D protein structure. In this study, we identified 8,117 mutations in 805 proteins with known three-dimensional structures that were known to be involved in human monogenic disease. Our results help to improve our understanding of the relationships between structural, functional or evolutionary features and deleterious mutations. Our inferred rules can also be applied to predict the impact of any single amino acid replacement on the function of a protein. The interpretable rules are available at http://***/ kd4v/.
ISBN: 9781479929719 (print)
Extracting relevant information from text, and from web pages in particular, is an intensive and time-consuming task that requires substantial semantic resources. Thus, to be efficient, automatic information extraction systems have to exploit semantic resources (or ontologies) and employ machine-learning techniques to make them more adaptive. This paper presents an Ontology-based Information Extraction method using inductive logic programming that induces symbolic predicates, expressed in Horn clausal logic, that subsume information extraction rules. Such rules allow the system to extract class and relation instances from English corpora for ontology population purposes. Several experiments were conducted, and the preliminary results are promising, showing that the proposed approach improves on previous work in extracting instances of classes and relations, either separately or together.
Logic-based machine learning aims to learn general, interpretable knowledge in a data-efficient manner. However, labelled data must be specified in a structured logical form. To address this limitation, we propose a neural-symbolic learning framework, called Feed-Forward Neural-Symbolic Learner (FFNSL), that integrates a logic-based machine learning system capable of learning from noisy examples with neural networks, in order to learn interpretable knowledge from labelled unstructured data. We demonstrate the generality of FFNSL on four neural-symbolic classification problems, where different pre-trained neural network models and logic-based machine learning systems are integrated to learn interpretable knowledge from sequences of images. We evaluate the robustness of our framework by using images subject to distributional shifts, for which the pre-trained neural networks may predict incorrectly and with high confidence. We analyse the impact that these shifts have on the accuracy of the learned knowledge and run-time performance, comparing FFNSL to tree-based and pure neural approaches. Our experimental results show that FFNSL outperforms the baselines by learning more accurate and interpretable knowledge with fewer examples.
Autonomous robots are beginning to be integrated into human environments where explicit and implicit social norms guide the behavior of all agents. To assure safety and predictability, these artificial agents should act in accordance with the applicable social norms. However, it is not straightforward to define these rules and incorporate them in an agent's policy, particularly because social norms are often implicit and environment specific. In this paper, we propose a novel iterative approach to extract a set of rules from observed human trajectories. This hybrid method combines the strengths of inverse reinforcement learning and inductive logic programming. We experimentally show how our method successfully induces a compact logic program which represents the behavioral constraints applicable in a Tower of Hanoi and a traffic simulator environment. The induced program is adopted as prior knowledge by a model-free reinforcement learning agent to speed up training and prevent any social norm violation during exploration and deployment. Moreover, expressing norms as a logic program provides improved interpretability, which is an important pillar in the design of safe artificial agents, as well as transferability to similar environments.
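As a rough illustration of how a rule set can act as prior knowledge for a learning agent, the sketch below filters the moves available in a Tower of Hanoi state with a hand-written constraint. The constraint (never place a larger disk on a smaller one) is the standard rule of the puzzle and is only assumed here to resemble what an induced program might express. The state encoding and the function name `legal_moves` are hypothetical and not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical state encoding: each peg is a list of disk sizes, bottom to top.
State = List[List[int]]
Move = Tuple[int, int]  # (source peg index, destination peg index)


def legal_moves(pegs: State) -> List[Move]:
    """Return the moves that respect the constraint 'no larger disk on a smaller one'."""
    moves: List[Move] = []
    for i, src in enumerate(pegs):
        if not src:
            continue  # nothing to move from an empty peg
        for j, dst in enumerate(pegs):
            if i != j and (not dst or src[-1] < dst[-1]):
                moves.append((i, j))
    return moves


start_moves = legal_moves([[3, 2, 1], [], []])
spread_moves = legal_moves([[3], [2], [1]])
```

An agent that samples only from `legal_moves(state)` cannot violate the constraint while exploring, which is the sense in which such rules may both shrink the search space and prevent violations.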
Scientists form hypotheses and experimentally test them. If a hypothesis fails (is refuted), scientists try to explain the failure to eliminate other hypotheses. The more precise the failure analysis, the more hypotheses can be eliminated. Thus inspired, we introduce failure explanation techniques for inductive logic programming. Given a hypothesis represented as a logic program, we test it on examples. If a hypothesis fails, we explain the failure in terms of failing sub-programs. In case a positive example fails, we identify failing sub-programs at the granularity of literals. We introduce a failure explanation algorithm based on analysing branches of SLD-trees. We integrate a meta-interpreter-based implementation of this algorithm with the test stage of the Popper ILP system. We show that fine-grained failure analysis allows for learning fine-grained constraints on the hypothesis space. Our experimental results show that explaining failures can drastically reduce hypothesis space exploration and learning times.
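The notion of a failing sub-program at the granularity of literals can be illustrated with a heavily simplified sketch. It assumes a single clause whose body literals are plain Python tests on a ground example, so it involves no unification, backtracking, or SLD-tree analysis, and it is not the algorithm integrated with Popper. The names `failing_prefix` and `body` are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical encoding: a body literal is a (name, test) pair over a ground example.
Literal = Tuple[str, Callable[[int], bool]]


def failing_prefix(body: List[Literal], example: int) -> Optional[List[str]]:
    """Return the shortest prefix of body literals that already fails on the example.

    Returns None when every literal succeeds, i.e. the clause covers the example.
    """
    for i, (_, test) in enumerate(body):
        if not test(example):
            return [name for name, _ in body[: i + 1]]
    return None


body: List[Literal] = [
    ("even(X)", lambda x: x % 2 == 0),
    ("positive(X)", lambda x: x > 0),
    ("small(X)", lambda x: x < 10),
]

explanation = failing_prefix(body, 12)
```

Under these assumptions, any hypothesis containing the returned prefix would fail on the same example, which hints at why a more precise explanation lets a learner rule out more of the hypothesis space at once.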
ISBN: 9789899843400 (print)
This paper presents an approach to infer the UI patterns present in a web application. This reverse engineering process is performed in two steps. First, execution traces are collected from user interactions using the Selenium software. Second, the UI patterns within those traces are identified using machine learning inference with the Aleph ILP system. The paper describes and illustrates the proposed methodology with a case study on the Amazon web site.
A magic value in a program is a constant symbol that is essential for the execution of the program but has no clear explanation for its choice. Learning programs with magic values is difficult for existing program synthesis approaches. To overcome this limitation, we introduce an inductive logic programming approach to efficiently learn programs with magic values. Our experiments on diverse domains, including program synthesis, drug design, and game playing, show that our approach can (1) outperform existing approaches in terms of predictive accuracies and learning times, (2) learn magic values from infinite domains, such as the value of pi, and (3) scale to domains with millions of constant symbols.