An attempt is made to determine how machine learning benefits from using a richer and more expressive representation. The approach taken is to compare the performance of a so-called upgrade of a propositional algorithm with the performance of a propositional learning algorithm applied to a so-called propositionalization. The comparison is performed in learning domains that are essentially relational.
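To make the term concrete, the following is a minimal sketch of propositionalization: a one-to-many relation is flattened into one fixed-length attribute-value row per example, which any propositional learner can then consume. The molecule/atom data, the feature names (`has_*`, `count_*`), and the function `propositionalize` are hypothetical illustrations and are not taken from the paper.

```python
from collections import Counter

# Hypothetical relational data: each molecule has a variable number of atoms.
atoms = [
    ("m1", "c"), ("m1", "c"), ("m1", "o"),
    ("m2", "n"), ("m2", "c"),
    ("m3", "o"),
]

def propositionalize(atom_rows, elements=("c", "n", "o")):
    """Flatten the one-to-many atom relation into one fixed-length row per molecule."""
    counts = {}
    for mol, element in atom_rows:
        counts.setdefault(mol, Counter())[element] += 1
    table = {}
    for mol, c in counts.items():
        # Existential features: does the molecule contain at least one such atom?
        row = {f"has_{e}": c[e] > 0 for e in elements}
        # Aggregate features: how many such atoms does it contain?
        row.update({f"count_{e}": c[e] for e in elements})
        table[mol] = row
    return table

table = propositionalize(atoms)
for mol, row in sorted(table.items()):
    print(mol, row)
```

The flattening is lossy: the rows record which elements occur and how often, but not how the atoms are connected, which is the kind of information a relational learner can still exploit.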
Background: Remote homology detection is a hard computational problem. Most approaches have trained computational models by using either full protein sequences or multiple sequence alignments (MSA), including all positions. However, when we deal with proteins in the "twilight zone" we can observe that only some segments of sequences (motifs) are conserved. We introduce a novel logical representation that allows us to represent physicochemical properties of sequences, conserved amino acid positions and conserved physicochemical positions in the MSA. From this, inductive logic programming (ILP) finds the most frequent patterns (motifs) and uses them to train propositional models, such as decision trees and support vector machines (SVM). Results: We use the SCOP database to perform our experiments by evaluating protein recognition within the same superfamily. Our results show that our methodology, when using SVM, performs significantly better than some of the state-of-the-art methods and comparably to others. In addition, our method provides a comprehensible set of logical rules that can help to understand what determines a protein function. Conclusions: The strategy of selecting only the most frequent patterns is effective for remote homology detection. This is possible through a suitable first-order logical representation of homologous properties, and through a set of frequent patterns, found by an ILP system, that summarizes essential features of protein functions.
ISBN: (print) 9781509001644
Relation Extraction (RE) is the task of detecting semantic relations between entities in text. Most state-of-the-art RE systems rely on statistical machine learning techniques, which usually employ an attribute-value representation of features. In contrast to this trend, we focus on an alternative approach to RE based on the automatic induction of symbolic extraction rules. We present OntoILPER, an RE system based on inductive logic programming which uses a domain ontology in its extraction process. Several experiments over the reACE 2004/2005 reference corpora are discussed in this paper. The results are encouraging and seem to demonstrate the effectiveness of the proposed solution.
Various ways of abstraction in reinforcement learning methods have been proposed. The central idea is to make use of the inherent structure in the MDP itself. Most traditional techniques do not scale up to larger domains consisting of objects and relations. We present a proposal for abstract model building to construct relational Markov decision processes. This approach separates the structural induction of the representation from the actual value function estimation. First, a set of first-order features is induced using inductive logic programming. These are then used as input for a regression algorithm that estimates Q-value functions per action in the induced states and determines a policy. In this way we hope to improve the performance of standard Q-learning.
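The second stage described above can be sketched as follows, under loud assumptions: the abstract does not name the regression algorithm, so a linear model per action with a semi-gradient Q-learning update is swapped in here, and the two Boolean features are assumed to be the truth values of already-induced first-order features. The action names, feature meanings, and all function names are hypothetical.

```python
def q_value(weights, action, features):
    """Linear Q estimate for one action over Boolean first-order feature values."""
    return sum(w * f for w, f in zip(weights[action], features))

def greedy_action(weights, features):
    """Policy step: pick the action whose regressor predicts the highest Q-value."""
    return max(sorted(weights), key=lambda a: q_value(weights, a, features))

def q_update(weights, features, action, reward, next_features,
             alpha=0.1, gamma=0.9, terminal=False):
    """One semi-gradient Q-learning step on the weights of the chosen action."""
    target = reward
    if not terminal:
        target += gamma * max(q_value(weights, a, next_features) for a in weights)
    error = target - q_value(weights, action, features)
    weights[action] = [w + alpha * error * f
                       for w, f in zip(weights[action], features)]
    return error

# Assumed features, e.g. [clear(a), on(a, b)], evaluated in the current state.
weights = {"stack": [0.0, 0.0], "unstack": [0.0, 0.0]}
state = [1, 0]
td_error = q_update(weights, state, "stack", reward=1.0,
                    next_features=[0, 1], terminal=True)
print(td_error, weights, greedy_action(weights, state))
```

Because the features are relational abstractions rather than ground state identifiers, a weight learned in one state transfers to every state in which the same feature holds.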
Understanding the effects of genetic variation on the phenotype of an individual is a major goal of biomedical research, especially for the development of diagnostics and effective therapeutic *** this work, we propose a methodology using inductive logic programming (ILP) to automatically extract knowledge about deleterious/neutral mutations from a multi-relational database, named *** used 8117 mutations in 805 proteins with known three-dimensional structure in our *** using ILP for learning, we obtained classification rules that can be interpreted by a human expert and that help to improve our understanding of the relationships between physico-chemical and evolutionary features and deleterious *** experimental results, compared with state-of-the-art methods, show that the proposed approach can be applied to predict the impact of single amino acid replacement on the function of a *** rules and the estimated effect of human non-synonymous polymorphisms on the function of a protein are available at http://***/sm2ph/***.
Rule-based machine translation systems face several challenges in building the translation model in the form of transfer rules. Some of these problems require enormous human effort to state the rules and keep them consistent, because different human linguists write different rules for the same sentence, and a human linguist states rules to be understood by humans rather than by machines. The proposed translation model (from Arabic to English) tackles this problem of building the translation model. The model employs inductive logic programming (ILP) to learn the language model from a set of example pairs acquired from parallel corpora and represents it in a rule-based format that maps an Arabic sentence pattern to an English sentence pattern. When tested on a small data set, the model generated translation rules at a logarithmic growth rate and achieved a word error rate of 11%.
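For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance between the system output and a reference translation, divided by the number of reference words. The sketch below implements that standard definition; the abstract does not say how its 11% figure was computed, and the example sentences and the function name `word_error_rate` are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # reference word missing from output
                            curr[j - 1] + 1,      # extra word in output
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)

wer = word_error_rate("the boy went to school", "the boy goes to school")
print(f"{wer:.0%}")
```

Note that the value can exceed 100% when the output contains many more words than the reference.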
We present NrSample, a framework for program synthesis in inductive logic programming. NrSample uses propositional logic constraints to exclude undesirable candidates from the search. This is achieved by representing constraints as propositional formulae and solving the associated constraint satisfaction problem. We present a variety of such constraints: pruning, input-output, functional (arithmetic), and variable splitting. NrSample is also capable of detecting search space exhaustion, leading to further speedups in clause induction and optimality. We benchmark NrSample against enumeration search (Aleph's default) and Progol's A* search in the context of program synthesis. The results show that, on large program synthesis problems, NrSample induces between 1 and 1358 times faster than enumeration (236 times faster on average), always with similar or better accuracy. Compared to Progol A*, NrSample is 18 times faster on average with similar or better accuracy except for two problems: one in which Progol A* substantially sacrificed accuracy to induce faster, and one in which Progol A* was a clear winner. Functional constraints provide a speedup of up to 53 times (21 times on average) with similar or better accuracy. We also benchmark using a few concept learning (non-program synthesis) problems. The results indicate that without strong constraints, the overhead of solving constraints is not compensated for.
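The idea of excluding candidates through propositional constraints can be illustrated with a toy sketch. It assumes that each candidate clause is a subset of the literals of a bottom clause, with one Boolean variable per literal, and that constraints are written in CNF as lists of signed integers. It enumerates assignments by brute force instead of calling a constraint solver, so it is not NrSample's implementation; every function name and constraint here is hypothetical.

```python
from itertools import product

def satisfies(assignment, cnf):
    """True if the assignment (tuple of bools, variable i at index i-1) satisfies every clause."""
    return all(any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in cnf)

def candidates(n_literals, cnf):
    """All literal subsets of the bottom clause that the constraints still allow."""
    return [a for a in product([False, True], repeat=n_literals)
            if satisfies(a, cnf)]

def block_specializations(candidate):
    """Pruning constraint: forbid the candidate and every superset of its literals."""
    return [-(i + 1) for i, used in enumerate(candidate) if used]

cnf = []
print(len(candidates(3, cnf)))   # no constraints: all 2**3 subsets

# Assumed input-output constraint: literal 2 consumes a variable produced by literal 1.
cnf.append([-2, 1])
print(len(candidates(3, cnf)))

# Suppose the candidate {1, 3} covers too few positives: adding body literals
# cannot increase coverage, so all of its specializations are pruned as well.
cnf.append(block_specializations((True, False, True)))
remaining = candidates(3, cnf)
print(remaining)

# An empty clause is unsatisfiable, so an empty result signals search space exhaustion.
exhausted = candidates(3, cnf + [block_specializations((False, False, False))]) == []
```

In this encoding, each evaluated candidate contributes one short clause, so the constraint set grows linearly with the number of candidates tried while potentially ruling out exponentially many others.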
This paper provides a logical framework for comparing inductive capabilities among agents having different background theories. A background theory is called inductively equivalent to another background theory if the two theories induce the same hypotheses for any observation. Conditions of inductive equivalence change depending on the logic of the representation languages and the logic of induction or inductive logic programming (ILP). In this paper, we consider clausal logic and nonmonotonic logic programs as representation languages for background theories. We then investigate conditions of inductive equivalence in four different frameworks of induction: cautious induction, brave induction, learning from satisfiability, and descriptive induction. We observe that several induction algorithms in Horn ILP systems require weaker conditions of equivalence under restricted problem settings. We note that inductive equivalence can be used for verification and evaluation of induction algorithms, and discuss problems of optimizing background theories in ILP.
Sub-symbolic Machine Learning (ML) techniques, and specifically Neural Network-based ones, recently took over the research landscape, thanks to their efficiency and impressive effectiveness. On the other hand, the recent debate on ethics and AI and the first regulations on AI are progressively calling for anthropocentricity, which in turn requires explicit, human-understandable, and explainable approaches and representations that allow humans to be active parts in the loop. In these cases, logic-based approaches are more suitable. The inductive logic programming (ILP) branch of research in ML provides an answer to this need and a uniform and unifying framework for three relevant industrial and research concerns: management of databases, implementation of software systems, and modeling of human-like reasoning strategies. A particular ILP framework based on the Object Identity (OI) assumption was proposed in the 1990s, for which desirable theoretical and practical properties were demonstrated and working tools and systems that successfully approached real-world and classical problems in AI were developed. In an age when mainstream research and media seem to reduce AI and ML to just deep learning, this paper celebrates the 30th anniversary of OI by providing for the first time a comprehensive overview of the framework to be used as a reference for researchers still interested in investigating the ILP approach to ML.
The increasing level of autonomy of robots poses challenges of trust and social acceptance, especially in human-robot interaction scenarios. This requires an interpretable implementation of robotic cognitive capabilities, possibly based on formal methods such as logics for the definition of task specifications. However, prior knowledge is often unavailable in complex realistic scenarios. In this paper, we propose an offline algorithm based on inductive logic programming from noisy examples to extract task specifications (i.e., action preconditions, constraints and effects) directly from raw data of a few heterogeneous (i.e., not repetitive) robotic executions. Our algorithm leverages the output of any unsupervised action identification algorithm from video-kinematic recordings. Combining it with the definition of very basic, almost task-agnostic, commonsense concepts about the environment, which contribute to the interpretability of our methodology, we are able to learn logical axioms encoding preconditions of actions, as well as their effects in the event calculus paradigm. Since the quality of learned specifications depends mainly on the accuracy of the action identification algorithm, we also propose an online framework for incremental refinement of task knowledge from user feedback, guaranteeing safe execution. Results on a standard manipulation task and benchmark for user training in the safety-critical surgical robotic scenario show the robustness and the data- and time-efficiency of our methodology, with promising results towards scalability to more complex domains.