ISBN: (print) 9783030038403; 9783030038397
Noisy (uncertain, missing, or inconsistent) information, typical of many real-world domains, may dramatically affect the performance of logic-based Machine Learning. Multistrategy Learning approaches have tried to solve this problem by coupling inductive logic programming with other kinds of inference. While uncertainty has been tackled using probabilistic approaches, and abduction has been used to deal with missing data, inconsistency is still an open problem. From the Multistrategy Learning perspective, this paper proposes to attack this last kind of noise using (abstract) Argumentation, an inferential strategy aimed at handling conflicting information. More specifically, it defines a pre-processing operator based on abstract argumentation that can detect and remove noisy atoms from the observations before running the learning system on the polished data. Quantitative and qualitative experiments point out some strengths and weaknesses of the proposed approach, and suggest lines for future research on this topic.
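The abstract does not say which argumentation semantics the operator uses or how conflicts between atoms are derived. As a rough illustration only, the sketch below assumes grounded semantics and a hand-supplied attack relation; the names `grounded_extension` and `polish` are hypothetical and do not come from the paper. It treats each observed atom as an argument and keeps only the atoms that are accepted.

```python
def grounded_extension(arguments, attacks):
    """Return the grounded extension of the framework (arguments, attacks).

    `attacks` is a set of (attacker, target) pairs. Grounded semantics is an
    assumption made for this sketch, not necessarily the paper's choice.
    """
    attackers = {a: {x for (x, y) in attacks if y == a} for a in arguments}
    accepted = set()
    while True:
        # Arguments attacked by something already accepted.
        defeated = {y for (x, y) in attacks if x in accepted}
        # An argument is acceptable if all of its attackers are defeated.
        new = {a for a in arguments if attackers[a] <= defeated}
        if new == accepted:
            return accepted
        accepted = new


def polish(observations, conflicts):
    """Hypothetical pre-processing step: drop atoms that are not accepted."""
    kept = grounded_extension(observations, conflicts)
    return [o for o in observations if o in kept]


observations = ["a", "b", "c", "d", "e"]
conflicts = {("a", "b"), ("b", "c"), ("d", "e"), ("e", "d")}
polished = polish(observations, conflicts)
print(polished)
```

Here `a` is unattacked and defends `c` against `b`, so both survive, while the mutually attacking `d` and `e` are both dropped. Grounded semantics is deliberately sceptical; other semantics would keep different atoms.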
One of the main issues when using inductive logic programming (ILP) in practice remains the long running time that ILP systems need to induce a hypothesis. We explore the possibility of reducing the induction running times of systems that use asymmetric relative minimal generalisation (ARMG) by analysing the bottom clauses of the examples that serve as inputs to the generalisation operator. Using the fact that the ARMG covers all of the examples and that it is a subset of the variabilization of one of the examples, we identify literals that cannot appear in the ARMG and remove them prior to computing the generalisation. We apply this procedure to the ProGolem ILP system and test its performance on several real-world data sets. The experimental results show an average speedup over both the base ProGolem system and ProGolem extended with caching, in both cases without a decrease in the accuracy of the produced hypotheses. We also observe that the gain from using the proposed method varies greatly, depending on the structure of the data set.
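The abstract does not state the exact criterion used to identify removable literals. A minimal sketch of one simple necessary condition, under the assumption that the ARMG must map every one of its literals onto some literal of each example's bottom clause: a seed literal whose predicate symbol and arity are missing from any other bottom clause can never survive. The representation of literals as `(predicate, args)` pairs and the names `signature` and `prune_seed` are hypothetical, and this is not ProGolem's API.

```python
def signature(literal):
    """Predicate name and arity of a literal given as (predicate, args)."""
    pred, args = literal
    return (pred, len(args))


def prune_seed(seed, others):
    """Keep only seed literals whose signature occurs in every other clause.

    This is a simplified filter for illustration; the paper's analysis of
    bottom clauses may remove more literals than this check does.
    """
    common = {signature(lit) for lit in seed}
    for clause in others:
        common &= {signature(lit) for lit in clause}
    return [lit for lit in seed if signature(lit) in common]


seed = [("atom", ("A", "B")), ("bond", ("B", "C")), ("ring", ("A",))]
others = [
    [("atom", ("X", "Y")), ("bond", ("Y", "Z"))],
    [("atom", ("P", "Q")), ("bond", ("Q", "R")), ("ring", ("P",))],
]
pruned = prune_seed(seed, others)
print(pruned)
```

The `ring/1` literal is dropped because the first of the other bottom clauses has no `ring/1` literal, so no generalisation containing it could cover that example.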
Inductive logic programming (ILP) combines rule-based and statistical artificial intelligence methods by learning a hypothesis comprising a set of rules, given background knowledge and constraints for the search space. We focus on extending the XHAIL algorithm for ILP, which is based on Answer Set Programming, and we evaluate our extensions using the Natural Language Processing application of sentence chunking. With respect to processing natural language, ILP can cater for the constant change in how we use language on a daily basis. At the same time, ILP does not require huge amounts of training examples as other statistical methods do, and it produces interpretable results, that is, a set of rules that can be analysed and tweaked if necessary. As contributions we extend XHAIL with (i) a pruning mechanism within the hypothesis generalisation algorithm which enables learning from larger datasets, (ii) a better usage of modern solver technology using recently developed optimisation methods, and (iii) a time budget that permits the usage of suboptimal results. We evaluate these improvements on the task of sentence chunking using three datasets from a recent SemEval competition. Results show that our improvements allow for learning on bigger datasets with results that are of similar quality to state-of-the-art systems on the same task. Moreover, we compare the hypotheses obtained on datasets to gain insights on the structure of each dataset. (c) 2017 Elsevier Ltd. All rights reserved.
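The idea behind contribution (iii), accepting the best answer found when time runs out, can be sketched generically. The code below is not XHAIL's implementation or any solver's API: it assumes candidate hypotheses arrive one at a time from an iterable, and the function name `best_within_budget`, its parameters, and the injectable `clock` are all hypothetical choices made for illustration.

```python
import time


def best_within_budget(candidates, cost, budget_seconds, clock=time.monotonic):
    """Return the cheapest candidate seen before the time budget expires.

    At least one candidate is always examined, so a (possibly suboptimal)
    result is returned whenever `candidates` is non-empty.
    """
    deadline = clock() + budget_seconds
    best = None
    for cand in candidates:
        if best is None or cost(cand) < cost(best):
            best = cand
        if clock() >= deadline:
            break  # out of time: settle for the best result so far
    return best


# Deterministic demonstration with a fake clock that advances one unit per call.
ticks = iter(range(100))
result = best_within_budget([5, 3, 1, 0], cost=lambda c: c,
                            budget_seconds=2, clock=lambda: next(ticks))
print(result)
```

With the fake clock the budget runs out after the second candidate, so the search settles for cost 3 even though a candidate with cost 0 exists later in the stream.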
Three relevant areas of interest in symbolic Machine Learning are incremental supervised learning, multistrategy learning and predicate invention. In many real-world tasks, new observations may point out the inadequacy of the learned model. In such a case, incremental approaches allow the model to be adjusted instead of learning a new one from scratch. Specifically, when a negative example is wrongly classified by a model, specialization refinement operators are needed. A powerful way to specialize a theory in inductive logic programming is adding negated preconditions to concept definitions. This paper describes an empowered specialization operator that can introduce the negation of conjunctions of preconditions using predicate invention. An implementation of the operator is proposed, and experiments purposely devised to stress it show that the proposed approach is correct and viable even under quite complex conditions.
This study was performed to extract rules for reducing body fat mass so as to prevent lifestyle-related diseases. Lifestyle-related diseases have been increasing in Japan, even among younger people. Body fat mass is r...
ISBN: (print) 9781509001637
Relation Extraction (RE) is the task of detecting semantic relations between entities in text. Most state-of-the-art RE systems rely on statistical machine learning techniques, which usually employ an attribute-value representation of features. Contrary to this trend, we focus on an alternative approach to RE based on the automatic induction of symbolic extraction rules. We present OntoILPER, an RE system based on inductive logic programming which uses a domain ontology in its extraction process. Several experiments over the reACE 2004/2005 reference corpora are discussed in this paper. The results are encouraging and seem to demonstrate the effectiveness of the proposed solution.
ISBN: (print) 9781509001644
Relation Extraction (RE) is the task of detecting semantic relations between entities in text. Most state-of-the-art RE systems rely on statistical machine learning techniques, which usually employ an attribute-value representation of features. Contrary to this trend, we focus on an alternative approach to RE based on the automatic induction of symbolic extraction rules. We present OntoILPER, an RE system based on inductive logic programming which uses a domain ontology in its extraction process. Several experiments over the reACE 2004/2005 reference corpora are discussed in this paper. The results are encouraging and seem to demonstrate the effectiveness of the proposed solution.
This paper presents four novel approaches to enhance the efficiency and effectiveness of inductive logic programming (ILP) systems, along with their implementation in a new ILP system called TWEETY. The proposed approaches include (1) a new declaration mechanism, called connection declarations, for bottom clause construction, which is simpler but more expressive than the commonly used mode declarations; (2) a new covering technique, called super_covering, which reduces the examples in such a way that recursion can be learned independently of the ordering of the examples; (3) a new search heuristic, called the neg_coverage heuristic, which guides the search using only the number of negative examples covered by each hypothesis; and (4) a new search algorithm, called doubly_guided_search, which searches for the best clauses by alternating between two search heuristics, i.e. the traditional coverage heuristic and the new neg_coverage heuristic. The TWEETY system is shown to be more effective and efficient than the state-of-the-art ILP system ALEPH; the proposed techniques can be used to enhance the efficiency and effectiveness of ALEPH and other systems based on the same ILP principles.
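The abstract names the two heuristics but does not give their formulas or say exactly how the alternation is scheduled. The sketch below is therefore only one plausible reading, not TWEETY's algorithm: it assumes the traditional coverage score is positives minus negatives, that neg_coverage prefers fewer covered negatives, and that the two simply take turns on successive steps. The `Hyp` record, the toy refinement graph, and all function signatures are hypothetical.

```python
from collections import namedtuple

# Hypothetical hypothesis record: name plus counts of covered examples.
Hyp = namedtuple("Hyp", ["name", "pos", "neg"])


def coverage(h):
    """Assumed traditional score: positives covered minus negatives covered."""
    return h.pos - h.neg


def neg_coverage(h):
    """Assumed neg_coverage score: fewer covered negatives is better."""
    return -h.neg


def doubly_guided_search(start, refine, is_goal, max_steps=100):
    """Best-first search that alternates the heuristic on every step."""
    open_list = [start]
    heuristics = [coverage, neg_coverage]
    for step in range(max_steps):
        if not open_list:
            return None
        score = heuristics[step % 2]  # take turns between the two heuristics
        best = max(open_list, key=score)
        if is_goal(best):
            return best
        open_list.remove(best)
        open_list.extend(refine(best))
    return None


# Toy refinement graph standing in for clause specialisation.
hyps = {
    "h0": Hyp("h0", 5, 4), "h1": Hyp("h1", 5, 2),
    "h2": Hyp("h2", 3, 1), "h3": Hyp("h3", 4, 0),
}
children = {"h0": ["h1", "h2"], "h1": ["h3"], "h2": [], "h3": []}
found = doubly_guided_search(
    hyps["h0"],
    refine=lambda h: [hyps[c] for c in children[h.name]],
    is_goal=lambda h: h.neg == 0,
)
print(found)
```

In this toy run the neg_coverage step first expands `h2` (a dead end), after which the coverage step expands `h1` and the search reaches the consistent hypothesis `h3`.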
Pancreatic cancer is a devastating disease, and predicting the status of patients has become an important and urgent issue. The authors explore the applicability of the inductive logic programming (ILP) method to the disease and show that accumulated clinical laboratory data can be used to predict disease characteristics, which will contribute to the selection of therapeutic modalities for pancreatic cancer. The availability of a large amount of clinical laboratory data provides clues to aid in the knowledge discovery of diseases. In predicting the differentiation of tumour and the status of lymph node metastasis in pancreatic cancer using the ILP model, three rules are developed that are consistent with descriptions in the literature. The identified rules are useful for detecting the differentiation of tumour and the status of lymph node metastasis in pancreatic cancer, and therefore contribute significantly to decisions on therapeutic strategies. In addition, the proposed method is compared with other typical classification techniques, and the results further confirm the superiority and merit of the proposed method.
ISBN: (print) 9781479940752
Model transformation by example is a novel trend in model-driven software engineering. The rationale behind it is to utilize existing knowledge represented by source and target models of previously developed systems, such as requirements analysis and software design models, respectively. Such knowledge can be utilized to derive transformation rules to be applied in future system developments. To achieve this goal, machine learning techniques can assist in discovering and formalizing the desired transformation rules. Inductive logic programming (ILP) represents a highly applicable machine learning technique in this context. Given a set of examples and background knowledge encoded as a set of first-order logic descriptions, an ILP system attempts to derive rules describing different transformation steps in a purely declarative way. The induced rules follow the same logical description as the given examples and background knowledge. The objective of this work is to introduce an initial setup of an ILP system that can be utilized to derive analysis-design transformation rules from a set of examples that represent pairs of analysis-design models.