In real-life domains, learning systems often have to deal with various kinds of imperfections in data such as noise, incompleteness and inexactness. This problem seriously affects the knowledge discovery process, spec...
详细信息
In real-life domains, learning systems often have to deal with various kinds of imperfections in data such as noise, incompleteness and inexactness. This problem seriously affects the knowledge discovery process, specifically in the case of traditional Machine Learning approaches that exploit simple or constrained knowledge representations and are based on single inference mechanisms. Indeed, this limits their capability of discovering fundamental knowledge in those situations. In order to broaden the investigation and the applicability of machine learning schemes in such particular situations, it is necessary to move on to more expressive representations which require more complex inference mechanisms. However, the applicability of such new and complex inference mechanisms, such as abductive reasoning, strongly relies on a deep background knowledge about the specific application domain. This work aims at automatically discovering the meta-knowledge needed to abduction inference strategy to complete the incoming information in order to handle cases of missing knowledge.
The multiscalar architecture advocates a distributed processor organization and task-level speculation to exploit high degrees of instruction level parallelism (ILP) in sequential programs without impeding improvement...
详细信息
The multiscalar architecture advocates a distributed processor organization and task-level speculation to exploit high degrees of instruction level parallelism (ILP) in sequential programs without impeding improvements in clock speeds. The main goal of this paper is to understand the key implications of the architectural features of distributed processor organization and task-level speculation for compiler task selection from the point of view of performance. We identify the fundamental performance issues to be: control flow speculation, data communication, data dependence speculation, load imbalance, and task overhead. We show that these issues are intimately related to a few key characteristics of tasks: task size, intertask control flow, and intertask data dependence. We describe compiler heuristics to select tasks with favorable characteristics. We report experimental results to show that the heuristics are successful in boosting overall performance by establishing larger ILP windows. We also present a breakdown of execution times to show that register wait, load imbalance, control flow squash, and conventional pipeline losses are significant for almost all the SPEC95 benchmarks. (C) 1999 Academic Press.
The paper studies the learnability of Horn expressions within the framework of learning from entailment, where the goal is to exactly identify some pre-fixed and unknown expression by making queries to membership and ...
详细信息
The paper studies the learnability of Horn expressions within the framework of learning from entailment, where the goal is to exactly identify some pre-fixed and unknown expression by making queries to membership and equivalence oracles. It is shown that a class that includes both range restricted Horn expressions (where terms in the conclusion also appear in the condition of a Horn clause) and constrained Horn expressions (where terms in the condition also appear in the conclusion of a Horn clause) is learnable. This extends previous results by showing that a larger class is learnable with better complexity bounds. A further improvement in the number of queries is obtained when considering the class of Horn expressions with inequalities on all syntactically distinct terms. (C) 2002 Elsevier Science (USA).
We describe an inductive logic programming (ILP) approach called learning from failures. In this approach, an ILP system (the learner) decomposes the learning problem into three separate stages: generate, test, and co...
详细信息
We describe an inductive logic programming (ILP) approach called learning from failures. In this approach, an ILP system (the learner) decomposes the learning problem into three separate stages: generate, test, and constrain. In the generate stage, the learner generates a hypothesis (a logic program) that satisfies a set of hypothesis constraints (constraints on the syntactic form of hypotheses). In the test stage, the learner tests the hypothesis against training examples. A hypothesis fails when it does not entail all the positive examples or entails a negative example. If a hypothesis fails, then, in the constrain stage, the learner learns constraints from the failed hypothesis to prune the hypothesis space, i.e. to constrain subsequent hypothesis generation. For instance, if a hypothesis is too general (entails a negative example), the constraints prune generalisations of the hypothesis. If a hypothesis is too specific (does not entail all the positive examples), the constraints prune specialisations of the hypothesis. This loop repeats until either (i) the learner finds a hypothesis that entails all the positive and none of the negative examples, or (ii) there are no more hypotheses to test. We introduce Popper, an ILP system that implements this approach by combining answer set programming and Prolog. Popper supports infinite problem domains, reasoning about lists and numbers, learning textually minimal programs, and learning recursive programs. Our experimental results on three domains (toy game problems, robot strategies, and list transformations) show that (i) constraints drastically improve learning performance, and (ii) Popper can outperform existing ILP systems, both in terms of predictive accuracies and learning times.
This paper is a survey of inductive rule learning algorithms that use a separate-and-conquer strategy. This strategy can be traced back to the AQ learning system and still enjoys popularity as can be seen from its fre...
详细信息
This paper is a survey of inductive rule learning algorithms that use a separate-and-conquer strategy. This strategy can be traced back to the AQ learning system and still enjoys popularity as can be seen from its frequent use in inductive logic programming systems. We will put this wide variety of algorithms into a single framework and analyze them along three different dimensions, namely their search, language and overfitting avoidance biases.
Objective: Electronic health records (EHR) offer medical and pharmacogenomics research unprecedented opportunities to identify and classify patients at risk. EHRs are collections of highly inter-dependent records that...
详细信息
Objective: Electronic health records (EHR) offer medical and pharmacogenomics research unprecedented opportunities to identify and classify patients at risk. EHRs are collections of highly inter-dependent records that include biological, anatomical, physiological, and behavioral observations. They comprise a patient's clinical phenome, where each patient has thousands of date-stamped records distributed across many relational tables. Development of EHR computer-based phenotyping algorithms require time and medical insight from clinical experts, who most often can only review a small patient subset representative of the total EHR records, to identify phenotype features. In this research we evaluate whether relational machine learning (ML) using inductive logic programming (ILP) can contribute to addressing these issues as a viable approach for EHR-based phenotyping. Methods: Two relational learning ILP approaches and three well-known WEKA (Waikato Environment for Knowledge Analysis) implementations of non-relational approaches (PART, J48, and JRIP) were used to develop models for nine phenotypes. International Classification of Diseases, Ninth Revision (ICD-9) coded EHR data were used to select training cohorts for the development of each phenotypic model. Accuracy, precision, recall, F-Measure, and Area Under the Receiver Operating Characteristic (AUROC) curve statistics were measured for each phenotypic model based on independent manually verified test cohorts. A two-sided binomial distribution test (sign test) compared the five ML approaches across phenotypes for statistical significance. Results: We developed an approach to automatically label training examples using ICD-9 diagnosis codes for the ML approaches being evaluated. Nine phenotypic models for each ML approach were evaluated, resulting in better overall model performance in AUROC using ILP when compared to PART (p = 0.039), J48 (p = 0.003) and JRIP (p = 0.003). Discussion: ILP has the potential to impro
作者:
Divina, FTilburg Univ
Computat Linguist & AI Sect NL-5000 LE Tilburg Netherlands
This paper presents an overview of evolutionary approaches to inductive logic programming (ILP). After a short description of the two popular ILP systems FOIL and Progol, we focus on methods based on evolutionary algo...
详细信息
This paper presents an overview of evolutionary approaches to inductive logic programming (ILP). After a short description of the two popular ILP systems FOIL and Progol, we focus on methods based on evolutionary algorithms (EAs). Six systems are described and compared by means of the following aspects: search strategy, representation, hypothesis evaluation, search operators and biases adopted for limiting the hypothesis space. We discuss possible advantages and drawbacks related to the specific features of the systems along these aspects. Issues concerning the relative performance and efficiency of the systems are addressed.
Fuzzy Description logics (DLs) are logics that allow to deal with structured vague knowledge. Although a relatively important amount of work has been carried out in the last years concerning the use of fuzzy DLs as on...
详细信息
Fuzzy Description logics (DLs) are logics that allow to deal with structured vague knowledge. Although a relatively important amount of work has been carried out in the last years concerning the use of fuzzy DLs as ontology languages, the problem of automatically managing the evolution of fuzzy ontologies has received very little attention so far. We describe here a logic-based computational method for the automated induction of fuzzy ontology axioms which follows the machine learning approach of inductive logic programming. The potential usefulness of the method is illustrated by means of an example taken from the tourism application domain.
Learning from interpretation transition (LFIT) automatically constructs a model of the dynamics of a system from the observation of its state transitions. So far the systems that LFIT handled were mainly restricted to...
详细信息
Learning from interpretation transition (LFIT) automatically constructs a model of the dynamics of a system from the observation of its state transitions. So far the systems that LFIT handled were mainly restricted to synchronous deterministic dynamics. However, other dynamics exist in the field of logical modeling, in particular the asynchronous semantics which is widely used to model biological systems. In this paper, we propose a modeling of discrete memory-less multi-valued dynamic systems as logic programs in which a rule represents what can occur rather than what will occur. This modeling allows us to represent non-determinism and to propose an extension of LFIT to learn regardless of the update schemes, allowing to capture a large range of semantics. We also propose a second algorithm which is able to learn a whole system dynamics, including its semantics, in the form of a single propositional logic program with constraints. We show through theoretical results the correctness of our approaches. Practical evaluation is performed on benchmarks from biological literature.
The configurable nature of field-programmable gate arrays (FPGAs) has allowed designers to take advantage of various data flow characteristics in application kernels to create custom architecture implementations, by o...
详细信息
The configurable nature of field-programmable gate arrays (FPGAs) has allowed designers to take advantage of various data flow characteristics in application kernels to create custom architecture implementations, by optimising instruction level paralleism (ILP) and pipelining at the register transfer level. However, not all applications are composed of pure data flow kernels. Intermingling of control and data flows in applications offers more interesting challenges in creating custom architectures. The authors present one possible way to take advantage of correlations that may be present among data flow graphs (DFGs) embedded in control flow graphs. In certain cases, where there is sufficient correlation and ILP, the proposed context adaptable architecture (CAA) design methodology results in an interesting and useful custom architecture for such embedded DFGs. Certain other application characteristics may demand the use of alternative methodologies such as partial and dynamic reconfiguration (PDR) and a mixture of PDR and common sub-graph methods (PDR-CSG). The authors present a rigorous analysis, combined with some benchmarking efforts to showcase the differences, advantages and disadvantages of the CAA methodology with other methodologies. The authors also present an analysis of how the core underlying algorithm in our methodology compares with other published algorithms and the differences in resulting designs on an FPGA for a sample set of test cases.
暂无评论