This paper is a survey of inductive rule learning algorithms that use a separate-and-conquer strategy. This strategy can be traced back to the AQ learning system and still enjoys popularity as can be seen from its fre...
详细信息
This paper is a survey of inductive rule learning algorithms that use a separate-and-conquer strategy. This strategy can be traced back to the AQ learning system and still enjoys popularity as can be seen from its frequent use in inductive logic programming systems. We will put this wide variety of algorithms into a single framework and analyze them along three different dimensions, namely their search, language and overfitting avoidance biases.
Objective: Electronic health records (EHR) offer medical and pharmacogenomics research unprecedented opportunities to identify and classify patients at risk. EHRs are collections of highly inter-dependent records that...
详细信息
Objective: Electronic health records (EHR) offer medical and pharmacogenomics research unprecedented opportunities to identify and classify patients at risk. EHRs are collections of highly inter-dependent records that include biological, anatomical, physiological, and behavioral observations. They comprise a patient's clinical phenome, where each patient has thousands of date-stamped records distributed across many relational tables. Development of EHR computer-based phenotyping algorithms require time and medical insight from clinical experts, who most often can only review a small patient subset representative of the total EHR records, to identify phenotype features. In this research we evaluate whether relational machine learning (ML) using inductive logic programming (ILP) can contribute to addressing these issues as a viable approach for EHR-based phenotyping. Methods: Two relational learning ILP approaches and three well-known WEKA (Waikato Environment for Knowledge Analysis) implementations of non-relational approaches (PART, J48, and JRIP) were used to develop models for nine phenotypes. International Classification of Diseases, Ninth Revision (ICD-9) coded EHR data were used to select training cohorts for the development of each phenotypic model. Accuracy, precision, recall, F-Measure, and Area Under the Receiver Operating Characteristic (AUROC) curve statistics were measured for each phenotypic model based on independent manually verified test cohorts. A two-sided binomial distribution test (sign test) compared the five ML approaches across phenotypes for statistical significance. Results: We developed an approach to automatically label training examples using ICD-9 diagnosis codes for the ML approaches being evaluated. Nine phenotypic models for each ML approach were evaluated, resulting in better overall model performance in AUROC using ILP when compared to PART (p = 0.039), J48 (p = 0.003) and JRIP (p = 0.003). Discussion: ILP has the potential to impro
作者:
Divina, FTilburg Univ
Computat Linguist & AI Sect NL-5000 LE Tilburg Netherlands
This paper presents an overview of evolutionary approaches to inductive logic programming (ILP). After a short description of the two popular ILP systems FOIL and Progol, we focus on methods based on evolutionary algo...
详细信息
This paper presents an overview of evolutionary approaches to inductive logic programming (ILP). After a short description of the two popular ILP systems FOIL and Progol, we focus on methods based on evolutionary algorithms (EAs). Six systems are described and compared by means of the following aspects: search strategy, representation, hypothesis evaluation, search operators and biases adopted for limiting the hypothesis space. We discuss possible advantages and drawbacks related to the specific features of the systems along these aspects. Issues concerning the relative performance and efficiency of the systems are addressed.
Fuzzy Description logics (DLs) are logics that allow to deal with structured vague knowledge. Although a relatively important amount of work has been carried out in the last years concerning the use of fuzzy DLs as on...
详细信息
Fuzzy Description logics (DLs) are logics that allow to deal with structured vague knowledge. Although a relatively important amount of work has been carried out in the last years concerning the use of fuzzy DLs as ontology languages, the problem of automatically managing the evolution of fuzzy ontologies has received very little attention so far. We describe here a logic-based computational method for the automated induction of fuzzy ontology axioms which follows the machine learning approach of inductive logic programming. The potential usefulness of the method is illustrated by means of an example taken from the tourism application domain.
One of the most important issues in instruction-level parallelism (ILP) processors involves the boosting of instructions across conditional branches for speculative execution. A compiler scheduling technique named LES...
详细信息
One of the most important issues in instruction-level parallelism (ILP) processors involves the boosting of instructions across conditional branches for speculative execution. A compiler scheduling technique named LESS with a renaming function is proposed for the elimination of hazards that incorrectly overwrite a value when the branch is incorrectly predicted during speculative execution. The hardware implementation for this method is relatively simple and rather efficient. Simulation results show that the speedups achieved by LESS are better than other existing methods. For example, under the superscalar execution model, with an issue rate of 8, the average performance improvement by LESS can be expected to be 13% better than that of the CRF scheme, a solution reported recently with a scheduling skeleton similar to LESS.
Learning from interpretation transition (LFIT) automatically constructs a model of the dynamics of a system from the observation of its state transitions. So far the systems that LFIT handled were mainly restricted to...
详细信息
Learning from interpretation transition (LFIT) automatically constructs a model of the dynamics of a system from the observation of its state transitions. So far the systems that LFIT handled were mainly restricted to synchronous deterministic dynamics. However, other dynamics exist in the field of logical modeling, in particular the asynchronous semantics which is widely used to model biological systems. In this paper, we propose a modeling of discrete memory-less multi-valued dynamic systems as logic programs in which a rule represents what can occur rather than what will occur. This modeling allows us to represent non-determinism and to propose an extension of LFIT to learn regardless of the update schemes, allowing to capture a large range of semantics. We also propose a second algorithm which is able to learn a whole system dynamics, including its semantics, in the form of a single propositional logic program with constraints. We show through theoretical results the correctness of our approaches. Practical evaluation is performed on benchmarks from biological literature.
The management of business processes can support efficiency improvements in organizations. One of the most interesting problems is the mining and representation of process models in a declarative language. Various rec...
详细信息
The management of business processes can support efficiency improvements in organizations. One of the most interesting problems is the mining and representation of process models in a declarative language. Various recently proposed knowledge-based languages showed advantages over graph-based procedural notations. Moreover, rapid changes of the environment require organizations to check how compliant are new process instances with the deployed models. We present a Statistical Relational Learning approach to Workflow Mining that takes into account both flexibility and uncertainty in real environments. It performs automatic discovery of process models expressed in a probabilistic logic. It uses the existing DPML algorithm for extracting first-order logic constraints from process logs. The constraints are then translated into Markov logic to learn their weights. Inference on the resulting Markov logic model allows a probabilistic classification of test traces, by assigning them the probability of being compliant to the model. We applied this approach to three datasets and compared it with DPML alone, five Petri net- and EPC-based process mining algorithms and Tilde. The technique is able to better classify new execution traces, showing higher accuracy and areas under the PR/ROC curves in most cases.
The configurable nature of field-programmable gate arrays (FPGAs) has allowed designers to take advantage of various data flow characteristics in application kernels to create custom architecture implementations, by o...
详细信息
The configurable nature of field-programmable gate arrays (FPGAs) has allowed designers to take advantage of various data flow characteristics in application kernels to create custom architecture implementations, by optimising instruction level paralleism (ILP) and pipelining at the register transfer level. However, not all applications are composed of pure data flow kernels. Intermingling of control and data flows in applications offers more interesting challenges in creating custom architectures. The authors present one possible way to take advantage of correlations that may be present among data flow graphs (DFGs) embedded in control flow graphs. In certain cases, where there is sufficient correlation and ILP, the proposed context adaptable architecture (CAA) design methodology results in an interesting and useful custom architecture for such embedded DFGs. Certain other application characteristics may demand the use of alternative methodologies such as partial and dynamic reconfiguration (PDR) and a mixture of PDR and common sub-graph methods (PDR-CSG). The authors present a rigorous analysis, combined with some benchmarking efforts to showcase the differences, advantages and disadvantages of the CAA methodology with other methodologies. The authors also present an analysis of how the core underlying algorithm in our methodology compares with other published algorithms and the differences in resulting designs on an FPGA for a sample set of test cases.
The clausal discovery engine CLAUDIEN is presented. CLAUDIEN is an inductive logic programming engine that fits in the descriptive data mining paradigm. CLAUDIEN addresses characteristic induction from interpretations...
详细信息
The clausal discovery engine CLAUDIEN is presented. CLAUDIEN is an inductive logic programming engine that fits in the descriptive data mining paradigm. CLAUDIEN addresses characteristic induction from interpretations, a task which is related to existing formalisations of induction in logic. In characteristic induction from interpretations, the regularities are represented by clausal theories, and the data using Herbrand interpretations. Because CLAUDIEN uses clausal logic to represent hypotheses, the regularities induced typically involve multiple relations or predicates. CLAUDIEN also employs a novel declarative bias mechanism to define the set of clauses that may appear in a hypothesis.
An important ingredient in agent-mediated electronic commerce is the presence of intelligent mediating agents that assist electronic commerce participants (e.g. individual users, other agents, organisations). These me...
详细信息
An important ingredient in agent-mediated electronic commerce is the presence of intelligent mediating agents that assist electronic commerce participants (e.g. individual users, other agents, organisations). These mediating agents are in principle autonomous agents that interact with their environments (e.g. other agents and web-servers) oil behalf of participants who have delegated tasks to them. For mediating agents a (preference) model of participants is indispensable. In this paper, a generic mediating agent architecture is introduced. Furthermore, we discuss our view of user preference modelling and its need in agent-mediated electronic commerce. We survey the state of the art in the field of preference modelling and suggest that the preferences of electronic commerce participants can be modelled by learning from their behaviour. In particular, we employ an existing machine learning method called inductive logic programming (ILP). We argue that this method can be used by mediating agents to detect regularities in the behaviour of the involved participants and induce hypotheses about their preferences automatically. Finally, we discuss some advantages and disadvantages of using inductive logic programming as a method for learning user preferences and compare this method with other approaches. (c) 2005 Elsevier B.V. All rights reserved.
暂无评论