Authors:
Yamamoto, A
Hokkaido Univ, Div Elect & Informat Engn, Kita Ku, Sapporo, Hokkaido 0608628, Japan
Hokkaido Univ, Meme Media Lab, Kita Ku, Sapporo, Hokkaido 0608628, Japan
In this paper we revise Muggleton's theory of inverse entailment, which is the logical foundation of Progol, one of the most famous ILP systems. We first point out that the theory is incomplete in general. Secondly we prove that the theory is complete if the background knowledge given to the system is a ground reduced program, every training example is a ground unit clause, and the hypothesis space is the set of all definite clauses. The proof is obtained by showing that every ground reduced logic program is logically equivalent to the conjunction of all atoms in its least Herbrand model. As a corollary to this equivalence, we are finally able to improve the logical foundation of the GOLEM system.
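The least Herbrand model mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes a ground definite program given as (head, body) pairs of atom strings; the function name `least_herbrand_model` and the example program are hypothetical and do not come from the paper. The sketch only computes the model by fixpoint iteration; it does not check whether a program is reduced, and it is not the paper's proof.

```python
def least_herbrand_model(program):
    """Return the least Herbrand model of a ground definite program.

    `program` is a list of (head, body) pairs: `head` is a ground atom
    (a string) and `body` is a tuple of ground atoms. Facts have an
    empty body. This representation is an assumption of the sketch.
    """
    model = set()
    changed = True
    # Apply every clause repeatedly until no new atom can be derived.
    while changed:
        changed = False
        for head, body in program:
            if head not in model and all(atom in model for atom in body):
                model.add(head)
                changed = True
    return model


# Hypothetical example program (not taken from the paper).
program = [
    ("parent(ann,bob)", ()),
    ("parent(bob,cid)", ()),
    ("grandparent(ann,cid)", ("parent(ann,bob)", "parent(bob,cid)")),
    ("ancestor(cid,ann)", ("parent(cid,ann)",)),  # body is never satisfied
]
model = least_herbrand_model(program)
```

Here `model` contains the two `parent` facts and the derived `grandparent` atom; the last clause contributes nothing because its body atom is never derived.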
Instance based learning and clustering are popular methods in propositional machine learning. Both methods use a notion of similarity between objects. This dissertation investigates these methods in a relational setting. First, a number of new metrics are proposed. Next, these metrics are used to upgrade clustering and instance based learning to first order logic.
The bounded ILP-consistency problem for function-free Horn clauses is described as follows. Given sets E+ and E- of function-free ground Horn clauses and an integer k polynomial in |E+ ∪ E-|, does there exist a function-free Horn clause C with no more than k literals such that C subsumes each element in E+ and C does not subsume any element in E-? It is shown that this problem is Σ₂ᴾ-complete. We derive some related results on the complexity of ILP and discuss the usefulness of such complexity results.
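The consistency condition in the abstract above can be sketched in Python as a brute-force check for a single candidate clause. It tests whether a given clause C subsumes every positive example and no negative example; it does not search for C, which is where the hardness result applies. The literal encoding `(sign, predicate, args)`, the uppercase-variable convention, and the function names are assumptions of this sketch, not the paper's notation.

```python
from itertools import product


def is_var(term):
    # Assumption of this sketch: variables start with an uppercase letter.
    return term[0].isupper()


def subsumes(clause, ground_clause):
    """True if some substitution maps every literal of `clause` into `ground_clause`.

    A literal is (sign, predicate, args): sign is True for the head and
    False for a body literal; args is a tuple of strings.
    """
    variables = sorted({t for _, _, args in clause for t in args if is_var(t)})
    constants = sorted({t for _, _, args in ground_clause for t in args})
    target = set(ground_clause)
    # Try every assignment of the example's constants to the clause's variables.
    for values in product(constants, repeat=len(variables)):
        theta = dict(zip(variables, values))
        image = {(sign, pred, tuple(theta.get(t, t) for t in args))
                 for sign, pred, args in clause}
        if image <= target:
            return True
    return False


def consistent(clause, positives, negatives, k):
    """Check the bounded consistency condition for one candidate clause."""
    return (len(clause) <= k
            and all(subsumes(clause, e) for e in positives)
            and not any(subsumes(clause, e) for e in negatives))


# Hypothetical data: candidate C is p(X) :- q(X, Y).
C = [(True, "p", ("X",)), (False, "q", ("X", "Y"))]
e_pos = [(True, "p", ("a",)), (False, "q", ("a", "b")), (False, "r", ("b",))]
e_neg = [(True, "p", ("a",)), (False, "q", ("b", "a"))]
ok = consistent(C, [e_pos], [e_neg], k=2)
```

The enumeration of substitutions is exponential in the number of variables, so this is only suitable for small illustrative inputs.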
Conception, design, and implementation of cDNA microarray experiments present a variety of bioinformatics challenges for biologists and computational scientists. The multiple stages of data acquisition and analysis have motivated the design of Expresso, a system for microarray experiment management. Salient aspects of Expresso include support for clone replication and randomized placement; automatic gridding, extraction of expression data from each spot, and quality monitoring; flexible methods of combining data from individual spots into information about clones and functional categories; and the use of inductive logic programming for higher-level data analysis and mining. The development of Expresso is occurring in parallel with several generations of microarray experiments aimed at elucidating genomic responses to drought stress in loblolly pine seedlings. The current experimental design incorporates 384 pine cDNAs replicated and randomly placed in two specific microarray layouts. We describe the design of Expresso as well as results of analysis with Expresso that suggest the importance of molecular chaperones and membrane transport proteins in mechanisms conferring successful adaptation to long-term drought stress. Copyright (C) 2002 John Wiley & Sons, Ltd.
Isolated limb perfusion (ILP) is a well-established locoregional procedure to deliver high doses of cytostatics to an extremity with multiple in-transit lesions from cutaneous melanoma, with minimal systemic and mild local toxicity. This approach is quite sophisticated and requires accurate monitoring of systemic leakage and of the temperature of the affected limb in order to avoid major systemic and local side effects. Melphalan (L-PAM) is considered the reference drug, although complete responses are reported in only about 50% of patients. Since the early 1990s, tumor necrosis factor-alpha (TNF-alpha) has been administered with melphalan in ILP, aiming to improve the therapeutic index of this procedure. However, despite the impressive results reported, its role still remains controversial, seemingly confined to large tumor bulk. Fotemustine ILP was proposed as a less toxic alternative to L-PAM, after the results of a pilot experience claiming similar response rates with less local toxicity. A formal phase 1-2 study is now underway to confirm these findings. More straightforward procedures, such as isolated limb infusion, are appealing, as they seem capable of achieving good response rates, are easily repeatable, and are less costly. Larger series are required to validate such results. As potential agents to be delivered through ILP, new vasoactive drugs and agents with new mechanisms of action that interplay with chemotherapy, as well as virus-mediated gene therapy, are being developed.
When comparing inductive logic programming (ILP) and attribute-value learning techniques, there is a trade-off between expressive power and efficiency. Inductive logic programming techniques are typically more expressive but also less efficient. Therefore, the data sets handled by current inductive logic programming systems are small according to general standards within the data mining community. The main source of inefficiency lies in the assumption that several examples may be related to each other, so they cannot be handled independently. Within the learning from interpretations framework for inductive logic programming this assumption is unnecessary, which allows existing ILP algorithms to be scaled up. In this paper we explain this learning setting in the context of relational databases. We relate the setting to propositional data mining and to the classical ILP setting, and show that learning from interpretations corresponds to learning from multiple relations and thus extends the expressiveness of propositional learning, while maintaining its efficiency to a large extent (which is not the case in the classical ILP setting). As a case study, we present two alternative implementations of the ILP system TILDE (Top-down Induction of logical DEcision trees): TILDEclassic, which loads all data in main memory, and TILDELDS, which loads the examples one by one. We experimentally compare the implementations, showing that TILDELDS can handle large data sets (on the order of 100,000 examples or 100 MB) and indeed scales up linearly in the number of examples.
ISBN: 3540666451 (print)
The gRS-ILP model (generic Rough Set inductive logic programming model) provides a framework for inductive logic programming when the setting is imprecise and any induced logic program will not be able to distinguish between certain positive and negative examples. However, in this rough setting, where it is inherently not possible to describe the entire data with 100% accuracy, it is possible to definitively describe part of the data with 100% accuracy. The gRS-ILP model is extended in this paper to motifs in strings. An illustrative experiment is presented using the ILP system Progol on transmembrane domains in amino acid sequences.
In this paper we present a new method that uses data-flow coherence constraints in definite logic program generation. We outline three main advantages of these constraints supported by our results: i) drastically pruning the search space (around 90%), ii) reducing the set of positive examples and reducing or even removing the need for the set of negative examples, and iii) allowing the induction of predicates that are difficult or even impossible to generate by other methods. Besides these constraints, the approach takes into consideration the program termination condition for recursive predicates. The paper outlines some theoretical issues and implementation aspects of our system for automatic logic program induction.
Web mining refers to the process of discovering potentially useful and previously unknown information or knowledge from web data. A graph-based framework is used for classifying Web users based on their navigation patterns. GOLEM is a learning algorithm that uses the example space to restrict the solution search space. In this paper, this algorithm is modified for the graph-based framework. GOLEM is appropriate in this application where the solution search space is very large. An experimental illustration is presented.
Background: The inference of homology between proteins is a key problem in molecular biology. The current best approaches only identify ~50% of homologies (with a false positive rate set at 1/1000). Results: We present Homology Induction (HI), a new approach to inferring homology. HI uses machine learning to bootstrap from standard sequence similarity search methods. First a standard method is run, then HI learns rules which are true for sequences of high similarity to the target (assumed homologues) and not true for general sequences; these rules are then used to discriminate sequences in the twilight zone. To learn the rules HI describes the sequences in a novel way based on a bioinformatic knowledge base, and the machine learning method of inductive logic programming. To evaluate HI we used the PDB40D benchmark, which lists sequences of known homology but low sequence similarity. We compared the HI methodology with PSI-BLAST alone and found HI performed significantly better. In addition, Receiver Operating Characteristic (ROC) curve analysis showed that these improvements were robust for all reasonable error costs. The predictive homology rules learnt by HI can be interpreted biologically to provide insight into conserved features of homologous protein families. Conclusions: HI is a new technique for the detection of remote protein homology - a central bioinformatic problem. HI with PSI-BLAST is shown to outperform PSI-BLAST for all error costs. It is expected that similar improvements would be obtained using HI with any sequence similarity method.