In this paper we present a new method that uses data-flow coherence constraints in definite logic program generation. We outline three main advantages of these constraints supported by our results: i) drastically pruning the search space (around 90%), ii) reducing the set of positive examples and reducing or even removing the need for the set of negative examples, and iii) allowing the induction of predicates that are difficult or even impossible to generate by other methods. Besides these constraints, the approach takes into consideration the program termination condition for recursive predicates. The paper outlines some theoretical issues and implementation aspects of our system for automatic logic program induction.
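To make the data-flow idea concrete, here is a minimal sketch (not the paper's actual constraint set) of how a data-flow coherence check can prune candidate clause bodies during induction: a body is kept only if every input argument of every literal is bound by the head or by an earlier literal's output. The mode annotations and the reverse/append example are assumptions introduced purely for illustration.

```python
# Hypothetical sketch of a data-flow coherence filter (not the paper's
# actual constraints).  A literal is (predicate, [(var, mode), ...]),
# with mode '+' for an input argument and '-' for an output argument.

def dataflow_coherent(head_inputs, body):
    """Accept a candidate body only if every '+' argument of every body
    literal is bound by the head inputs or by a '-' argument of an
    earlier literal (left-to-right data flow)."""
    bound = set(head_inputs)
    for _pred, args in body:
        for var, mode in args:
            if mode == '+' and var not in bound:
                return False               # consumes an unbound variable
        for var, mode in args:
            if mode == '-':
                bound.add(var)             # the literal produces this variable
    return True

def prune(head_inputs, candidates):
    """Keep only the data-flow coherent candidate bodies."""
    return [b for b in candidates if dataflow_coherent(head_inputs, b)]

# Candidate bodies for a hypothetical rev(A, B), with A bound by the head.
candidates = [
    [('split',  [('A', '+'), ('H', '-'), ('T', '-')]),
     ('rev',    [('T', '+'), ('R', '-')]),
     ('append', [('R', '+'), ('H', '+'), ('B', '-')])],   # coherent
    [('append', [('R', '+'), ('H', '+'), ('B', '-')])],   # R and H never bound
]
print(len(prune(['A'], candidates)))   # -> 1
```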
Web mining refers to the process of discovering potentially useful and previously unknown information or knowledge from web data. A graph-based framework is used for classifying Web users based on their navigation patterns. GOLEM is a learning algorithm that uses the example space to restrict the solution search space. In this paper, this algorithm is modified for the graph-based framework. GOLEM is appropriate in this application where the solution search space is very large. An experimental illustration is presented.
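As an illustration of the graph-based representation (the paper's own encoding may differ), the sketch below turns an ordered navigation session into a directed graph of page transitions and emits it as Prolog-style ground facts of the kind a GOLEM-like relational learner consumes; the page names, the link/4 predicate, and the session data are invented for the example.

```python
# Illustrative only: encoding a user navigation session as a directed
# graph of page transitions, then as ground facts for a relational learner.

from collections import defaultdict

def session_to_graph(session):
    """Turn an ordered list of visited pages into (from, to) -> count edges."""
    edges = defaultdict(int)
    for src, dst in zip(session, session[1:]):
        edges[(src, dst)] += 1
    return dict(edges)

def to_facts(user, edges):
    """Emit Prolog-style ground facts (hypothetical link/4 predicate)."""
    return [f"link({user},{src},{dst},{n})." for (src, dst), n in edges.items()]

session = ["home", "catalog", "item42", "catalog", "item7", "cart"]
print("\n".join(to_facts("user1", session_to_graph(session))))
```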
Conception, design, and implementation of cDNA microarray experiments present a variety of bioinformatics challenges for biologists and computational scientists. The multiple stages of data acquisition and analysis have motivated the design of Expresso, a system for microarray experiment management. Salient aspects of Expresso include support for clone replication and randomized placement; automatic gridding, extraction of expression data from each spot, and quality monitoring; flexible methods of combining data from individual spots into information about clones and functional categories; and the use of inductive logic programming for higher-level data analysis and mining. The development of Expresso is occurring in parallel with several generations of microarray experiments aimed at elucidating genomic responses to drought stress in loblolly pine seedlings. The current experimental design incorporates 384 pine cDNAs replicated and randomly placed in two specific microarray layouts. We describe the design of Expresso as well as results of analysis with Expresso that suggest the importance of molecular chaperones and membrane transport proteins in mechanisms conferring successful adaptation to long-term drought stress. Copyright (C) 2002 John Wiley & Sons, Ltd.
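As a rough illustration of clone replication and randomized placement (not Expresso's actual layout code), the sketch below assigns each of 384 hypothetical clones two spots at random positions on a 32 x 24 grid; the grid dimensions and clone names are assumptions made for the example.

```python
# Illustrative sketch: random placement of replicated clones on a grid,
# so replicate spots of the same clone do not occupy fixed positions.

import random

def randomized_layout(clones, replicates, rows, cols, seed=0):
    """Assign each clone `replicates` random spots on a rows x cols grid."""
    spots = [(r, c) for r in range(rows) for c in range(cols)]
    assert len(spots) >= len(clones) * replicates, "grid too small"
    rng = random.Random(seed)
    rng.shuffle(spots)
    it = iter(spots)
    return {clone: [next(it) for _ in range(replicates)] for clone in clones}

# e.g. 384 clones, 2 replicates each, on a 32 x 24 grid (768 spots)
clones = [f"cDNA_{i:03d}" for i in range(384)]
layout = randomized_layout(clones, replicates=2, rows=32, cols=24)
print(layout["cDNA_000"])
```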
The traditional approach for estimating the performance of numerical methods is to combine an operation count with an asymptotic error analysis. This analytic approach gives a general feel for the comparative efficiency of methods, but it rarely leads to very precise results. It is now recognized that accurate performance evaluation can be made only with actual measurements on working software. Given that such an approach requires an enormous amount of performance data related to actual measurements, the development of novel approaches and systems that intelligently and efficiently analyze these data is of great importance to scientists and engineers. This paper presents new intelligent knowledge acquisition approaches and an integrated prototype system, which enables the automatic and systematic analysis of performance data. The system analyzes the performance data, which is usually stored in a database, with statistical and inductive learning techniques, and generates knowledge which can be incorporated into a knowledge base incrementally. We demonstrate the use of the system in the context of a case study, covering the analysis of numerical algorithms for the pricing of American vanilla options in a Black and Scholes modeling framework. We also present a qualitative and quantitative comparison of two techniques used for the automated knowledge acquisition phase. Although the system is presented with a particular pricing library in mind, the analysis and evaluation methodology can be used to study algorithms available from other libraries, as long as these libraries can provide the necessary performance data.
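The following hedged sketch conveys the flavour of the inductive-learning step, not the system's actual implementation: a decision tree is induced from a handful of synthetic timing measurements and printed as rules that could be added to a knowledge base incrementally. The feature names, method names, and timings are all invented for illustration, and scikit-learn stands in for whatever learner the system uses.

```python
# Hedged sketch: inducing a simple decision rule from (synthetic)
# performance measurements of two hypothetical pricing methods.

from sklearn.tree import DecisionTreeClassifier, export_text

# Each row: (grid_size, time_to_maturity); label: faster method on that run.
X = [[50, 0.25], [50, 1.0], [200, 0.25], [200, 1.0],
     [800, 0.25], [800, 1.0], [1600, 0.25], [1600, 1.0]]
y = ["binomial", "binomial", "binomial", "crank_nicolson",
     "crank_nicolson", "crank_nicolson", "crank_nicolson", "crank_nicolson"]

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
# The learned tree reads as if-then rules suitable for a knowledge base.
print(export_text(tree, feature_names=["grid_size", "maturity"]))
```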
This paper is devoted to the problem of learning to predict ordinal (i.e., ordered discrete) classes using classification and regression trees. We start with S-CART, a tree induction algorithm, and study various ways of transforming it into a learner for ordinal classification tasks. These algorithm variants are compared on a number of benchmark data sets to verify the relative strengths and weaknesses of the strategies and to study the trade-off between optimal categorical classification accuracy (hit rate) and minimum distance-based error. Preliminary results indicate that this is a promising avenue towards algorithms that combine aspects of classification and regression.
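To make the trade-off explicit, the sketch below computes both criteria on a toy ordinal problem: predictor A scores more exact hits, while predictor B makes smaller ordinal mistakes. The class labels and predictions are invented for illustration and are not drawn from the benchmark data sets.

```python
# Minimal sketch of the two evaluation criteria: categorical accuracy
# (hit rate) versus a distance-based error that respects class order.

def hit_rate(true, pred):
    return sum(t == p for t, p in zip(true, pred)) / len(true)

def mean_distance_error(true, pred, order):
    """Mean absolute difference of class ranks (e.g. low < medium < high)."""
    rank = {c: i for i, c in enumerate(order)}
    return sum(abs(rank[t] - rank[p]) for t, p in zip(true, pred)) / len(true)

order  = ["low", "medium", "high", "very_high"]
true   = ["low", "medium", "high", "very_high", "medium"]
pred_a = ["low", "medium", "high", "low", "medium"]      # 4/5 hits, one big miss
pred_b = ["medium", "medium", "high", "high", "medium"]  # 3/5 hits, small misses

for name, pred in [("A", pred_a), ("B", pred_b)]:
    print(name, hit_rate(true, pred), mean_distance_error(true, pred, order))
# A: hit rate 0.8, distance error 0.6;  B: hit rate 0.6, distance error 0.4
```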
This paper deals with learning first-order logic rules from data lacking an explicit classification predicate. Consequently, the learned rules are not restricted to predicate definitions as in supervised inductive logic programming. First-order logic offers the ability to deal with structured, multi-relational knowledge. Possible applications include first-order knowledge discovery, induction of integrity constraints in databases, multiple predicate learning, and learning mixed theories of predicate definitions and integrity constraints. One of the contributions of our work is a heuristic measure of confirmation, trading off novelty and satisfaction of the rule. The approach has been implemented in the Tertius system. The system performs an optimal best-first search, finding the k most confirmed hypotheses, and includes a non-redundant refinement operator to avoid duplicates in the search. Tertius can be adapted to many different domains by tuning its parameters, and it can deal either with individual-based representations by upgrading propositional representations to first-order, or with general logical rules. We describe a number of experiments demonstrating the feasibility and flexibility of our approach.
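The sketch below illustrates one simple confirmation-style heuristic for a rule A -> B, comparing the counterinstance rate expected under independence (novelty) with the observed rate (satisfaction); it is not necessarily the exact measure implemented in Tertius, and the counts used are invented.

```python
# Hedged sketch of a confirmation-style heuristic for a rule A -> B.

def confirmation(n, n_a, n_not_b, n_a_and_not_b):
    """n: total examples; n_a: examples satisfying the body A;
    n_not_b: examples violating the head B;
    n_a_and_not_b: observed counterinstances of A -> B."""
    expected = (n_a / n) * (n_not_b / n)   # counterinstance rate if A, B independent
    observed = n_a_and_not_b / n           # actual counterinstance rate
    return expected - observed             # positive: fewer violations than expected

# A rule with a frequently satisfied body and few violations scores higher:
print(confirmation(n=1000, n_a=400, n_not_b=300, n_a_and_not_b=20))   # 0.10
print(confirmation(n=1000, n_a=400, n_not_b=300, n_a_and_not_b=120))  # 0.00
```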
As a form of Machine Learning, the study of inductive logic programming (ILP) is motivated by a central belief: relational description languages are better (in terms of accuracy and understandability) than propositional ones for certain real-world applications. This claim is investigated here for a particular application in structural molecular biology, that of constructing readable descriptions of the major protein folds. To the authors' knowledge, Machine Learning has not previously been applied systematically to this task. In this application, the domain expert (third author) identified a natural divide between essentially propositional features and more structurally-oriented relational ones. The following null hypotheses are tested: 1) for a given ILP system (Progol) provision of relational background knowledge does not increase predictive accuracy, 2) a good propositional learning system (C5.0) without relational background knowledge will outperform Progol with relational background knowledge, 3) relational background knowledge does not produce improved explanatory insight. Null hypotheses 1) and 2) are both refuted on cross-validation results carried out over 20 of the most populated protein folds. Hypothesis 3) is refuted by the demonstration of several insightful rules that appear only among the relationally-oriented learned rules.
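As an illustration of how such null hypotheses can be assessed from cross-validation results (the paper's own statistical procedure may differ), the sketch below applies a paired sign test to per-fold accuracies of a relational and a propositional learner; the accuracy figures are invented, not the paper's results.

```python
# Illustrative only: paired sign test over per-fold accuracies; the
# numbers below are made up and do not reflect the reported experiments.

from math import comb

def sign_test_p(wins, losses):
    """Two-sided sign test p-value (ties already dropped)."""
    n, k = wins + losses, max(wins, losses)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

acc_relational    = [0.81, 0.74, 0.69, 0.77, 0.80, 0.72, 0.75, 0.78]
acc_propositional = [0.76, 0.70, 0.71, 0.72, 0.74, 0.70, 0.73, 0.71]

wins   = sum(a > b for a, b in zip(acc_relational, acc_propositional))
losses = sum(a < b for a, b in zip(acc_relational, acc_propositional))
print(wins, losses, sign_test_p(wins, losses))   # 7 1 ~0.07
```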
The similarity measures used in first-order IBL so far have been limited to the function-free case. In this paper we show that a lot of power can be gained by allowing lists and other terms in the input representation and designing similarity measures that work directly on these structures. We present an improved similarity measure for the first-order instance-based learner RIBL that employs the concept of edit distances to efficiently compute distances between lists and terms, discuss its computational and formal properties, and empirically demonstrate its additional power on a problem from the domain of biochemistry. The paper also includes a thorough reconstruction of RIBL's overall algorithm.
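A minimal sketch of the underlying idea, assuming plain lists of symbols: a Levenshtein-style edit distance is normalized into a similarity in [0, 1]. The real RIBL measure additionally handles nested terms and weighted element-level differences; the toy sequences below are invented.

```python
# Minimal sketch: list edit distance turned into a similarity score.

def edit_distance(xs, ys):
    """Levenshtein distance between two lists of symbols."""
    prev = list(range(len(ys) + 1))
    for i, x in enumerate(xs, 1):
        curr = [i]
        for j, y in enumerate(ys, 1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute
        prev = curr
    return prev[-1]

def similarity(xs, ys):
    """Normalize the distance into [0, 1] (1 = identical lists)."""
    longest = max(len(xs), len(ys)) or 1
    return 1.0 - edit_distance(xs, ys) / longest

# Toy amino-acid-like sequences, invented for the example.
print(similarity(list("MKVLA"), list("MKILA")))   # -> 0.8
```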
This paper presents a method for approximate match of first-order rules with unseen data. The method is especially useful for multi-class problems or noisy domains, where unseen data are often not covered by the rules. Our method employs the Backpropagation Neural Network for the approximation. To build the network, we propose a technique for generating features from the rules to be used as inputs to the network. Our method has been evaluated on four domains of first-order learning problems. The experimental results show that our method improves on the use of the original rules. We also applied our method to approximate match of propositional rules converted from an unpruned decision tree. In this case, our method can be thought of as soft-pruning of the decision tree. The results on multi-class learning domains in the UCI repository of machine learning databases show that our method performs better than standard C4.5's pruned and unpruned trees.
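The sketch below illustrates the feature-generation idea under simplifying assumptions: each rule is reduced to a Boolean predicate over a propositional example, rule coverage supplies one binary input feature per rule, and a small backpropagation network (scikit-learn's MLPClassifier, standing in for the paper's network) is trained on these features. The rules, attributes, and examples are toy stand-ins, not the paper's domains.

```python
# Hedged sketch: rule coverage as binary features for a backprop network.

from sklearn.neural_network import MLPClassifier

# Each "rule" is a predicate over an example (a dict of attributes).
rules = [
    lambda e: e["legs"] == 4 and e["fur"],
    lambda e: e["lays_eggs"] and not e["fur"],
    lambda e: e["legs"] == 2 and e["lays_eggs"],
]

def featurize(example):
    """One binary feature per rule: 1 if the rule covers the example."""
    return [int(rule(example)) for rule in rules]

examples = [
    ({"legs": 4, "fur": True,  "lays_eggs": False}, "mammal"),
    ({"legs": 2, "fur": False, "lays_eggs": True},  "bird"),
    ({"legs": 4, "fur": False, "lays_eggs": True},  "reptile"),
    ({"legs": 4, "fur": True,  "lays_eggs": True},  "mammal"),
]
X = [featurize(e) for e, _ in examples]
y = [label for _, label in examples]

net = MLPClassifier(hidden_layer_sizes=(4,), max_iter=2000, random_state=0).fit(X, y)
# An unseen example not exactly covered by a class-specific rule can
# still be matched approximately through the network.
print(net.predict([featurize({"legs": 4, "fur": True, "lays_eggs": True})]))
```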
Data mining techniques are becoming increasingly important in chemistry as databases become too large to examine manually. Data mining methods from the field of inductive logic programming (ILP) have potential advantages for structural chemical data. In this paper we present Warmr, the first ILP data mining algorithm to be applied to chemoinformatic data. We illustrate the value of Warmr by applying it to a well-studied database of chemical compounds tested for carcinogenicity in rodents. Data mining was used to find all frequent substructures in the database, and knowledge of these frequent substructures is shown to add value to the database. One use of the frequent substructures was to convert them into probabilistic prediction rules relating compound description to carcinogenesis. These rules were found to be accurate on test data, and to give some insight into the relationship between structure and activity in carcinogenesis. The substructures were also used to prove that there existed no accurate rule, based purely on atom-bond substructure with less than seven conditions, that could predict carcinogenicity. This result puts a lower bound on the complexity of the relationship between chemical structure and carcinogenicity. Only by using a data mining algorithm, and by doing a complete search, is it possible to prove such a result. Finally, the frequent substructures were shown to add value by increasing the accuracy of statistical and machine learning programs that were trained to predict chemical carcinogenicity. We conclude that Warmr, and ILP data mining methods generally, are an important new tool for analysing chemical databases.
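As a much-simplified illustration of the frequent-substructure idea, the sketch below treats each compound as a set of precomputed fragment labels, so frequent substructure discovery reduces to frequent-item counting; real Warmr instead performs a level-wise search over first-order queries. The compounds, fragment names, and support threshold are invented.

```python
# Hedged sketch: "frequent substructures" as frequent fragment labels,
# then as binary features for downstream carcinogenicity predictors.

from collections import Counter

compounds = {
    "c1": {"benzene_ring", "nitro_group", "amine"},
    "c2": {"benzene_ring", "nitro_group"},
    "c3": {"benzene_ring", "chloride"},
    "c4": {"epoxide", "amine"},
}

def frequent_fragments(compounds, min_support=0.5):
    """Fragments occurring in at least `min_support` of the compounds."""
    counts = Counter(f for frags in compounds.values() for f in frags)
    n = len(compounds)
    return {f for f, c in counts.items() if c / n >= min_support}

frequent = frequent_fragments(compounds)
# Each frequent fragment becomes one binary feature per compound.
features = {cid: {f: int(f in frags) for f in sorted(frequent)}
            for cid, frags in compounds.items()}
print(sorted(frequent))
print(features["c1"])
```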