Search results - Inner Mongolia University Library

8th European Conference on Principles and Practice of Knowledge Discovery in Databases, PKDD 2004

Authors: Hilario, Melanie; Mitchell, Alex; Kim, Jee-Hyub; Bradley, Paul; Attwood, Terri. Affiliations: Artificial Intelligence Laboratory, University of Geneva, Switzerland; European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, United Kingdom; School of Biological Sciences, University of Manchester, United Kingdom

ISBN: (print) 3540231080

Protein fingerprints are groups of conserved motifs which can be used as diagnostic signatures to identify and characterize collections of protein sequences. These fingerprints are stored in the PRINTS database after time-consuming annotation by domain experts, who must first of all determine the fingerprint type, i.e., whether a fingerprint depicts a protein family, superfamily or domain. To alleviate the annotation bottleneck, a system called PRECIS has been developed which automatically generates PRINTS records, provisionally stored in a supplement called prePRINTS. One limitation of PRECIS is that its classification heuristics, hand-coded by proteomics experts, often misclassify fingerprint type; their error rate has been estimated at 40%. This paper reports on an attempt to build more accurate classifiers based on information drawn from the fingerprints themselves and from the SWISS-PROT database. Extensive experimentation using 10-fold cross-validation led to the selection of a model combining the ReliefF feature selector with an SVM-RBF learner. The final model's error rate was estimated at 14.1% on a blind test set, representing a 26% accuracy gain over PRECIS' handcrafted rules. © Springer-Verlag Berlin Heidelberg 2004.

Keywords: Proteins

Learning characteristic rules relying on quantified paths

7th European Conference on Principles and Practice of Knowledge Discovery in Databases

Authors: Turmeaux, T; Salleb, A; Vrain, C; Cassard, D. Affiliations: Univ Orleans, LIFO, F-45067 Orleans 2, France; Bur Rech Geol & Minieres, Orleans 2, France

ISBN: (print) 3540200851

In this paper, we address the characterization task and present a general framework for characterizing a target set of objects by means not only of their own properties but also of the properties of objects linked to them. Depending on the kinds of objects, various links can be considered. For instance, in the case of relational databases, associations are the straightforward links between pairs of tables. We propose Caracterix, a new algorithm for mining characterization rules, and we show how it can be used on multi-relational and spatial databases.

Keywords: machine learning; inductive logic programming; data mining; characteristic rules; relational databases; spatial databases

Efficiently finding arbitrarily scaled patterns in massive time series databases

7th European Conference on Principles and Practice of Knowledge Discovery in Databases

Author: Keogh, E. Affiliation: Univ Calif Riverside, Dept Comp Sci & Engn, Riverside, CA 92521, USA

ISBN: (print) 3540200851

The problem of efficiently finding patterns in massive time series databases has attracted great interest and, at least for the Euclidean distance measure, may now be regarded as a solved problem. However, in recent years there has been an increasing awareness that Euclidean distance is inappropriate for many real-world applications. The limitations of Euclidean distance stem from the fact that it is very sensitive to distortions in the time axis. A partial solution to this problem, Dynamic Time Warping (DTW), aligns the time axis before calculating the Euclidean distance. However, DTW can only address the problem of local scaling. As we demonstrate in this work, uniform scaling may be just as important in many domains, including applications as diverse as bioinformatics, space telemetry monitoring and motion editing for computer animation. In this work, we demonstrate a novel technique to speed up similarity search under uniform scaling. As we will demonstrate, our technique is simple and intuitive, and can achieve a speedup of 2 to 3 orders of magnitude under realistic settings.

Keywords: Time series
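
To illustrate what distance under uniform scaling means in the abstract above, here is a minimal brute-force sketch in Python. It is not the speed-up technique the paper proposes. The function names, the nearest-index resampling rule, and the range of candidate prefix lengths are all assumptions made for illustration.

```python
import math


def rescale(series, length):
    """Stretch or shrink `series` to `length` points by nearest-index sampling."""
    n = len(series)
    return [series[(j * n) // length] for j in range(length)]


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def uniform_scaling_distance(query, candidate, min_len, max_len):
    """Brute force (hypothetical helper): try every prefix length of
    `candidate` in [min_len, max_len], rescale that prefix to the query
    length, and keep the smallest Euclidean distance found."""
    m = len(query)
    best_dist, best_len = math.inf, None
    for p in range(min_len, min(max_len, len(candidate)) + 1):
        d = euclidean(query, rescale(candidate[:p], m))
        if d < best_dist:
            best_dist, best_len = d, p
    return best_dist, best_len


# The candidate contains the query pattern stretched to twice its length.
query = [0, 1, 2, 3]
candidate = [0, 0, 1, 1, 2, 2, 3, 3, 9, 9]
print(uniform_scaling_distance(query, candidate, 4, 10))  # (0.0, 8)
```

Comparing the query with the first four points of the candidate directly gives a distance of sqrt(6), whereas allowing the prefix length to vary recovers an exact match. The cost of this baseline grows with the number of prefix lengths tried, which is the kind of overhead a faster search method would aim to avoid.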

6th European Conference on Principles and Practice of Knowledge Discovery in Databases, PKDD 2002

6th European Conference on Principles and Practice of Knowledge Discovery in Databases, PKDD 2002

ISBN: (print) 3540440372

The proceedings contain 43 papers. The special focus in this conference is on Principles of Data Mining and Knowledge Discovery. The topics include: optimized substructure discovery for semi-structured data; fast outlier detection in high dimensional spaces; fast algorithms for mining emerging patterns; on the discovery of weak periodicities in large time series; the need for low bias algorithms in classification learning from large data sets; mining all non-derivable frequent itemsets; iterative data squashing for boosting based on a distribution-sensitive distance; finding association rules with some very frequent attributes; self-aggregation in scaled principal component space; a classification approach for prediction of target events in temporal sequences; privacy-oriented data mining by proof checking; an empirical study of feature selection metrics for text classification; generating actionable knowledge by expert-guided subgroup discovery; multiscale comparison of temporal patterns in time-series medical databases; association rules for expressing gradual dependencies; support approximations using Bonferroni-type inequalities; comparing two-phase rule induction to cost-sensitive boosting; dependency detection in MobiMine and random matrices; long-term learning for web search engines; spatial subgroup mining integrated in an object-relational spatial database; involving aggregate functions in multi-relational search; information extraction in structured documents using tree automata induction; algebraic techniques for analysis of large discrete-valued datasets; geography of differences between two classes of data; rule induction for classification of gene expression array data; and iteratively selecting feature subsets for mining from high-dimensional databases.

5th European Conference on Principles of Knowledge Discovery in Databases

The Knowledge Engineering Review, 2003, vol. 17, no. 2, pp. 197-203

Author: Frans Coenen

1 Opening. PKDD 2001, the 5th European Conference on Principles of Knowledge Discovery in Databases (PKDD), was held in Freiburg, Baden-Württemberg, Germany, this year (Monday 3 to Thursday 7 September), co-located with the 12th European Conference on Machine Learning (ECML 2001). The proceedings comprised two volumes, one for PKDD (De Raedt & Siebes, 2001) and one for ECML (De Raedt & Flach, 2001), and form part of the Springer Lecture Notes in Artificial Intelligence (LNAI) series. The conference was held in the University buildings in the centre of the old town. Freiburg and the surrounding area were for many years part of the Austro-Hungarian Empire, and thus the university was described to us as being one of the oldest Austrian universities.

Principles of Data Mining and Knowledge Discovery, 6th

6th European Conference on Principles and Practice of Knowledge Discovery in Databases, PKDD 2002

Authors: Elomaa, Tapio; Mannila, Heikki; Toivonen, Hannu. Affiliation: University of Helsinki, Department of Computer Science, P.O. Box 26, Helsinki 00014, Finland

Principles of Data Mining and Knowledge Discovery - 6th European Conference, PKDD 2002, Proceedings

6th European Conference on Principles and Practice of Knowledge Discovery in Databases, PKDD 2002

ISBN: (print) 3540440372

The proceedings contain 41 papers. The topics discussed include: optimized substructure discovery for semi-structured data; fast outlier detection in high dimensional spaces; data mining in schizophrenia research - preliminary analysis; fast algorithms for mining emerging patterns; on the discovery of weak periodicities in large time series; the need for low bias algorithms in classification learning from large data sets; mining all non-derivable frequent itemsets; iterative data squashing for boosting based on a distribution-sensitive distance; finding association rules with some very frequent attributes; unsupervised learning: self-aggregation in scaled principal component space; a classification approach for prediction of target events in temporal sequences; privacy-oriented data mining by proof checking; and choose your words carefully: an empirical study of feature selection metrics for text classification.

A scalable constant-memory sampling algorithm for pattern discovery in large databases

6th European Conference on Principles and Practice of Knowledge Discovery in Databases, PKDD 2002

Authors: Scheffer, Tobias; Wrobel, Stefan. Affiliations: University of Magdeburg, FIN/IWS, P.O. Box 4120, Magdeburg 39016, Germany; FhG AiS, Schloß Birlinghoven, Sankt Augustin 53754, Germany; University of Bonn, Informatik III, Römerstr. 164, Bonn 53117, Germany

ISBN: (print) 3540440372

Many data mining tasks can be seen as an instance of the problem of finding the most interesting (according to some utility function) patterns in a large database. In recent years, significant progress has been achieved in scaling algorithms for this task to very large databases through the use of sequential sampling techniques. However, except for sampling-based greedy algorithms, which cannot give absolute quality guarantees, the scalability of existing approaches to this problem is only with respect to the data, not with respect to the size of the pattern space: it is universally assumed that the entire hypothesis space fits in main memory. In this paper, we describe how this class of algorithms can be extended to hypothesis spaces that do not fit in memory while maintaining the algorithms' precise ε-δ quality guarantees. We present a constant memory algorithm for this task and prove that it possesses the required properties. In an empirical comparison, we compare variable memory and constant memory sampling. © Springer-Verlag Berlin Heidelberg 2002.

Keywords: Database systems
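
As a rough illustration of what an ε-δ quality guarantee asks of a sampling algorithm, the sketch below uses the standard Hoeffding inequality for utilities bounded in [0, 1], with a union bound over a fixed number of hypotheses. This is not the constant-memory algorithm described in the abstract. The function names and the fixed-sample-size formulation are assumptions made for illustration.

```python
import math


def hoeffding_sample_size(epsilon, delta, num_hypotheses=1):
    """Number of samples so that, with probability at least 1 - delta, every
    one of `num_hypotheses` empirical utilities (each bounded in [0, 1]) lies
    within `epsilon` of its true value (Hoeffding bound plus union bound)."""
    per_hypothesis_delta = delta / num_hypotheses
    return math.ceil(math.log(2.0 / per_hypothesis_delta) / (2.0 * epsilon ** 2))


def hoeffding_radius(n, delta, num_hypotheses=1):
    """Confidence-interval half-width after `n` samples under the same
    assumptions as hoeffding_sample_size."""
    return math.sqrt(math.log(2.0 * num_hypotheses / delta) / (2.0 * n))


# One hypothesis, epsilon = 0.1, delta = 0.05
n = hoeffding_sample_size(0.1, 0.05)
print(n, hoeffding_radius(n, 0.05))

# The required sample size grows only logarithmically with the number of hypotheses.
print(hoeffding_sample_size(0.1, 0.05, num_hypotheses=100))
```

A sequential sampling method would typically check a bound of this kind repeatedly as samples arrive and stop early once the best hypotheses can be separated, which requires a more careful allocation of δ across the checks than this fixed-size sketch performs.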

Multiscale comparison of temporal patterns in time-series medical databases

6th European Conference on Principles and Practice of Knowledge Discovery in Databases, PKDD 2002

Authors: Hirano, Shoji; Tsumoto, Shusaku. Affiliation: Department of Medical Informatics, Shimane Medical University, School of Medicine, 89-1 Enya-cho, Izumo, Shimane 693-8501, Japan

ISBN: (print) 3540440372

This paper presents a method for analyzing time-series data on laboratory examinations based on phase-constraint multiscale matching and rough clustering. Multiscale matching compares two subsequences throughout various scales of view. It has the advantage of preserving connectivity of subsequences even if the subsequences are represented at different scales. Rough clustering groups objects not according to topographic measures such as the center or deviance of objects in a cluster, but according to the relative similarity and indiscernibility of objects. We use multiscale matching to obtain the similarity of sequences and rough clustering to cluster the sequences according to the obtained similarity. We slightly modified the dissimilarity measure in multiscale matching so that it suppresses excessive shift of phase that may cause incorrect matching of the sequences. Experimental results on the hepatitis dataset show that the proposed method successfully clustered similar sequences into an independent cluster, and that correspondences between subsequences are also successfully captured. © Springer-Verlag Berlin Heidelberg 2002.

Keywords: Data mining

Efficiently mining approximate models of associations in evolving databases

6th European Conference on Principles and Practice of Knowledge Discovery in Databases, PKDD 2002

Authors: Veloso, Adriano; Gusmão, Bruno; Meira, Wagner; Carvalho, Marcio; Parthasarathy, Srini; Zaki, Mohammed. Affiliations: Computer Science Department, Universidade Federal de Minas Gerais, Brazil; Department of Computer and Information Science, Ohio-State University, United States; Computer Science Department, Rensselaer Polytechnic Institute, United States

ISBN: (print) 3540440372

Much of the existing work in machine learning and data mining has relied on devising efficient techniques to build accurate models from the data. Research on how the accuracy of a model changes as a function of dynamic updates to the databases is very limited. In this work we show that extracting this information (knowing which aspects of the model are changing, and how they are changing as a function of data updates) can be very effective for interactive data mining purposes, where response time is often more important than model quality as long as model quality is not too far off the best (exact) model. In this paper we consider the problem of generating approximate models within the context of association mining, a key data mining task. We propose a new approach to incrementally generate approximate models of associations in evolving databases. Our approach is able to detect how patterns evolve over time (an interesting result in its own right), and uses this information in generating approximate models with high accuracy at a fraction of the cost (of generating the exact model). Extensive experimental evaluation on real databases demonstrates the effectiveness and advantages of the proposed approach. © Springer-Verlag Berlin Heidelberg 2002.

Keywords: Data mining
