ISBN: 0769524958 (print)
The proceedings contain 60 papers. The topics discussed include: predicting software suitability using a Bayesian belief network; parallel algorithm for control chart pattern recognition; data-centric automated data mining; a Bayesian kernel for the prediction of neuron properties from binary gene profiles; new filter-based feature selection criteria for identifying differentially expressed genes; a new clustering algorithm using message passing and its applications in analyzing microarray data; iterative weighting of phylogenetic profiles increases classification accuracy; integrating knowledge-driven and data-driven approaches for the derivation of clinical prediction rules; sparse classifiers for automated heart wall motion abnormality detection; segmenting brain tumors using alignment-based features; and the application of machine learning techniques to the prediction of erectile dysfunction.
Ranking is an important task in data mining and knowledge discovery. We propose a novel approach called the PECS algorithm to improve the overall ranking performance of a given ensemble. We formally analyse the sufficient...
Many applications track the movement of mobile objects, which can be represented as sequences of timestamped locations. Given such a spatio-temporal series, we study the problem of discovering sequential patterns, whi...
ISBN: 3540286535 (print)
According to the sizes of the attribute set and the information table, information tables are categorized into three types: Rough Set problems, pattern recognition/machine learning problems, and Statistical Model Identification problems. In the first situation, the Rough Set case, we observe the following: 1) the "granularity" should be chosen so as to divide equally the unseen tuples outside the information table, 2) the traditional sense of "Reduction" accords with this requirement, and 3) the "stable" subsets of tuples, which are defined through a "Galois connection" between the subset and the corresponding attribute subset, may play an important role in capturing some characteristics that can be read from the given information table. We show these points with some illustrative examples.
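The abstract does not spell out how the "stable" subsets are computed. As a rough illustration only, the sketch below assumes a formal-concept-style Galois connection: a subset of tuples maps to the attribute-value pairs they all share, and a set of pairs maps back to every tuple matching them. A subset is treated as stable when this round trip returns the same subset. The function names and the toy table are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Set, Tuple

Row = Dict[str, str]          # one tuple of the information table
Pair = Tuple[str, str]        # an (attribute, value) pair


def common_pairs(table: List[Row], rows: Set[int]) -> Set[Pair]:
    """Attribute-value pairs shared by every tuple whose index is in `rows`."""
    if not rows:
        # Convention assumed here: the empty subset shares every pair in the table.
        return {pair for r in table for pair in r.items()}
    indices = iter(rows)
    shared = set(table[next(indices)].items())
    for i in indices:
        shared &= set(table[i].items())
    return shared


def matching_rows(table: List[Row], pairs: Set[Pair]) -> Set[int]:
    """Indices of all tuples that contain every pair in `pairs`."""
    return {i for i, r in enumerate(table) if pairs <= set(r.items())}


def is_stable(table: List[Row], rows: Set[int]) -> bool:
    """True when mapping rows -> shared pairs -> rows gives back `rows`."""
    return matching_rows(table, common_pairs(table, rows)) == set(rows)


# Hypothetical toy information table (illustration only).
TABLE: List[Row] = [
    {"color": "red", "size": "small"},
    {"color": "red", "size": "large"},
    {"color": "blue", "size": "small"},
]
```

Under this assumed definition, the two red tuples form a stable subset, while tuples 1 and 2 share no attribute value, so their round trip expands to the whole table.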
ISBN: 0769522785 (print)
The problem of record linkage focuses on determining whether two object descriptions refer to the same underlying entity. Addressing this problem effectively has many practical applications, e.g., elimination of duplicate records in databases and citation matching for scholarly articles. In this paper we consider a new domain where the record linkage problem is manifested: Internet comparison shopping. We address the resulting linkage setting, which requires learning a similarity function between record pairs from streaming data. The learned similarity function is subsequently used in clustering to determine which records are co-referent and should be linked. We present an online machine learning method for addressing this problem, where a composite similarity function based on a linear combination of basis functions is learned incrementally. We illustrate the efficacy of this approach on several real-world datasets from an Internet comparison shopping site, and show that our method is able to effectively learn various distance functions for product data with differing characteristics. We also provide experimental results that show the importance of considering multiple performance measures in record linkage evaluation.
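To make the idea of an incrementally learned linear combination of basis functions concrete, here is a minimal sketch. The abstract does not name the update rule, so a plain perceptron step is assumed; the two basis functions (token overlap on the name, closeness of price), the record fields, and the sample offers are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Record = Dict[str, str]
Basis = Callable[[Record, Record], float]


def name_jaccard(a: Record, b: Record) -> float:
    """Hypothetical basis function: Jaccard overlap of name tokens."""
    ta, tb = set(a["name"].lower().split()), set(b["name"].lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def price_closeness(a: Record, b: Record) -> float:
    """Hypothetical basis function: 1 minus the relative price difference."""
    pa, pb = float(a["price"]), float(b["price"])
    hi = max(pa, pb)
    return 1.0 - abs(pa - pb) / hi if hi > 0 else 1.0


class OnlineSimilarity:
    """Weighted sum of basis similarities, updated one labeled pair at a time."""

    def __init__(self, bases: Sequence[Basis], lr: float = 0.5) -> None:
        self.bases = list(bases)
        self.weights = [0.0] * len(self.bases)
        self.bias = 0.0
        self.lr = lr

    def features(self, a: Record, b: Record) -> List[float]:
        return [f(a, b) for f in self.bases]

    def score(self, a: Record, b: Record) -> float:
        x = self.features(a, b)
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def update(self, a: Record, b: Record, same: bool) -> bool:
        """Assumed perceptron step: adjust weights only on a mistake."""
        y = 1.0 if same else -1.0
        x = self.features(a, b)
        s = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        if y * s > 0:
            return False
        self.weights = [w + self.lr * y * xi for w, xi in zip(self.weights, x)]
        self.bias += self.lr * y
        return True


# Hypothetical offers, replayed to simulate a stream of labeled pairs.
r1 = {"name": "Canon PowerShot A520 digital camera", "price": "199.99"}
r2 = {"name": "Canon A520 PowerShot camera", "price": "189.00"}
r3 = {"name": "Nikon Coolpix 4600 digital camera", "price": "149.00"}
stream: List[Tuple[Record, Record, bool]] = [(r1, r2, True), (r1, r3, False)] * 20

model = OnlineSimilarity([name_jaccard, price_closeness])
for left, right, label in stream:
    model.update(left, right, label)
```

After the replayed stream, `model.score` can be thresholded at zero or handed to a clustering step as the pairwise similarity, which is the role the abstract describes for the learned function.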
Spatial co-location patterns represent the subsets of features whose instances are frequently located together in geographic space. Co-location pattern discovery presents challenges since the instances of spatial feat...
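As a small illustration of what "frequently located together" can mean, the sketch below scores a pair of feature types by the participation index, a measure commonly used in the co-location literature. The truncated abstract does not say which measure the paper uses, so this choice, the distance threshold, and the sample points are assumptions.

```python
from math import hypot
from typing import List, Tuple

Instance = Tuple[str, float, float]  # (feature type, x, y)


def participation_index(
    instances: List[Instance], feat_a: str, feat_b: str, radius: float
) -> float:
    """Minimum, over both features, of the fraction of that feature's
    instances lying within `radius` of an instance of the other feature."""
    a_pts = [(x, y) for f, x, y in instances if f == feat_a]
    b_pts = [(x, y) for f, x, y in instances if f == feat_b]
    if not a_pts or not b_pts:
        return 0.0

    def near(p: Tuple[float, float], q: Tuple[float, float]) -> bool:
        return hypot(p[0] - q[0], p[1] - q[1]) <= radius

    a_in = sum(1 for p in a_pts if any(near(p, q) for q in b_pts))
    b_in = sum(1 for q in b_pts if any(near(p, q) for p in a_pts))
    return min(a_in / len(a_pts), b_in / len(b_pts))


# Hypothetical sample: two cafes, three banks, one bank far from any cafe.
POINTS: List[Instance] = [
    ("cafe", 0.0, 0.0),
    ("cafe", 10.0, 10.0),
    ("bank", 0.5, 0.0),
    ("bank", 10.0, 10.5),
    ("bank", 50.0, 50.0),
]
```

This brute-force version compares every pair of instances; it is meant only to show the definition, not an efficient discovery algorithm.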
We estimate the speed of texture change by measuring the spread of texture vectors in their feature space. This method allows us to robustly detect even very slow moving objects. By learning a normal amount of texture...
In this paper we address confidentiality issues in distributed data clustering, particularly the inference problem. We present a measure of inference risk as a function of reconstruction precision and number of collud...
Sequential pattern mining is an important data mining problem with broad applications, and it is also an interesting problem in virtual environments. In this paper, we propose a projection-based, sequential pa...