ISBN: 3540269231 (print)
By identifying characteristic regions in which classes are dense and also relevant for discrimination, a new, intuitive classification method is set up. This method produces a visualized result, giving the user insight into the data with respect to discrimination and making interpretation easy. Additionally, it outperforms decision trees in many situations and is robust against outliers and missing values.
In this work, we propose a novel method for mining frequent disjunctive patterns on a single data sequence. For this purpose, we introduce a sophisticated measure that satisfies anti-monotonicity, which allows us to discuss an efficient mining algorithm based on APRIORI. We discuss some experimental results.
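The role of anti-monotonicity in an APRIORI-style search can be illustrated with a deliberately simplified sketch. It does not implement the paper's disjunctive patterns or its measure, neither of which is specified in the abstract. As a stand-in assumption, patterns are contiguous substrings of one sequence and the measure is the plain occurrence count, which is anti-monotone because every occurrence of a pattern contains an occurrence of each shorter sub-pattern. The function name `frequent_substrings` and the parameter `min_count` are hypothetical.

```python
from collections import Counter


def frequent_substrings(sequence, min_count):
    """Level-wise (Apriori-style) search over contiguous patterns of a string.

    Assumption: the measure is the raw occurrence count, used here only
    because it is anti-monotone; it is not the measure from the paper.
    """
    frequent = {}
    # Level 1: count single symbols.
    counts = Counter(sequence[i:i + 1] for i in range(len(sequence)))
    current = {p: c for p, c in counts.items() if c >= min_count}
    length = 1
    while current:
        frequent.update(current)
        length += 1
        counts = Counter()
        for i in range(len(sequence) - length + 1):
            candidate = sequence[i:i + length]
            # Anti-monotone pruning: a candidate can only be frequent if
            # both of its sub-patterns one symbol shorter are frequent.
            if candidate[:-1] in current and candidate[1:] in current:
                counts[candidate] += 1
        current = {p: c for p, c in counts.items() if c >= min_count}
    return frequent


if __name__ == "__main__":
    print(frequent_substrings("abab", 2))
```

The pruning test is where the anti-monotone property pays off: candidates whose shorter sub-patterns already failed the threshold are never counted.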
Support Vector Machines have received considerable attention from the pattern recognition community in recent years. They have been applied to various classical recognition problems, achieving results comparable or even superior to those of classifiers such as neural networks. We investigate the application of Support Vector Machines (SVMs) to the problem of road recognition from remotely sensed images using edge-based features. We present very encouraging results from our experiments, which are comparable to those of decision tree and neural network classifiers.
This paper proposes an unsupervised algorithm for learning a finite Dirichlet mixture model. An important part of the unsupervised learning problem is determining the number of clusters that best describes the data. We consider here the application of the Minimum Message Length (MML) principle to determine the number of clusters. The model is compared with results obtained by other selection criteria (AIC, MDL, MMDL, PC and a Bayesian method). The proposed method is validated on synthetic data and on the summarization of a texture image database.
We present how supervised machine learning techniques can be used to predict quality characteristics in an important chemical engineering application: the wine distillate maturation process. A number of experiments have been conducted with six regression-based algorithms, in which the M5' algorithm proved to be the most appropriate for predicting the organoleptic properties of the matured wine distillates. The rules exported by the algorithm are as accurate as a human expert's decisions.
Fast control chart pattern recognition aids in instantaneous detection of abnormal functioning of a system. In this paper, we present a parallel algorithm for fast control chart pattern recognition. It addresses three...
Machine learning techniques are widely used in the analysis of biomedical datasets. Modern devices tend to produce voluminous, high-dimensional datasets for which medical practitioners require high-performance, user-f...
This paper is concerned with time series of graphs and proposes a novel scheme that is able to predict the presence or absence of nodes in a graph. The proposed scheme is based on decision trees that are induced from a training set of sample graphs. The work is motivated by applications in computer network monitoring. However, the proposed prediction method is generic and can be used in other applications as well. Experimental results with graphs derived from real computer networks indicate that a correct prediction rate of up to 97% can be achieved.
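One way to picture the setup is the minimal sketch below. The abstract does not say which features the trees use, so the encoding here is an assumption: each graph is reduced to its node set, the features are the presence flags of all nodes at time t, and the label is the presence of a target node at time t+1. A one-level decision tree (a stump) stands in for full tree induction. All function names are hypothetical.

```python
def build_examples(node_sets, universe, target):
    """Turn a time series of node sets into (features, label) pairs.

    Assumed encoding: features are presence flags at time t, the label is
    whether `target` is present at time t + 1.
    """
    examples = []
    for current, following in zip(node_sets, node_sets[1:]):
        features = tuple(node in current for node in universe)
        examples.append((features, target in following))
    return examples


def fit_stump(examples):
    """Fit a one-level decision tree: the single feature test with the
    fewest training errors. Returns (errors, feature_index, polarity)."""
    n_features = len(examples[0][0])
    best = None
    for i in range(n_features):
        for polarity in (True, False):
            errors = sum((f[i] == polarity) != label for f, label in examples)
            if best is None or errors < best[0]:
                best = (errors, i, polarity)
    return best


def predict(stump, features):
    """Predict presence of the target node at the next time step."""
    _, index, polarity = stump
    return features[index] == polarity


if __name__ == "__main__":
    universe = ["a", "b", "c"]
    series = [{"a", "b"}, {"a", "c"}, {"b", "c"}, {"a"}, {"c"}]
    stump = fit_stump(build_examples(series, universe, target="c"))
    print(stump, predict(stump, (True, False, False)))
```

In this toy series node "c" appears exactly when node "a" was present one step earlier, so the stump picks the presence of "a" as its test. A real tree learner would split recursively and could use richer features such as edge or degree information.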
In this paper we describe a new cluster model based on the concept of linear manifolds. The method identifies subsets of the data that are embedded in arbitrarily oriented lower-dimensional linear manifolds. Minimal subsets of points are repeatedly sampled to construct trial linear manifolds of various dimensions. Histograms of the distances of the points to each trial manifold are computed. The sampling corresponding to the histogram with the best separation between a mode near zero and the rest is selected, and the data points are partitioned on the basis of that separation. The repeated sampling then continues recursively on each block of the partitioned data. A broad evaluation of some hundred experiments over real and synthetic data sets demonstrates the general superiority of this algorithm over any of the competing algorithms in terms of stability, accuracy, and computation time.
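The sampling step can be sketched for the simplest case, a 1-dimensional manifold (a line) in 2-D data, where a minimal subset is two points. This is an illustration under stated assumptions, not the paper's algorithm: the separation score is replaced by a crude stand-in (the number of points in the first histogram bin), and the recursion over partitions is omitted. The function names and the parameters `bin_width`, `n_bins` and `n_trials` are hypothetical.

```python
import math
import random


def distance_to_line(p, a, b):
    """Perpendicular distance from 2-D point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        raise ValueError("a and b must be distinct points")
    return abs(dx * (py - ay) - dy * (px - ax)) / norm


def distance_histogram(points, a, b, bin_width, n_bins):
    """Histogram of point-to-line distances; the last bin absorbs overflow."""
    counts = [0] * n_bins
    for p in points:
        index = min(int(distance_to_line(p, a, b) / bin_width), n_bins - 1)
        counts[index] += 1
    return counts


def best_trial_line(points, n_trials, bin_width, n_bins, seed=0):
    """Sample point pairs and keep the trial line with the best score.

    Assumption: the score is the count in the bin nearest zero, a stand-in
    for the histogram separation criterion described in the abstract.
    """
    rng = random.Random(seed)
    best_score, best_line = -1, None
    for _ in range(n_trials):
        a, b = rng.sample(points, 2)
        if a == b:
            continue  # duplicate points cannot define a line
        score = distance_histogram(points, a, b, bin_width, n_bins)[0]
        if score > best_score:
            best_score, best_line = score, (a, b)
    return best_score, best_line


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (0.0, 5.0), (5.0, 0.0)]
    print(distance_histogram(pts, pts[0], pts[1], 1.0, 4))
    print(best_trial_line(pts, 200, 1.0, 4))
```

For the sample data, the histogram for the line through (0, 0) and (1, 1) puts the four collinear points in the bin nearest zero and the two remaining points in the overflow bin, which is the kind of gap between a near-zero mode and the rest that the method looks for.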
Research in protein structure and function is one of the most important subjects in modern bioinformatics and computational biology. It often uses advanced data mining and machine learning methodologies to perform prediction or pattern recognition tasks. This paper describes a new method for prediction of protein secondary structure content based on feature selection and multiple linear regression. The method develops a novel representation of primary protein sequences based on a large set of 495 features. The feature selection task, performed using a very large set of nearly 6,000 proteins, and tests performed on standard non-homologous protein sets confirm the high quality of the developed solution. The application of feature selection and the novel representation resulted in a 14-15% error rate reduction compared to results achieved with the standard representation. The prediction tests also show that a small set of 5-25 features is sufficient to achieve accurate prediction of both helix and strand content for non-homologous proteins.