ISBN (print): 0769528759
The support vector machine (SVM) is a machine learning method based on statistical learning theory that has been applied in the fault diagnosis field. After analyzing SVM pattern classification theory, this paper presents a hierarchical Fault Detection and Identification (FDI) system. Simulation results show that the method can effectively handle complex process characteristics and improve FDI model performance.
ISBN (print): 9783540770459
Data analysis methods and techniques are revisited in the case of biological data sets. Particular emphasis is given to clustering and mining issues. Clustering is still a subject of active research in several fields such as statistics, pattern recognition, and machine learning. Data mining adds to clustering the complications of very large data sets with many attributes of different types, which is a typical situation in biology. Some case studies are also described.
ISBN (digital): 9783540734994
ISBN (print): 9783540734987
This paper presents a data preprocessing procedure to select support vector (SV) candidates. We select decision boundary region vectors (BRVs) as SV candidates. Without the need to use the decision boundary, BRVs can be selected based on a vector's nearest neighbor of opposite class (NNO). To speed up the process, two spatial approximation sample hierarchical (SASH) trees are used for estimating the BRVs. Empirical results show that our data selection procedure can reduce a full dataset to the number of SVs or only slightly higher. Training with the selected subset gives performance comparable to that of the full dataset. For large datasets, overall time spent in selecting and training on the smaller dataset is significantly lower than the time used in training on the full dataset.
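The nearest-neighbor-of-opposite-class (NNO) idea in the abstract above can be illustrated with a minimal sketch. The abstract does not state the exact selection rule, so the rule used here (keep every point that is the NNO of at least one other point) is an assumption, and a brute-force search stands in for the SASH trees the paper uses for speed. The function names are hypothetical.

```python
from math import dist


def nearest_opposite(points, labels):
    """For each point, return the index of its nearest neighbor of the opposite class (NNO)."""
    nno = []
    for p, y in zip(points, labels):
        # Brute-force search over all points with a different label.
        candidates = [j for j, yj in enumerate(labels) if yj != y]
        nno.append(min(candidates, key=lambda j: dist(p, points[j])))
    return nno


def boundary_candidates(points, labels):
    """Assumed rule: keep every point that is the NNO of at least one other point."""
    return sorted(set(nearest_opposite(points, labels)))


# Two classes on a line; only the two points facing each other are kept.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
labels = [0, 0, 0, 1, 1, 1]
selected = boundary_candidates(points, labels)
print(selected)  # [2, 3]
```

The brute-force search costs quadratic time in the number of points, which is presumably why the paper turns to an approximate index for large datasets.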
ISBN (print): 0769528759
Clustering is an important tool for data analysis and has promising prospects in data mining, pattern recognition, and related fields. Usually, the objects in clustering analysis are vectors consisting of features, which may be represented as points in Euclidean space. However, in some tasks the objects may be abstract models rather than data points, for example neural networks, decision trees, or support vector machines. By defining an extended distance (in real tasks, distance can be defined in several different forms), a clustering method is studied for such abstract data objects. A framework for a clustering algorithm over model objects is presented. As an application, a method for improving the diversity of ensemble learning with neural networks is investigated. The relations among the number of clusters, the ensemble size, and the performance of ensemble learning are studied experimentally.
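As a rough illustration of clustering models rather than data points, the sketch below treats each model as a callable and uses the fraction of reference samples on which two models disagree as the distance. Both that distance and the threshold-based single-linkage grouping are assumptions chosen for brevity; the abstract does not specify which extended distance or clustering algorithm the paper uses.

```python
def disagreement(model_a, model_b, samples):
    """Assumed 'extended distance': fraction of samples on which two models disagree."""
    return sum(model_a(x) != model_b(x) for x in samples) / len(samples)


def cluster_models(models, samples, threshold):
    """Single-linkage grouping: models closer than `threshold` end up in one cluster."""
    parent = list(range(len(models)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(models)):
        for j in range(i + 1, len(models)):
            if disagreement(models[i], models[j], samples) < threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(models)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy "models": threshold classifiers on a scalar input (hypothetical).
models = [lambda x: x > 0.0, lambda x: x > 0.1, lambda x: x > 5.0, lambda x: x > 5.2]
samples = [i * 0.5 for i in range(-4, 16)]  # -2.0, -1.5, ..., 7.5
clusters = cluster_models(models, samples, threshold=0.2)
print(clusters)  # [[0, 1], [2, 3]]
```

One way to use such clusters for ensemble diversity would be to keep a single representative per cluster, for example `[group[0] for group in clusters]`, so that near-identical models do not dominate the ensemble.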
ISBN (digital): 9783540738718
ISBN (print): 9783540738701
Biometric data such as fingerprints are often highly structured and high-dimensional. The "curse of dimensionality" poses a great challenge to subsequent pattern recognition algorithms, including neural networks, because of the high computational complexity. A common approach is to apply dimensionality reduction (DR) to project the original data onto a lower-dimensional space that preserves most of the useful information. Recently, we proposed Twin Kernel Embedding (TKE), which processes structured or non-vectorial data directly without vectorization. Here, we apply this method to clustering and visualizing fingerprints in a 2-dimensional space. It works by learning an optimal kernel in the latent space from a distance metric defined on the input fingerprints instead of a kernel. The outputs are the embeddings of the fingerprints and a kernel Gram matrix in the latent space that can be used in subsequent learning procedures such as a support vector machine (SVM) for classification or recognition. Experimental results confirm the usefulness of the proposed method.
ISBN (digital): 9783540734994
ISBN (print): 9783540734987
Over the past several years, machine learning and data mining techniques have received considerable attention among intrusion detection researchers as a way to address the weaknesses of knowledge-based detection techniques. This has led to the application of various supervised and unsupervised techniques for the purpose of intrusion detection. In this paper, we conduct a set of experiments to analyze the performance of unsupervised techniques considering their main design choices. These include the heuristics proposed for distinguishing abnormal data from normal data and the distribution of the dataset used for training. We evaluate the performance of the techniques with various distributions of training and test datasets, which are constructed from the KDD99 dataset, a widely accepted resource for IDS evaluations. This comparative study is not only a blind comparison between unsupervised techniques, but also gives some guidelines to researchers and practitioners on applying these techniques to the area of intrusion detection.
ISBN (digital): 9783540734994
ISBN (print): 9783540734987
Dimension reduction methods are often applied in machine learning and data mining problems. Linear subspace methods, such as principal component analysis (PCA) and Fisher's linear discriminant analysis (FDA), are the most commonly used. In this paper, we describe a novel feature extraction method for binary classification problems. Instead of finding linear subspaces, our method finds lower-dimensional affine subspaces for data observations. Our method can be understood as a generalization of the Fukunaga-Koontz Transformation. We show that the proposed method has a closed-form solution and thus can be solved very efficiently. We also investigate the information-theoretical properties of the new method and study its relationship with other methods. The experimental results show that our method, like PCA and FDA, can be used as another preliminary data-exploring tool to help solve machine learning and data mining problems.
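For orientation, the classical Fukunaga-Koontz Transformation that this abstract says it generalizes can be summarized as below. The notation is not from the abstract: $S_1$ and $S_2$ are assumed to be the two class scatter (or correlation) matrices, and their sum is assumed to be positive definite. The paper's affine generalization itself is not reproduced here.

```latex
\begin{align*}
S_1 + S_2 &= \Phi \Lambda \Phi^{\top}, \qquad P = \Phi \Lambda^{-1/2}, \\
\tilde{S}_1 &= P^{\top} S_1 P, \qquad \tilde{S}_2 = P^{\top} S_2 P, \qquad \tilde{S}_1 + \tilde{S}_2 = I, \\
\tilde{S}_1 v &= \lambda v \;\Longrightarrow\; \tilde{S}_2 v = (1 - \lambda)\, v .
\end{align*}
```

After whitening, both classes share the same eigenvectors, and a direction that captures much of the variance of one class (eigenvalue $\lambda$ near 1) captures little of the other's (eigenvalue $1 - \lambda$ near 0), which is what makes the transformation useful for binary feature extraction.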
A Reflex Fuzzy Min-Max Neural Network (RFMN) capable of learning from missing data is presented. Many real-world problems involve machine learning with missing values or attributes. Thus, learning with missing or incom...
ISBN (print): 9783540770459
The work presented here focuses on combining multiple classifiers to form a single classifier for pattern classification, machine learning for expert systems, and data mining tasks. The basis of the combination is that efficient concept learning is possible in many cases when the concepts learned by different approaches are combined into a more efficient concept. Experimental results for the algorithm, EMRL, on a representative collection of different domains show that it performs significantly better than several state-of-the-art individual classifiers on 11 of 25 data sets, whereas the state-of-the-art individual classifiers perform significantly better than EMRL on only 5.
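The abstract does not describe how EMRL combines its component classifiers, so the sketch below shows only the simplest generic combination rule, plurality voting, as a stand-in and not as the paper's method. The classifier callables and the function name are hypothetical.

```python
from collections import Counter


def plurality_vote(classifiers, x):
    """Return the label predicted by the largest number of classifiers for input x."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Three toy classifiers with different decision thresholds (hypothetical).
classifiers = [
    lambda x: "pos" if x > 0 else "neg",
    lambda x: "pos" if x > 1 else "neg",
    lambda x: "pos" if x > -1 else "neg",
]

print(plurality_vote(classifiers, 0.5))   # pos (two of three vote "pos")
print(plurality_vote(classifiers, -0.5))  # neg (two of three vote "neg")
```

With an odd number of binary classifiers, as here, ties cannot occur; a real combiner would need an explicit tie-breaking rule or weighted votes.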
ISBN (digital): 9783540734994
ISBN (print): 9783540734987
Description logics have emerged as one of the most successful formalisms for knowledge representation and reasoning. They are now widely used as a basis for ontologies in the Semantic Web. To extend and analyse ontologies, automated methods for knowledge acquisition and mining are being sought. Despite its importance for knowledge engineers, the learning problem in description logics has not been investigated as deeply as its counterpart for logic programs. We propose the novel idea of applying evolutionary inspired methods to solve this task. In particular, we show how Genetic Programming can be applied to the learning problem in description logics and combine it with techniques from Inductive Logic Programming. We base our algorithm on thorough theoretical foundations and present a preliminary evaluation.