ISBN: (print) 1424400600
As nonlinear feature extraction methods, kernel methods have been widely applied in pattern recognition. However, for high-dimensional data such as face images, a kernel method incurs a high computational cost. In this paper, a novel idea and framework are presented for implementing kernel methods on high-dimensional data. A notable characteristic of the framework is that it has two feature extraction processes. The first feature extraction process transforms the high-dimensional samples into low-dimensional data, and the second is carried out on the resulting low-dimensional data. With the novel framework, kernel methods become much more efficient. Moreover, all kernel methods can work with the framework. Experiments on face images show the validity of this framework. Furthermore, with this framework, kernel methods can achieve higher classification accuracies than the naive kernel methods.
ISBN: (print) 1424400600
Frequent closed itemset mining has recently become an important alternative to association rule mining. CloSET+ is an efficient algorithm for finding frequent closed itemsets without candidate generation. However, CloSET+ must scan the database twice. To improve the efficiency of CloSET+ and reduce the I/O cost of database scanning in frequent closed itemset mining, we propose a novel algorithm called QCloSET+, which mines frequent closed itemsets with a single database scan.
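For readers unfamiliar with the term, an itemset is usually called closed when no proper superset has the same support. The sketch below is a brute-force illustration of that definition only; it is not CloSET+ or QCloSET+, and the function name, the toy transactions, and the absolute `min_support` threshold are assumptions made for the example.

```python
from itertools import combinations


def frequent_closed_itemsets(transactions, min_support):
    """Brute-force enumeration of frequent closed itemsets (tiny inputs only)."""
    items = sorted({item for t in transactions for item in t})

    # Count the support of every candidate itemset and keep the frequent ones.
    support = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            count = sum(1 for t in transactions if itemset <= t)
            if count >= min_support:
                support[itemset] = count

    # An itemset is closed if no proper superset has the same support.
    # A superset with equal support is itself frequent, so checking the
    # frequent itemsets is sufficient.
    closed = {}
    for itemset, count in support.items():
        absorbed = any(
            itemset < other and support[other] == count for other in support
        )
        if not absorbed:
            closed[itemset] = count
    return closed


# Hypothetical toy database of four transactions.
transactions = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("a")]
closed = frequent_closed_itemsets(transactions, min_support=2)
```

Here {b} and {c} are frequent but not closed, because {a, b} and {a, c} have the same support; this is why closed itemsets give a more compact summary than all frequent itemsets.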
ISBN: (print) 1424400600
Outlier detection is a branch of data mining with important applications in domains such as financial fraud detection and network intrusion analysis. However, most of these applications involve high-dimensional data. Many algorithms use the concept of proximity to find outliers based on their relationship to the rest of the data set. However, the sparsity of high-dimensional points makes these algorithms unsuitable for high-dimensional spaces. In this paper, we discuss a new technique, ODHDP (Outlier Detection in High Dimension based on Projection), which finds outliers based on projections of the data set.
ISBN: (print) 1424400600
Recently, several manifold learning algorithms have been presented for nonlinear dimensionality reduction. Isomap is one of them. However, Isomap has the deficiency that it does not give an explicit mapping function from the high-dimensional space to the low-dimensional target space. In this paper, a version of Isomap with explicit mapping, called E-Isomap, is proposed. In E-Isomap, the geodesic distance matrix is fed into a cost function, and Iterative Majorization is then adopted to solve an optimization problem that yields both the low-dimensional configuration and the nonlinear mapping. Owing to the explicit mapping, this version of Isomap can be used in pattern recognition more easily than the original one. Experiments on two benchmark data sets demonstrate the performance of the presented method.
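The geodesic distance matrix mentioned above is commonly approximated in Isomap by shortest paths over a k-nearest-neighbor graph. The following is a minimal sketch of that standard first step only, not of E-Isomap's cost function or Iterative Majorization; the function name, the value of `k`, and the C-shaped toy points are assumptions for illustration.

```python
import math


def geodesic_distances(points, k):
    """Approximate geodesic distances via a k-NN graph and Floyd-Warshall."""
    n = len(points)
    dist = [[math.inf] * n for _ in range(n)]

    # Build a symmetric k-nearest-neighbor graph weighted by Euclidean distance.
    for i in range(n):
        dist[i][i] = 0.0
        neighbors = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )[:k]
        for d, j in neighbors:
            dist[i][j] = min(dist[i][j], d)
            dist[j][i] = min(dist[j][i], d)

    # Floyd-Warshall: shortest path lengths between all pairs of points.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                through_m = dist[i][m] + dist[m][j]
                if through_m < dist[i][j]:
                    dist[i][j] = through_m
    return dist


# Hypothetical points sampled along a C-shaped curve.
points = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (2, 3), (1, 3), (0, 3)]
geo = geodesic_distances(points, k=2)
end_to_end_geodesic = geo[0][7]
end_to_end_euclidean = math.dist(points[0], points[7])
```

The two end points of the curve are 3 units apart in a straight line, but the graph path that follows the curve is longer, which is the property Isomap relies on to unfold a manifold.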
ISBN: (print) 1424400600
Privacy-preserving data mining is a novel research direction in data mining and statistical databases, in which data mining algorithms are analyzed for the side effects they have on data privacy. There have been many studies on the efficient discovery of frequent itemsets in privacy-preserving data mining. However, maintaining the discovered frequent itemsets is nontrivial, because the database may be updated and such updates may turn frequent itemsets into infrequent ones. In this paper, an incremental updating algorithm, IPPFIM, is proposed for the efficient maintenance of discovered frequent itemsets when new transaction data are added to a transaction database in the privacy-preserving setting. The algorithm makes use of previous mining results to cut down the cost of finding new frequent itemsets in an updated database. The performance evaluation shows the efficiency of this method.
ISBN: (print) 1424400600
In manufacturing processes, it is very important that the condition of the cutting tool, particularly the indications of when it should be changed, can be monitored. Cutting tool condition monitoring is a very complex process, and thus sensor fusion techniques and artificial intelligence signal processing algorithms are employed in this study. The multi-sensor signals reflect the tool condition comprehensively. A unique fuzzy neural hybrid pattern recognition algorithm has been developed. The weighted approaching degree can measure the difference between signal features accurately, and the neurofuzzy network combines the transparent representation of a fuzzy system with the learning ability of neural networks. The algorithm has strong modeling and noise suppression ability. This leads to successful tool wear classification under a range of machining conditions.
ISBN: (print) 1424400600
Data mining in data streams has become a very popular research field. One of the central tasks in mining data streams is identifying outliers, which is critically important because it can lead to the discovery of unexpected and interesting knowledge. To mine outliers in data streams effectively, ODABK, an algorithm for outlier detection in data streams, is presented. It is based on KNN and is significantly enhanced by additional data structures and optimized logical operations. Finally, the paper reports experiments on real-world census data which show that ODABK is more effective in terms of detection rate and execution time.
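As background for the KNN basis mentioned above, one common way to score outliers is the distance from each point to its k-th nearest neighbor. The sketch below illustrates only that generic idea on a small static batch; it is not ODABK, it does not handle streaming, and the function name, `k`, and the toy points are assumptions.

```python
import math


def knn_outlier_scores(points, k):
    """Score each point by the distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Hypothetical batch: four clustered points and one far-away point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = knn_outlier_scores(points, k=2)

# The point with the largest score is the strongest outlier candidate.
most_outlying = max(range(len(points)), key=scores.__getitem__)
```

This naive version compares every pair of points, so its cost grows quadratically with the batch size; the data structures mentioned in the abstract presumably address that kind of cost, though the abstract does not describe them.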
ISBN: (print) 1424400600
Finding co-location patterns in spatial data is a challenging problem in spatial databases. While previous work focused on discovering co-location patterns in categorical data, we present a novel method that finds co-location patterns in continuous spatial data. Our algorithm mines co-location patterns in continuous data by using a multi-layer index and a neighbor domain set, which resembles the itemsets of transactions in classical data mining. We conduct experiments with fire data, and the results indicate that the new algorithm is very effective.
ISBN: (print) 1424400600
In this paper, an Apriori algorithm based on an inverted list is presented for mining frequent patterns. Compared with the traditional Apriori algorithm and the FP-growth algorithm, this algorithm has better efficiency and a wider application range. To address the shortcomings of the traditional Apriori algorithm, it uses the inverted list to avoid many redundant operations. The algorithm needs to scan the data set only twice and does not need joining and pruning operations. Frequent item sets are saved in each transaction's frequent set TF, and the next frequent single item is inserted one at a time to generate new candidate frequent item sets. In this way, many redundant operations are avoided. The performance study shows that the algorithm is more efficient on both dense and sparse data sets.
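To illustrate what an inverted list offers for support counting, the sketch below maps each item to the set of transaction IDs that contain it, so the support of an itemset is the size of an intersection. This is a generic illustration under assumed names and toy data; it does not reproduce the paper's TF structure or its two-scan procedure, which the abstract does not specify in detail.

```python
from collections import defaultdict


def build_inverted_list(transactions):
    """Map each item to the set of transaction IDs that contain it."""
    index = defaultdict(set)
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            index[item].add(tid)
    return dict(index)


def support(index, itemset):
    """Support of an itemset = size of the intersection of its ID sets."""
    tid_sets = [index.get(item, set()) for item in itemset]
    if not tid_sets:
        return 0
    return len(set.intersection(*tid_sets))


# Hypothetical toy transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
index = build_inverted_list(transactions)
bread_milk = support(index, ("bread", "milk"))
all_three = support(index, ("bread", "milk", "butter"))
```

Once the index is built, counting the support of a candidate no longer requires another pass over the transactions.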
ISBN: (print) 9608457564
Face recognition using labeled and unlabeled data has received a considerable amount of interest in recent years. At the same time, multiple classifier systems (MCS) have been widely successful in various pattern recognition applications such as face recognition. MCS have very recently been investigated in the context of semi-supervised learning. Very little attention has been devoted to verifying the usefulness of the newly developed semi-supervised MCS models for face recognition. In this work, we attempt to assess and compare the performance of several semi-supervised MCS training algorithms when applied to the face recognition problem. Experiments on a data set of face images are presented. Our experiments use a non-homogeneous classifier ensemble and the majority voting rule, and compare three semi-supervised learning models: the self-trained single classifier model, the ensemble-driven model, and a newly proposed modified co-training model. Experimental results reveal that the investigated semi-supervised models successfully exploit unlabeled data to enhance the performance of the classifiers and of their combined output. The proposed semi-supervised learning model shows a significant improvement in classification accuracy compared to existing models.
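The majority voting rule mentioned above can be sketched as follows: each classifier predicts a label per sample, and the combined output is the most frequent label. The function name and the toy predictions are assumptions; the abstract does not say how ties are handled, so the tie behavior here (the label encountered first wins) is an assumption of this sketch as well.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample."""
    combined = []
    # zip(*predictions) groups the votes that all classifiers cast per sample.
    for votes in zip(*predictions):
        # most_common breaks ties in favor of the label encountered first.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three classifiers on three face images.
predictions = [
    ["alice", "bob", "carol"],
    ["alice", "bob", "bob"],
    ["bob", "bob", "carol"],
]
combined = majority_vote(predictions)
```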