The search for frequent patterns in transactional databases is considered one of the most important data mining problems. Several parallel and sequential algorithms have been proposed in the literature to solve this p...
In this paper, we present a novel active learning strategy, named dynamic active learning with SVM, to improve the effectiveness of learning sample selection in active learning. The algorithm is divided into two steps....
Estimation of probability density functions based on available data is an important problem arising in various fields, such as telecommunications, machine learning, data mining, pattern recognition and computer vision. In this paper, we consider kernel-based non-parametric density estimation methods and derive formulae for variable kernel density estimation using generalized, elliptic Gaussian kernels. The proposed technique is verified on simulated data.
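As an illustration of what a variable kernel density estimate with elliptic Gaussian kernels computes, the following minimal sketch averages one full-covariance Gaussian per sample in two dimensions. It is not the paper's derived formulae: the function names, the toy samples, and the hand-picked per-sample covariances are all assumptions made for the example.

```python
import math

def gaussian_kernel_2d(x, mean, cov):
    """Density at x of a 2-D Gaussian with the given mean and 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form (x - mean)^T cov^{-1} (x - mean), using the closed-form 2x2 inverse.
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def variable_kde(x, samples, covariances):
    """Average of per-sample elliptic Gaussian kernels evaluated at x."""
    total = sum(gaussian_kernel_2d(x, s, cov) for s, cov in zip(samples, covariances))
    return total / len(samples)

# Toy data (assumed): each sample carries its own covariance, so the kernel
# shape and orientation vary from point to point.
samples = [(0.0, 0.0), (1.0, 1.0), (3.0, 0.5)]
covariances = [
    ((1.0, 0.0), (0.0, 1.0)),
    ((0.5, 0.2), (0.2, 0.8)),
    ((2.0, -0.3), (-0.3, 0.4)),
]
density = variable_kde((0.5, 0.5), samples, covariances)
print(density)
```

In practice the per-sample covariances would be estimated from the data, for example from each sample's local neighbourhood; how the paper chooses them is not stated in the abstract.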
Deciding the convex separability of the classes is an interesting question in the data exploration phase of building classification systems. In this paper we propose an efficient algorithm for deciding the convex separability of two point sets in R^d. We compare our algorithm with conventional methods on 6 benchmark problems and demonstrate that our algorithm is significantly faster.
Protein mass spectra pattern recognition is a new area in which many machine learning algorithms have been applied to enhance the chance of early cancer diagnosis. The high-dimensionality-small-sample (HDSS) problem of cancer proteomic datasets still requires more sophisticated approaches to improve the classification accuracy. In this study we present a simple ensemble strategy based on measuring the generalizing capability of different subsets of training data and apply it in making the final decision. Using a limited number of biomarkers along with 5 classification algorithms, the proposed method achieved promising performance on a well-known prostate cancer mass spectroscopy dataset.
ISBN: 9783540723837 (digital)
ISBN: 9783540723820 (print)
The three-volume set LNCS 4491/4492/4493 constitutes the refereed proceedings of the 4th International Symposium on Neural Networks, ISNN 2007, held in Nanjing, China in June 2007. The 262 revised long papers and 192 revised short papers presented were carefully reviewed and selected from a total of 1,975 submissions. The papers are organized in topical sections on neural fuzzy control, neural networks for control applications, adaptive dynamic programming and reinforcement learning, neural networks for nonlinear systems modeling, robotics, stability analysis of neural networks, learning and approximation, data mining and feature extraction, chaos and synchronization, neural fuzzy systems, training and learning algorithms for neural networks, neural network structures, neural networks for pattern recognition, SOMs, ICA/PCA, biomedical applications, feedforward neural networks, recurrent neural networks, neural networks for optimization, support vector machines, fault diagnosis/detection, communications and signal processing, image/video processing, and applications of neural networks.
ISBN: 9781424412297; 1424412293 (print)
Feature selection has attracted much interest from researchers in many fields such as network security, pattern recognition and data mining. In this paper, we present a wrapper-based feature selection algorithm aimed at modeling a lightweight intrusion detection system (IDS) by (1) using modified random mutation hill climbing (MRMHC) as the search strategy to specify a candidate subset for evaluation; (2) using support vector machines (SVMs) as the wrapper approach to obtain the optimum feature subset. We have examined the feasibility of our feature selection algorithm by conducting several experiments on the KDD 1999 intrusion detection dataset, which was categorized as DOS, PROBE, R2L and U2R. The experimental results show that our approach is able not only to speed up the process of selecting important features but also to guarantee high detection rates. Furthermore, our experiments indicate that an intrusion detection system combined with our proposed approach requires fewer computational resources than one using GA-SVM, which is a popular feature selection algorithm in the field.
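The two-part structure described above (a search strategy that proposes feature subsets, and a classifier that scores them) can be sketched as follows. This is a plain random mutation hill climber, not the paper's modified variant, and a leave-one-out 1-NN accuracy stands in for the SVM wrapper so that the example needs only the standard library. The function names, the tie-breaking rule, and the toy data are assumptions.

```python
import random

def loo_1nn_accuracy(rows, labels, mask):
    """Stand-in wrapper score: leave-one-out 1-NN accuracy on the selected features."""
    selected = [i for i, keep in enumerate(mask) if keep]
    if not selected:
        return 0.0
    correct = 0
    for i, row in enumerate(rows):
        best_j, best_d = None, float("inf")
        for j, other in enumerate(rows):
            if i == j:
                continue
            d = sum((row[k] - other[k]) ** 2 for k in selected)
            if d < best_d:
                best_j, best_d = j, d
        correct += labels[best_j] == labels[i]
    return correct / len(rows)

def hill_climb_features(rows, labels, score, iterations=200, seed=0):
    """Random mutation hill climbing over feature bitmasks."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    mask = [rng.random() < 0.5 for _ in range(n_features)]
    best = score(rows, labels, mask)
    for _ in range(iterations):
        candidate = mask[:]
        flip = rng.randrange(n_features)  # mutate one randomly chosen feature bit
        candidate[flip] = not candidate[flip]
        cand_score = score(rows, labels, candidate)
        # Accept ties only when they use fewer features, to favour a lightweight model.
        if cand_score > best or (cand_score == best and sum(candidate) < sum(mask)):
            mask, best = candidate, cand_score
    return mask, best

# Toy data (assumed): feature 0 separates the classes, features 1 and 2 are distractors.
rows = [(0.0, 5.0, 1.0), (0.1, 1.0, 9.0), (0.2, 8.0, 4.0),
        (1.0, 5.1, 1.2), (1.1, 0.9, 9.1), (1.2, 8.2, 3.9)]
labels = [0, 0, 0, 1, 1, 1]
mask, best = hill_climb_features(rows, labels, loo_1nn_accuracy)
print(mask, best)
```

Passing the scorer as a parameter keeps the search independent of the classifier, so an SVM-based cross-validation score could be substituted without changing the search loop.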
ISBN: 1424402115 (print)
One of the popular trends in computer science has been the development of intelligent web-based systems. Demand for such systems forces designers to make use of knowledge discovery techniques on web server logs. Web usage mining has become a major area of knowledge discovery on the World Wide Web. Frequent pattern discovery is one of the main issues in web usage mining. These frequent patterns constitute the basic information source for intelligent web-based systems. In this paper, frequent pattern mining algorithms for web log data and their performance comparisons are examined. Our study is mainly focused on finding suitable pattern mining algorithms for web server logs.
ISBN: 3540335846 (print)
Mining of sequential patterns is an important issue among the various data mining problems. The problem of incremental mining of sequential patterns deserves as much attention. In this paper, we consider the problem of the incremental updating of sequential pattern mining when some transactions and/or data sequences are deleted from the original sequence database. We present a new algorithm, called IU_D, for mining frequent sequences so as to make full use of information obtained during an earlier mining process for reducing the cost of finding new sequential patterns in the updated database. The results of our experiment show that the algorithm performs significantly faster than the naive approach of mining the entire updated database from scratch.
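The general idea of reusing earlier mining results after deletions can be illustrated with a small sketch. It is not IU_D, whose details the abstract does not give. It assumes sequences of single items and an absolute minimum support count, in which case deletion can only lower a pattern's support, so it is enough to scan the deleted sequences and subtract their contribution from the stored counts. All names and data here are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if the items of pattern appear in sequence in order (not necessarily adjacent)."""
    it = iter(sequence)
    # Each membership test consumes the iterator up to the match, which enforces order.
    return all(item in it for item in pattern)

def update_after_deletion(old_support, deleted, min_count):
    """Subtract the support contributed by deleted sequences; keep patterns still frequent."""
    updated = {}
    for pattern, count in old_support.items():
        lost = sum(is_subsequence(pattern, seq) for seq in deleted)
        if count - lost >= min_count:
            updated[pattern] = count - lost
    return updated

# Supports from an earlier mining run (assumed) over the database
# [("a","b","c"), ("a","c"), ("a","b"), ("b","c")] with min_count = 2.
old_support = {
    ("a",): 3, ("b",): 3, ("c",): 3,
    ("a", "b"): 2, ("a", "c"): 2, ("b", "c"): 2,
}
deleted = [("a", "b")]
updated = update_after_deletion(old_support, deleted, min_count=2)
print(updated)
```

With a relative threshold (a fraction of the database size), deletions can also make previously infrequent patterns frequent, so this shortcut alone would not be sufficient.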
ISBN: 9780769527017 (print)
The nearest-neighbor (NN) classifier has long been used in pattern recognition, exploratory data analysis, and data mining problems. A vital consideration in obtaining good results with this technique is the choice of distance function, and correspondingly which features to consider when computing distances between samples. In this paper a new ensemble technique is proposed to improve the performance of the NN classifier. The proposed approach combines multiple NN classifiers, where each classifier uses a different distance function and potentially a different set of features (feature vector). These feature vectors are determined for each distance metric using a Simple Voting Scheme incorporated in Tabu Search (TS). The proposed ensemble classifier with different distance metrics and different feature vectors (TS-DF/NN) is evaluated using various benchmark data sets from the UCI Machine Learning Repository. Results indicate a significant increase in performance when compared with various well-known classifiers. Furthermore, the proposed ensemble method is also compared with an ensemble classifier using different distance metrics but with the same feature vector (with or without Feature Selection (FS)).
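The combination step described above, several NN classifiers that each use their own distance function and feature subset and are merged by simple voting, can be sketched as follows. The feature subsets are fixed by hand here, whereas the paper selects them with Tabu Search; the function names, the choice of three metrics, and the toy data are assumptions.

```python
from collections import Counter

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def nn_predict(train, labels, query, distance, features):
    """1-NN prediction using one distance function on one feature subset."""
    def project(row):
        return [row[i] for i in features]
    q = project(query)
    nearest = min(range(len(train)), key=lambda i: distance(project(train[i]), q))
    return labels[nearest]

def ensemble_predict(train, labels, query, members):
    """Majority vote over (distance function, feature subset) members."""
    votes = Counter(nn_predict(train, labels, query, d, f) for d, f in members)
    return votes.most_common(1)[0][0]

# Toy data (assumed) and a hand-picked ensemble of three members.
train = [(0.0, 0.0, 5.0), (1.0, 0.0, 4.0), (5.0, 5.0, 0.0), (6.0, 5.0, 1.0)]
labels = ["a", "a", "b", "b"]
members = [(euclidean, [0, 1]), (manhattan, [0, 1, 2]), (chebyshev, [1, 2])]
print(ensemble_predict(train, labels, (0.5, 0.5, 4.5), members))
```

An odd number of members avoids ties in the two-class case; with more classes or an even number of members, a tie-breaking rule would be needed.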