ISBN: 9781538652152 (print)
In practice, many classification problems involve imbalanced data, for example spam filtering, credit card fraud detection, and software defect prediction, so investigating imbalanced data classification is important in both theory and application. To deal with this problem, this paper proposes an approach to binary imbalanced data classification based on the extreme learning machine (ELM) autoencoder. The proposed method includes three steps. (1) The positive instances are used as seeds, and the ELM autoencoder generates new samples to increase the number of positive instances; the generated samples are similar to the positive instances but not identical. (2) Step (1) is repeated several times until a balanced data set is obtained. (3) A classifier is trained on the balanced data set and used to classify unseen samples. The experimental results demonstrate that the proposed approach is feasible and effective.
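The three steps can be sketched as follows. This is a minimal illustration, not the paper's implementation: the tanh activation, the ridge term, the hidden-layer size, and the function names `elm_ae_generate` and `balance` are all assumptions. The abstract does not say how the generated samples are kept from being identical to the seeds; here a hidden layer narrower than the input makes each reconstruction inexact.

```python
import numpy as np

def elm_ae_generate(X_pos, n_hidden, rng, ridge=1e-2):
    """Reconstruct the positive instances through one random ELM autoencoder."""
    d = X_pos.shape[1]
    W = rng.normal(size=(d, n_hidden))   # random input weights, never trained
    b = rng.normal(size=n_hidden)        # random hidden biases
    H = np.tanh(X_pos @ W + b)           # hidden-layer output
    # Output weights: ridge-regularised least squares for H @ beta ~ X_pos
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ X_pos)
    return H @ beta                      # approximate copies of the seeds

def balance(X_pos, n_neg, n_hidden=3, seed=0):
    """Repeat the generation step until there are n_neg positive samples."""
    rng = np.random.default_rng(seed)
    parts = [X_pos]
    total = len(X_pos)
    while total < n_neg:
        new = elm_ae_generate(X_pos, n_hidden, rng)  # fresh random weights each round
        new = new[: n_neg - total]                   # do not overshoot the target
        parts.append(new)
        total += len(new)
    return np.vstack(parts)
```

The balanced positive set returned by `balance` would then be stacked with the negative instances and passed to any classifier for step (3).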
In this paper, we study the structure of 3-Lie algebras with involutive derivations. We prove that if A is an m-dimensional 3-Lie algebra with an involutive derivation D, then there exists a compatible 3-pre-Lie algeb...
In this paper, we define a class of 3-algebras called 3-Lie-Rinehart algebras. A 3-Lie-Rinehart algebra is a triple (L, A, ρ), where A is a commutative associative algebra, L is an A-module, (A, ρ) is a 3-L...
For any n-dimensional 3-Lie algebra A over a field of characteristic zero with an involutive derivation D, we investigate the structure of the 3-Lie algebra B_1 = A ⋉_{ad*} A* associated with the coadjoint representation (...
This paper presents an approach to instance selection for the nearest neighbor rule that aims to obtain a condensed set with both a high condensing rate and high prediction accuracy. By improving the MCS algorithm and allowing a certain error rate on the training set, a condensed set with a high condensing rate and satisfactory prediction accuracy is obtained. The condensed set is independent of the order of the training instances and insensitive to noise. Comparative experiments on real data sets show its superiority to MCS and FCNN in terms of condensing rate and prediction accuracy.
It has been shown that the fuzzy integral is an effective tool for the fusion of multiple classifiers. Of primary importance in the development of such a system is the choice of the measure, which embodies the importance of subsets of classifiers. In this paper we propose a method for a dynamic fuzzy measure that changes with the pattern to be classified (that is, it is data dependent). The method uses a neural network, which has good learning ability. Our experimental results show that this method improves classification accuracy.
Coherent point drift (CPD), a sophisticated non-rigid point set registration method, has been successfully applied in computer vision and medical image analysis, to name a few. Its registration error, however, is affected greatly by three free parameters. One of them, the width parameter of the Gaussian kernel function, is studied in this paper to tune the registration error of the CPD method. Before the width parameter is computed by a heuristic algorithm, the given data are normalized using min-max and standard normalization in an effort to remove heterogeneity among the features of the data. Several experiments are designed on six available datasets to examine the effectiveness of CPD based on the refined width parameter. Experimental comparison indicates that the refined width parameter from the normalized data can reduce the registration error of the CPD method, e.g., by 13.76% on the bat dataset.
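The two normalizations mentioned above can be illustrated with a short sketch. The function names `minmax_normalize` and `standard_normalize` are hypothetical, and points are assumed to be lists of per-feature coordinates; the heuristic that then computes the width parameter is not described in the abstract and is not reproduced here.

```python
from statistics import mean, pstdev

def minmax_normalize(points):
    """Rescale every feature (column) of the point set to the range [0, 1]."""
    cols = list(zip(*points))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(p, lo, hi)] for p in points]

def standard_normalize(points):
    """Shift every feature to zero mean and scale it to unit standard deviation."""
    cols = list(zip(*points))
    mu = [mean(c) for c in cols]
    sd = [pstdev(c) for c in cols]
    return [[(v - m) / s if s > 0 else 0.0
             for v, m, s in zip(p, mu, sd)] for p in points]
```

After either transform, features measured on very different scales contribute comparably to the distances inside the Gaussian kernel, which is presumably why the normalization precedes the width computation.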
Fuzzy measures and integrals are widely used in multiple classifier systems (MCS). However, the number of coefficients involved in the fuzzy integral model grows exponentially with the number of classifiers to be aggregated, and the main difficulty is identifying all these coefficients. This paper attempts to use a 2-additive fuzzy measure in a multiple classifier system. Our conclusion is that when different interactions exist among different classifiers, the computational complexity can be significantly reduced by using a 2-order additive measure. A simple example is included to illustrate the 2-order additive measure.
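As an illustration of the saving, the sketch below evaluates a Choquet integral under a 2-additive measure given by its Möbius coefficients: singleton weights `m1` and pairwise interaction weights `m2`. The choice of the Choquet integral, the function names, and the numeric weights are assumptions made for this example, not taken from the paper. For n classifiers a 2-additive measure needs n(n+1)/2 coefficients rather than 2^n − 2 measure values, e.g. 55 instead of 1022 for n = 10.

```python
def choquet_2additive(x, m1, m2):
    """Closed form: sum_i m_i * x_i + sum_{i<j} m_ij * min(x_i, x_j)."""
    total = sum(m1[i] * x[i] for i in range(len(x)))
    total += sum(w * min(x[i], x[j]) for (i, j), w in m2.items())
    return total

def measure(subset, m1, m2):
    """Measure of a subset, rebuilt from the Moebius coefficients."""
    s = set(subset)
    return (sum(m1[i] for i in s)
            + sum(w for (i, j), w in m2.items() if i in s and j in s))

def choquet_general(x, m1, m2):
    """Textbook Choquet integral over sorted scores, used here as a cross-check."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    total, prev = 0.0, 0.0
    for k, i in enumerate(order):
        total += (x[i] - prev) * measure(order[k:], m1, m2)
        prev = x[i]
    return total

# Three classifiers' support for one class, with hypothetical weights summing to 1
scores = [0.9, 0.6, 0.4]
m1 = [0.3, 0.3, 0.2]
m2 = {(0, 1): 0.1, (0, 2): 0.05, (1, 2): 0.05}
fused = choquet_2additive(scores, m1, m2)   # approximately 0.63
```

The closed form touches only singletons and pairs, while the general form needs the measure of each nested subset; both return the same value when the measure is 2-additive.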
This paper discusses reducing the computational complexity of decision tree generation for numerical-valued attributes. The proposed method is based on partition impurity: partition impurity minimization is used to select the expanded attribute for generating the sub-nodes during tree growth. After introducing the unstable cut-points of numerical attributes, it is analytically proved that the minimum partition impurity is always attained at an unstable cut-point. This implies that stable cut-points need not be evaluated during tree growth. Since stable cut-points far outnumber unstable cut-points, the proposed method can greatly reduce the computational complexity, as the experimental results confirm.
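The idea can be checked on a toy attribute. The sketch below assumes that an unstable cut-point is one lying between two adjacent sorted examples of different classes (analogous to the boundary points of Fayyad and Irani), that partition impurity is class entropy weighted by subset size, and that attribute values are distinct; the abstract does not spell out these definitions, and the function names are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def partition_impurity(labels, k):
    """Weighted entropy of splitting sorted labels into labels[:k] and labels[k:]."""
    n = len(labels)
    return k / n * entropy(labels[:k]) + (n - k) / n * entropy(labels[k:])

def best_cut(values, labels, unstable_only):
    """Return (impurity, cut value) of the best binary split on one attribute."""
    pairs = sorted(zip(values, labels))
    xs = [v for v, _ in pairs]
    ys = [y for _, y in pairs]
    best = None
    for k in range(1, len(ys)):
        if unstable_only and ys[k - 1] == ys[k]:
            continue  # stable cut-point: both neighbours share a class, skip it
        cand = (partition_impurity(ys, k), (xs[k - 1] + xs[k]) / 2)
        if best is None or cand < best:
            best = cand
    return best

values = [1, 2, 3, 4, 5, 6, 7, 8]
labels = ["a", "a", "a", "a", "b", "a", "b", "b"]
full = best_cut(values, labels, unstable_only=False)     # evaluates 7 cut-points
reduced = best_cut(values, labels, unstable_only=True)   # evaluates 3 cut-points
```

Both searches select the cut at 4.5, but the restricted search evaluates only the three cut-points where the class label changes.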
A core set extreme learning machine (CSELM) approach is proposed to deal with the classification of large datasets. In the first stage, the core set is obtained efficiently using the generalized core vector machine (GCVM) algorithm. In the second stage, the extreme learning machine (ELM) is used to implement classification for much larger datasets. Experiments show that CSELM has performance comparable to SVM and ELM implementations but is faster on large datasets.
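The second stage can be sketched as a basic ELM classifier trained on a reduced set. This is an illustrative assumption rather than the CSELM implementation: the class name `ELMClassifier`, the tanh activation, and the ridge term are hypothetical, and the GCVM core-set stage is not reproduced (in the usage lines, every second training point simply stands in for a core set).

```python
import numpy as np

class ELMClassifier:
    """Single-hidden-layer ELM: random hidden weights, least-squares output weights."""

    def __init__(self, n_hidden=20, ridge=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.ridge = ridge
        self.seed = seed

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        self.classes_ = np.unique(y)
        self.W = rng.normal(size=(X.shape[1], self.n_hidden))  # never trained
        self.b = rng.normal(size=self.n_hidden)
        H = np.tanh(X @ self.W + self.b)
        T = (y[:, None] == self.classes_[None, :]).astype(float)  # one-hot targets
        # Ridge-regularised least squares for the output weights
        self.beta = np.linalg.solve(
            H.T @ H + self.ridge * np.eye(self.n_hidden), H.T @ T)
        return self

    def predict(self, X):
        H = np.tanh(X @ self.W + self.b)
        return self.classes_[np.argmax(H @ self.beta, axis=1)]

# Usage: two synthetic blobs; the subset X[::2] is a placeholder for a core set
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-3, 1, size=(50, 2)), rng.normal(3, 1, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)
model = ELMClassifier().fit(X[::2], y[::2])
accuracy = float(np.mean(model.predict(X) == y))
```

Because only the output weights are solved for, training cost is dominated by one linear solve on the reduced set, which is where a small core set would pay off.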