ISBN (print): 9783319746906; 9783319746890
This paper proposes a novel genetic algorithm (GA) based k-means algorithm for cluster analysis. In the proposed approach, the GA population is initialized by the k-means algorithm, and the GA operators are then applied to generate a new population. In addition, a new mutation operator is proposed that depends on the extreme points of the clustering. The proposed approach is applied to a set of test problems. The reported results show the superiority of the new methodology for cluster analysis.
ISBN (print): 9788132222026; 9788132222019
Over the past few decades, clustering algorithms have been used in diverse fields of engineering and science. Among the various methods, the k-means algorithm is one of the most popular. However, k-means has a major drawback: it tends to become trapped in local optima. Motivated by this, this paper hybridizes the Chemical Reaction Optimization (CRO) algorithm with the k-means algorithm for data clustering. In this method, k-means is used as the on-wall ineffective collision reaction in the CRO algorithm, thereby combining the intensification property of k-means with the diversification provided by the intermolecular reactions of CRO. The performance of the proposed methodology is evaluated on four real-world datasets against three other algorithms: k-means, CRO-based clustering, and differential evolution (DE) based clustering. Experimental results show that the proposed clustering algorithm performs better than the k-means, DE-based, and CRO-based clustering algorithms on the datasets considered.
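The local-optimum drawback mentioned in this abstract can be illustrated with a minimal sketch of plain Lloyd-style k-means (not the CRO hybrid the paper proposes). The function names, the four-point dataset, and the two initializations are assumptions chosen for illustration only.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, init_centers, max_iter=100):
    """Plain Lloyd iterations started from the given initial centers."""
    centers = list(init_centers)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, l in zip(points, labels) if l == j]
            new_centers.append(
                tuple(sum(c) / len(members) for c in zip(*members)) if members else old
            )
        if new_centers == centers:  # no center moved: converged
            break
        centers = new_centers
    # Sum of squared errors of the final assignment.
    sse = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    return centers, labels, sse


# Hypothetical data: four corners of a wide rectangle.
# The natural split is left pair vs. right pair.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

_, _, sse_good = kmeans(points, [(0, 0), (4, 0)])  # converges to left/right
_, _, sse_bad = kmeans(points, [(2, 0), (2, 1)])   # stays at bottom/top

print(sse_good, sse_bad)
```

Both runs converge, but the second start settles on a bottom/top split with a sum of squared errors of 16 instead of 1. This is the kind of initialization-dependent local optimum that hybrid methods aim to escape.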
ISBN (print): 9781424421138
The objective of the traditional k-means algorithm is to make the distances between objects in the same cluster as small as possible; a second objective, concerning the distances between objects in different clusters, is not taken into account. This paper presents an improved k-means algorithm that satisfies both objectives. We modify the cost function of the entropy weighting k-means clustering algorithm by adding a variable that is linearly related to the sum of squared distances between the mean of all objects and the means of all clusters. The improved algorithm is presented, and its effectiveness is demonstrated by comparing its results with those of other k-means clustering algorithms on the Iris data.
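The two quantities this abstract contrasts can be computed as in the sketch below. It is not the paper's entropy-weighted cost function, whose exact form is not given here. It only evaluates a within-cluster term and a between-cluster term (squared distances from cluster means to the global mean) for a fixed labelling. The function name and the data are illustrative assumptions.

```python
def scatter(points, labels):
    """Return (within, between) for a labelled set of points.

    within  = sum of squared distances from each point to its cluster mean
    between = sum of squared distances from each cluster mean to the global mean
    """
    def mean(ps):
        return tuple(sum(c) / len(ps) for c in zip(*ps))

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Group the points by cluster label.
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)

    global_mean = mean(points)
    means = {l: mean(ps) for l, ps in clusters.items()}

    within = sum(sq_dist(p, means[l]) for p, l in zip(points, labels))
    between = sum(sq_dist(m, global_mean) for m in means.values())
    return within, between


# Hypothetical data: two tight pairs far apart along the x axis.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]

compact = scatter(points, [0, 0, 1, 1])  # left pair vs. right pair
mixed = scatter(points, [0, 1, 0, 1])    # bottom pair vs. top pair

print(compact, mixed)
```

For this data, the left/right labelling gives a small within-cluster term and a large between-cluster term, while the bottom/top labelling reverses both. That is why a cost function combining the two terms can separate the labellings more sharply than the within-cluster term alone.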
ISBN (print): 9783540883081
This paper introduces a new algorithm for clustering sequential data. The SkM algorithm is a k-means-type algorithm suited for identifying groups of objects with similar trajectories and dynamics. We provide a simulation study to show the good properties of the SkM algorithm. Moreover, a real application to website users' search patterns shows its usefulness in identifying groups with heterogeneous behavior. We identify two distinct clusters with different styles of website search.
ISBN (print): 9781467389808
The initial clustering centers of the traditional k-means algorithm are randomly generated from the data set, so the clustering effect is not very stable. To address this problem, this paper puts forward a kind of op
ISBN (print): 9781728160429
The RFM model used for customer segmentation in the traditional retail industry is not suitable for industries with distinct social-group attributes, so the RFMC model is created by introducing a parameter C for social relations. An educational e-commerce enterprise, M, is selected for an empirical study, and the k-means algorithm is used for cluster analysis of its valid customers. The analysis yields 5 distinct customer groups and verifies the effectiveness of the model.
ISBN (print): 9798350314557
In this paper, we propose a combination of the k-means algorithm and the Particle Swarm Optimization (PSO) method. The k-means algorithm is used for data clustering. On the one hand, the number of clusters (k) must be determined by an expert or found by a trial-and-error procedure. On the other hand, the initial centroids and the number of clusters (k) influence the quality of the resulting grouping. Therefore, the proposed procedure uses PSO with the Structural Similarity Index (SSIM) criterion as the fitness function to find the best value of k and better initial cluster centers. Because the value of k varies, the number of initial centroids that must be produced also varies, so the length of the particles in PSO may differ in each iteration. Experimental results on an image segmentation problem show the superiority of this approach over the standard k-means algorithm.
ISBN (print): 9781467369114
Data stream mining has attracted many researchers because of the need to mine large datasets, which pose various challenges. Stream data differs from normal data in that it is produced continuously by different applications, which imposes processing challenges such as massive volume, infinite length, and concept drift. An object that does not conform to the behavior of normal data objects is called an outlier. Outlier detection is used in applications such as fraud detection, intrusion detection, tracking environmental changes, and medical diagnosis, so there is a need to detect outliers in data streams. Various approaches are used for outlier detection. Some of them use the k-means algorithm, which helps to create groups or clusters of similar data points. Data stream clustering techniques are highly helpful for clustering similar data items in data streams and also for detecting outliers in them; such techniques are called cluster-based outlier detection. The k-means algorithm is a partition-based algorithm used to cluster datasets into a number of clusters. It is the most common and popular clustering algorithm due to its simplicity and efficiency. The purpose of this paper is to review different outlier detection approaches that use the k-means algorithm, together with some other methods, for clustering datasets. Different application areas of outlier detection are also discussed.
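One simple form of cluster-based outlier detection can be sketched as follows: given centroids already produced by k-means, flag any incoming point whose distance to its nearest centroid exceeds a threshold. This is a generic illustration rather than any specific method reviewed in the paper. The function name, the centroids, and the threshold value are assumptions.

```python
import math


def flag_outliers(stream, centroids, threshold):
    """Yield the points of `stream` that lie farther than `threshold`
    from every centroid (Euclidean distance)."""
    for p in stream:
        nearest = min(math.dist(p, c) for c in centroids)
        if nearest > threshold:
            yield p


# Hypothetical centroids, e.g. obtained from an earlier k-means run.
centroids = [(0.0, 0.0), (10.0, 10.0)]

# Hypothetical chunk of a data stream; any iterable of points works.
chunk = [(0.5, 0.0), (9.5, 10.0), (5.0, 5.0), (0.0, 1.0)]

outliers = list(flag_outliers(chunk, centroids, threshold=2.0))
print(outliers)
```

Because the function is a generator over an arbitrary iterable, it processes points one at a time and does not need the whole stream in memory. A practical stream method would also have to update the centroids as the data drifts, which this sketch does not attempt.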
ISBN (print): 9781457706530
In this paper, we propose a new image segmentation method based on the k-means algorithm and edge information, and we realize the identification of artifacts in scenery images using color and line information with a Radial Basis Function (RBF) network. In the proposed method, the original image is divided into areas using color information (via the k-means algorithm) and edge information, and the RBF network then judges, from color and line information, whether each area contains artifacts. We carried out a series of computer experiments and confirmed that the proposed method can identify areas containing artifacts with considerable accuracy.
ISBN (print): 9781728188553
Class decomposition is one possible solution, and one of the most important success factors, for improving classification performance. The idea is to transform a dataset by dividing each class label into groups or clusters, so that the transformation reflects data characteristics and similarities. This paper proposes a hybrid model for class decomposition that integrates the gap statistic, the k-means clustering algorithm, and the Naive Bayes classifier. The model relies on clustering validity, measured with the gap statistic, to enhance classifier performance. It first divides each dataset into subsets according to class label. The gap statistic is then used to estimate the optimal number of clusters for the subset belonging to each class label. The estimated number of clusters is passed as an input parameter to the k-means clustering algorithm, which relabels the data objects in each subset: every data object is assigned to one of the clusters generated by k-means, and that cluster serves as its new class label. The proposed model integrates the class decomposition approach with the Naive Bayes classifier, and its performance is compared under several classification measures. The model is validated and evaluated on different real-world datasets from the UCI machine learning repository. The experimental results show a significant improvement in classification accuracy and F-measure when class decomposition is applied. The experiments also indicate that class decomposition is not appropriate for all datasets.
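The relabeling step described above can be sketched as follows, under simplifying assumptions. The number of clusters per class is supplied directly instead of being estimated with the gap statistic, and k-means starts from the first k points of each subset. The Naive Bayes stage is omitted. All names and data are hypothetical and are not taken from the paper.

```python
def kmeans_labels(points, k, max_iter=100):
    """Cluster `points` into k groups and return one label per point.
    Assumption: the first k points serve as the initial centers."""
    centers = list(points[:k])
    labels = None
    for _ in range(max_iter):
        # Assign each point to its nearest center (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Move each non-empty cluster's center to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def decompose(X, y, k_per_class):
    """Replace each class label with '<class>_<cluster>' sub-labels."""
    new_y = [None] * len(y)
    for cls, k in k_per_class.items():
        idx = [i for i, label in enumerate(y) if label == cls]
        subset = [X[i] for i in idx]
        for i, c in zip(idx, kmeans_labels(subset, k)):
            new_y[i] = f"{cls}_{c}"
    return new_y


# Hypothetical data: class "a" has two visible groups, class "b" has one.
X = [(0, 0), (0, 1), (5, 5), (5, 6), (9, 0), (9, 1)]
y = ["a", "a", "a", "a", "b", "b"]

new_y = decompose(X, y, {"a": 2, "b": 1})
print(new_y)
```

A classifier would then be trained on `new_y` instead of `y`, with its predictions mapped back to the original classes by dropping the cluster suffix.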