This paper considers the problem of partitioning a dataset into a known number of clusters using the sum of squared errors (SSE) criterion. A new clustering method, called DE-kM, which combines the differential evolution algorithm (DE) with the well-known k-means procedure, is described. In the method, the k-means algorithm is used to fine-tune each candidate solution obtained by the mutation and crossover operators of DE. Additionally, a reordering procedure that allows the evolutionary algorithm to tackle the redundant representation problem is proposed. The performance of the DE-kM clustering method is compared with that of differential evolution, the global k-means method, the genetic k-means algorithm and two variants of the k-means algorithm. The experimental results show that if the number of clusters k is sufficiently large, DE-kM obtains solutions with lower SSE values than the other five algorithms. (C) 2011 Elsevier B.V. All rights reserved.
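As a rough illustration of the fine-tuning step described in this abstract, the sketch below computes the SSE of a set of cluster centers and refines a candidate solution with plain k-means iterations. The function names (`sse`, `kmeans_refine`) and the toy data are assumptions made for the example. The DE mutation and crossover operators and the reordering procedure are not specified in the abstract, so they are not reproduced here.

```python
from typing import List, Sequence

Point = Sequence[float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(points: List[Point], centers: List[Point]) -> float:
    """Sum of squared errors: each point contributes its squared
    distance to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def kmeans_refine(points: List[Point], centers: List[Point],
                  max_iter: int = 100) -> List[List[float]]:
    """Fine-tune a candidate set of centers with standard k-means steps
    (assign points to nearest center, then recompute centroids)."""
    centers = [list(c) for c in centers]
    for _ in range(max_iter):
        labels = [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(m[d] for m in members) / len(members)
                                    for d in range(len(old))])
            else:
                new_centers.append(old)  # empty cluster: keep the old center
        if new_centers == centers:  # assignments no longer change anything
            break
        centers = new_centers
    return centers


# Toy data: a hypothetical candidate, as a DE step might produce it.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
candidate = [(1.0, 1.0), (8.0, 8.0)]
refined = kmeans_refine(points, candidate)
```

In an evolutionary loop, each offspring would pass through a refinement like `kmeans_refine` before its SSE is used as the fitness value. How DE-kM schedules this exactly is not stated in the abstract.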
Data mining is a powerful technology for extracting hidden information from data warehouses. It analyzes data from different perspectives and finds useful patterns and knowledge in large volumes of raw data. Clustering is one of the main methods of data mining, and the k-means algorithm is one of the most common clustering algorithms due to its efficiency and ease of use. One of the challenges of clustering is to identify an appropriate label for each cluster. A label is selected so as to provide a proper description of the records in the cluster. In some cases, choosing an appropriate label is not easy because of the results and structure of each cluster. The aim of this study is to present an algorithm based on k-means clustering that facilitates the allocation of labels to clusters. (C) 2020 Sharif University of Technology. All rights reserved.
To overcome the poor invulnerability and low communication efficiency that arise when network communication instability is analyzed with current methods, this paper proposes a modeling method for network communication instability based on the k-means algorithm. The network element nodes are generated using a clustering approach, and the initial communication topology is constructed. The k-means algorithm is then used to optimize the initial communication model, a comprehensive mathematical model of network communication is built, and the model is solved to optimize the communication model. The network efficiency function is used to further quantify the network invulnerability and to find the most vulnerable nodes in the network, which are then strengthened to achieve efficient control of network invulnerability. The experimental results show that the model has strong invulnerability, up to 99.9%, high communication efficiency and coverage, and a maximum communication delay of only 0.35 s. It is a feasible network communication model.
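The abstract does not define its network efficiency function. The sketch below assumes the commonly used global efficiency (the average of inverse shortest-path lengths over all ordered node pairs) and ranks nodes by how much efficiency drops when each one is removed. The graph representation, the function names and the star-shaped example are all assumptions made for illustration, and the paper's actual model may differ.

```python
from collections import deque
from typing import Dict, Hashable, List, Tuple

Graph = Dict[Hashable, List[Hashable]]  # undirected, unweighted adjacency lists


def bfs_distances(adj: Graph, source: Hashable) -> Dict[Hashable, int]:
    """Hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def global_efficiency(adj: Graph) -> float:
    """Mean of 1/d(i, j) over ordered pairs i != j; unreachable pairs add 0."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += 1.0 / d
    return total / (n * (n - 1))


def remove_node(adj: Graph, node: Hashable) -> Graph:
    """Copy of the graph without the node and its incident edges."""
    return {u: [v for v in nbrs if v != node]
            for u, nbrs in adj.items() if u != node}


def most_vulnerable_node(adj: Graph) -> Tuple[Hashable, Dict[Hashable, float]]:
    """Node whose removal causes the largest efficiency drop."""
    base = global_efficiency(adj)
    drops = {u: base - global_efficiency(remove_node(adj, u)) for u in adj}
    return max(drops, key=drops.get), drops


# Hypothetical topology: a hub (node 0) connected to three leaves.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
worst, drops = most_vulnerable_node(star)
```

In this toy star topology the hub is the single point of failure, so it is the node such a ranking would single out for strengthening.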
With the growth of the IoT and artificial intelligence, the amount of information generated on websites is increasing. The rise of the Hadoop distributed cloud computing platform (HDCCP) makes it possible to use multiple computing nodes for parallel computing to solve the performance problems of traditional serial algorithms. The purpose of this paper is to study data design based on cloud computing and an improved k-means algorithm (kMA). The paper examines the Hadoop distributed cloud computing platform, clustering algorithms and related technologies in depth, and designs and implements a cluster analysis system (CAS) based on the HDCCP. Through an in-depth analysis of the problems of the kMA, an improved scheme based on the HDCCP is designed. The experimental environment was configured and the cluster analysis system implemented. Finally, the improved kMA was tested experimentally in four respects: convergence speed, speedup ratio, initialization sampling rate, and accuracy. The experimental results show that the CAS based on the HDCCP designed in this paper can provide efficient and configurable cluster analysis services. The accuracy achieved in this paper is 90.7%.
In this paper, a weight selection procedure for the W-k-means algorithm is proposed from the viewpoint of statistical variation. This approach addresses the problem that the clustering quality of the W-k-means algorithm is greatly affected by the initial values of the weights. Based on statistics computed from the data, the weights are designed to give the W-k-means algorithm more information about the characteristics of the data and thereby improve its precision. The corresponding computational complexity is analyzed as well. We compare the clustering results of the W-k-means algorithm under different initialization methods. Results from color image segmentation illustrate that the proposed procedure produces better segmentation than random initialization according to Liu and Yang's (1994) evaluation function. (C) 2011 Elsevier Ltd. All rights reserved.
The k-means algorithm and its variations are known to be fast clustering algorithms. However, they are sensitive to the choice of starting points and are inefficient for solving clustering problems in large datasets. Recently, incremental approaches have been developed to resolve difficulties with the choice of starting points. The global k-means and the modified global k-means algorithms are based on such an approach. They iteratively add one cluster center at a time. Numerical experiments show that these algorithms considerably improve the k-means algorithm. However, they require storing the whole affinity matrix or computing this matrix at each iteration. This makes both algorithms time consuming and memory demanding for clustering even moderately large datasets. In this paper, a new version of the modified global k-means algorithm is proposed. We introduce an auxiliary cluster function to generate a set of starting points lying in different parts of the dataset. We exploit information gathered in previous iterations of the incremental algorithm to eliminate the need of computing or storing the whole affinity matrix and thereby to reduce computational effort and memory usage. Results of numerical experiments on six standard datasets demonstrate that the new algorithm is more efficient than the global and the modified global k-means algorithms. (C) 2010 Elsevier Ltd. All rights reserved.
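The incremental idea mentioned in this abstract, adding one cluster center at a time, can be sketched with the basic global k-means scheme: start from the centroid of the whole dataset, then repeatedly try every data point as the next center and keep the trial with the lowest SSE. This is only the baseline scheme. The auxiliary cluster function and the affinity-matrix savings proposed in the paper are not given in the abstract, so they are not shown. All names and the toy data are assumptions.

```python
from typing import List, Sequence

Point = Sequence[float]


def sq_dist(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(points: List[Point], centers: List[Point]) -> float:
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def kmeans(points: List[Point], centers: List[Point],
           max_iter: int = 100) -> List[List[float]]:
    """Standard k-means iterations from the given starting centers."""
    centers = [list(c) for c in centers]
    for _ in range(max_iter):
        labels = [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centers.append(
                [sum(m[d] for m in members) / len(members) for d in range(len(old))]
                if members else old)
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def global_kmeans(points: List[Point], k: int) -> List[List[float]]:
    """Baseline incremental scheme: grow the solution one center at a time,
    trying every data point as the starting position of the new center."""
    dim = len(points[0])
    centers = [[sum(p[d] for p in points) / len(points) for d in range(dim)]]
    for _ in range(1, k):
        best_err, best_centers = None, None
        for x in points:
            trial = kmeans(points, centers + [list(x)])
            err = sse(points, trial)
            if best_err is None or err < best_err:
                best_err, best_centers = err, trial
        centers = best_centers
    return centers


# Toy data with three well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0), (10.0, 1.0)]
centers = global_kmeans(points, 3)
```

The inner loop runs one full k-means per data point for every added center, which makes the cost of the baseline visible and suggests why cheaper ways of choosing starting points are attractive on larger datasets.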
Block truncation coding (BTC) is an efficient image compression method which finds application in diverse fields. The basic problem can be viewed as a two-level quantization process. However, efficient ways of determining the optimal threshold have not been discovered so far. We propose a fast BTC algorithm based on a truncated k-means algorithm, utilizing the image inter-block correlation and the fact that the k-means algorithm converges very fast in this case. This produces a near-optimum solution with significantly improved speed over other methods. Simulation results confirm these advantages. (C) 2010 Elsevier GmbH. All rights reserved.
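To make the two-level quantization view concrete, here is a minimal sketch that encodes one image block with a 2-means loop capped at a few iterations. Two points are assumptions made for the example and are not confirmed by the abstract: that "truncated" means a small iteration cap, and that inter-block correlation is exploited by passing the previous block's threshold as `init_threshold`. The function names are illustrative.

```python
from typing import List, Optional, Tuple


def group_means(block: List[float], threshold: float) -> Tuple[float, float]:
    """Means of the pixels below and at-or-above the threshold.
    Callers must ensure both groups are non-empty."""
    lo = [p for p in block if p < threshold]
    hi = [p for p in block if p >= threshold]
    return sum(lo) / len(lo), sum(hi) / len(hi)


def btc_encode_block(block: List[float],
                     init_threshold: Optional[float] = None,
                     max_iter: int = 3) -> Tuple[List[int], float, float, float]:
    """Two-level quantization of a flattened block.
    Returns (bitmap, low level, high level, final threshold)."""
    mean = sum(block) / len(block)
    if min(block) == max(block):  # flat block: a single level suffices
        return [1] * len(block), mean, mean, mean
    threshold = mean if init_threshold is None else init_threshold
    # A starting threshold outside the block's range would empty one group.
    if not (min(block) < threshold <= max(block)):
        threshold = mean
    for _ in range(max_iter):  # capped 2-means iterations
        low, high = group_means(block, threshold)
        new_threshold = (low + high) / 2
        if new_threshold == threshold:
            break
        threshold = new_threshold
    low, high = group_means(block, threshold)
    bitmap = [1 if p >= threshold else 0 for p in block]
    return bitmap, low, high, threshold


def btc_decode_block(bitmap: List[int], low: float, high: float) -> List[float]:
    """Reconstruct the block from the bitmap and the two levels."""
    return [high if b else low for b in bitmap]


block = [10, 12, 11, 200, 205, 198, 9, 202]
bitmap, low, high, threshold = btc_encode_block(block)
# Reuse the threshold as the starting point for a (hypothetical) neighboring block.
next_bitmap, _, _, _ = btc_encode_block([11, 13, 10, 199, 204, 197, 8, 203],
                                        init_threshold=threshold)
```

Each block is stored as one bit per pixel plus two levels. A good starting threshold mainly reduces how many iterations are needed before the partition stops changing.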
Recently, a modified k-means algorithm for vector quantization design has been proposed in which the codevector updating step is as follows: new codevector = current codevector + scale factor × (new centroid − current codevector). This algorithm uses a fixed value for the scale factor. In this paper, we propose the use of a variable scale factor which is a function of the iteration number. For the vector quantization of image data, we show that it offers faster convergence than the modified k-means algorithm with a fixed scale factor, without affecting the optimality of the codebook.
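The update rule quoted in this abstract translates directly into code. The sketch below implements it, together with a purely hypothetical iteration-dependent schedule: the abstract says only that the scale factor is a function of the iteration number, so the form of `scale_schedule` and its parameters (`start`, `decay`) are assumptions made for illustration.

```python
from typing import List, Sequence


def update_codevector(current: Sequence[float], centroid: Sequence[float],
                      scale: float) -> List[float]:
    """new codevector = current + scale * (new centroid - current).
    With scale == 1 this is the ordinary k-means centroid update."""
    return [c + scale * (m - c) for c, m in zip(current, centroid)]


def scale_schedule(iteration: int, start: float = 1.8, decay: float = 0.5) -> float:
    """Hypothetical schedule: begins at `start` and decays geometrically
    toward 1, so later iterations approach the plain centroid update."""
    return 1.0 + (start - 1.0) * decay ** iteration


current = [0.0, 0.0]
centroid = [1.0, 2.0]
plain = update_codevector(current, centroid, 1.0)    # standard update
overshoot = update_codevector(current, centroid, 1.5)  # steps past the centroid
scales = [scale_schedule(i) for i in range(4)]
```

A scale factor above 1 moves the codevector past the new centroid, which is one way such an update can speed up early iterations. Letting the factor approach 1 later makes the final iterations behave like ordinary k-means.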
Conventional algorithms fail to obtain satisfactory background segmentation results for underwater images. In this study, an improved k-means algorithm was developed for underwater image background segmentation to address the issue of improper k value determination and to minimize the impact of the initial centroid position of the grayscale image during the gray level quantization of the conventional k-means algorithm. The k value and the initial centroid position of the grayscale image were optimized. A total of 100 underwater images taken by an underwater robot were sampled to test the algorithm with respect to background segmentation validity and time cost. The results were compared with those of three existing algorithms: the conventional k-means algorithm, the improved Otsu algorithm, and the Canny operator edge extraction method. The experimental results showed that the improved k-means underwater background segmentation algorithm could effectively segment the background of underwater images with a low color cast, low contrast, and blurred edges. Although its time cost was higher than that of the other three algorithms, it nonetheless proved more efficient than the time-consuming manual segmentation method. The algorithm proposed in this paper could potentially be used in underwater environments for underwater background segmentation.
To overcome the low accuracy of the traditional method, a fast identification method for forged fingerprints based on an improved k-means algorithm is proposed. A spatial grid block model is constructed to extract the fingerprint texture features, and the fingerprint profile features are then detected using the edge outline extraction method. The Kalman fusion method is used to reconstruct fingerprint information. Using the neighbourhood distributed retrieval method, fingerprint image feature fusion is realised and the texture feature extraction model for forged fingerprints is established. The k-means clustering method is used for fingerprint feature clustering to realise fast identification of forged fingerprints. Experimental results show that the identification accuracy of this method is higher than 0.85, and the identification stability is good. The signal-to-noise ratio of fingerprint images is always between 25.3 dB and 82.3 dB, and the imaging quality is high, indicating that this method can realise fast and accurate identification of forged fingerprints.