The rough k-means clustering algorithm handles data with uncertain cluster boundaries well. However, it is sensitive to the choice of initial cluster centres, and its use of fixed weights and thresholds leads to unstable clustering results and reduced accuracy. To address these problems, the original algorithm is combined with the firefly algorithm and improved in three respects. First, a more reasonable method is designed for dynamically adjusting the weights of the lower approximation and boundary sets, based on the ratio of the number of objects in the dataset to the product of the differences of the objects in the dataset. Second, a method is given for adaptively setting the threshold as a function of the number of iterations. Third, a new objective function is constructed and its value is taken as the firefly brightness, which drives the search for and iterative update of the initial cluster centres; the best solution found by the fireflies in each iteration is taken as the initial centre position for the algorithm. Experimental results show that the new algorithm improves clustering quality.
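For orientation, the baseline that the abstract sets out to improve can be sketched as follows. This is a minimal sketch of a conventional rough k-means iteration with fixed weights and a fixed threshold, not the improved algorithm: the abstract does not give the dynamic weight formula, the adaptive threshold, or the firefly objective function, so none of them are reproduced here. The function name `rough_kmeans`, the parameters `w_lower`, `w_boundary` and `epsilon`, and the difference-based threshold test are assumptions made for illustration.

```python
import numpy as np

def rough_kmeans(X, centres, w_lower=0.7, w_boundary=0.3, epsilon=1.0, n_iter=20):
    """Baseline rough k-means with FIXED weights and threshold (illustrative only)."""
    X = np.asarray(X, dtype=float)
    centres = np.asarray(centres, dtype=float).copy()
    k = len(centres)
    for _ in range(n_iter):
        # Distance from every object to every centre, shape (n, k).
        dist = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        nearest = dist.argmin(axis=1)
        d_min = dist.min(axis=1, keepdims=True)
        # close[i, j]: centre j is within epsilon of object i's nearest distance.
        close = (dist - d_min) <= epsilon
        # An object close to more than one centre belongs to boundary sets only.
        ambiguous = close.sum(axis=1) > 1
        new_centres = centres.copy()
        for j in range(k):
            lower = X[(nearest == j) & ~ambiguous]
            boundary = X[close[:, j] & ambiguous]
            if len(lower) and len(boundary):
                # Weighted mix of lower-approximation mean and boundary mean.
                new_centres[j] = (w_lower * lower.mean(axis=0)
                                  + w_boundary * boundary.mean(axis=0))
            elif len(lower):
                new_centres[j] = lower.mean(axis=0)
            elif len(boundary):
                new_centres[j] = boundary.mean(axis=0)
        converged = np.allclose(new_centres, centres)
        centres = new_centres
        if converged:
            break
    # Memberships come from the last assignment pass.
    lower_sets = [np.flatnonzero((nearest == j) & ~ambiguous) for j in range(k)]
    boundary_sets = [np.flatnonzero(close[:, j] & ambiguous) for j in range(k)]
    return centres, lower_sets, boundary_sets
```

The three fixed inputs here (`centres`, the weight pair, and `epsilon`) correspond to the three limitations the abstract names, which is where its dynamic weights, adaptive threshold, and firefly-based initialisation would plug in.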
ISBN: 9783642239816 (print)
The Internet is a rich potential information base, and finding interesting information in it requires scientific and effective methods. Researchers have proposed many web clustering algorithms, but applying a single simple clustering algorithm takes too much time because the volume of web information is huge. Considering both the efficiency and the quality of clustering, in this paper we use a two-layer web clustering approach to cluster web access patterns extracted from web logs. In the first layer, an LVQ (Learning Vector Quantization) neural network groups the web access patterns around several representative cluster centres. In the second layer, the rough k-means algorithm processes the output of the first layer to produce the final classification. The experimental results show that the two-layer approach achieves clustering quality close to that of single-layer rough k-means while being more efficient.
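The first layer's role, compressing many access patterns into a few representative centres, can be illustrated with a small sketch. The abstract does not describe how the LVQ network is configured, and classic LVQ is usually trained with labelled prototypes, so the code below swaps in plain unsupervised winner-take-all competitive learning as a stand-in. The function name `competitive_prototypes` and its parameters (`lr`, `n_epochs`, `seed`) are hypothetical, and the patterns are assumed to be already encoded as numeric vectors.

```python
import numpy as np

def competitive_prototypes(X, n_prototypes, lr=0.1, n_epochs=20, seed=0):
    """Stand-in for the first layer: winner-take-all vector quantisation."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from distinct, randomly chosen data points.
    protos = X[rng.choice(len(X), size=n_prototypes, replace=False)].copy()
    for epoch in range(n_epochs):
        # Linearly decaying learning rate.
        rate = lr * (1 - epoch / n_epochs)
        for i in rng.permutation(len(X)):
            # Only the nearest prototype (the winner) moves towards the sample.
            winner = np.linalg.norm(protos - X[i], axis=1).argmin()
            protos[winner] += rate * (X[i] - protos[winner])
    return protos
```

In a two-layer arrangement, the second layer would then run rough k-means on the returned prototypes rather than on the raw patterns, which is the source of the efficiency gain the abstract reports: the more expensive algorithm sees far fewer points.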