ISBN: (print) 9781538608401
Clustering of stream data, one of the core stream data mining techniques, is widely applied in network traffic analysis, telecommunications, planetary remote sensing, web site analysis, and other areas. Stream clustering places high demands on real-time processing, but current stream clustering algorithms, such as CluStream and D-Stream, are all sequential and cannot meet the real-time requirement. In this paper, we propose a multi-grid based clustering algorithm for stream data. Building on the conventional grid-based DBSCAN clustering algorithm, the proposed algorithm partitions the grid space so that the search for neighbouring grids is confined to a limited scope, which speeds up processing. In addition, we use CUDA to parallelize the computation and accelerate it further. Experiments on the public KDDCUP-99 dataset show that the proposed algorithm is 10 times faster than the conventional grid-based algorithm, and that the CUDA-based version achieves a further speedup of 3x over the same algorithm executed on the CPU.
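The idea of limiting the neighbour search in a grid-based density clustering can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper: it is a plain sequential, single-grid version, assuming low-dimensional points, and the names `grid_cluster`, `cell_size`, and `min_pts` are hypothetical. Each point is hashed to a cell, cells holding at least `min_pts` points are treated as dense, and adjacent dense cells are merged into one cluster by looking up only the directly adjacent cells rather than scanning all cells.

```python
from collections import defaultdict, deque
from itertools import product


def grid_cluster(points, cell_size, min_pts):
    """Label dense grid cells with cluster ids; returns {cell: cluster_id}.

    Illustrative sketch only (hypothetical names, not the paper's method).
    """
    # Count how many points fall into each cell (cell = integer coordinates).
    counts = defaultdict(int)
    for p in points:
        cell = tuple(int(x // cell_size) for x in p)
        counts[cell] += 1

    # A cell is "dense" if it holds at least min_pts points.
    dense = {c for c, n in counts.items() if n >= min_pts}
    dim = len(next(iter(dense))) if dense else 0

    # Offsets to the 3^dim - 1 adjacent cells; this bounds the neighbour
    # search per cell, but is only practical for small dim.
    offsets = [o for o in product((-1, 0, 1), repeat=dim) if any(o)]

    labels = {}
    next_id = 0
    for start in sorted(dense):
        if start in labels:
            continue
        # Breadth-first expansion over adjacent dense cells.
        labels[start] = next_id
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for off in offsets:
                nb = tuple(c + o for c, o in zip(cell, off))
                if nb in dense and nb not in labels:
                    labels[nb] = next_id
                    queue.append(nb)
        next_id += 1
    return labels
```

Because the dense cells are kept in a hash set, each neighbour lookup takes constant expected time, so the cost per cell depends on the number of adjacent offsets rather than on the total number of cells. A parallel variant would typically assign cells or points to separate threads, which is the kind of work the abstract delegates to CUDA.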