ISBN: 9781424435197 (print)
As a density-based clustering algorithm, DBSCAN plays an important role in data mining. However, DBSCAN is computationally expensive, which limits its performance on large-scale data sets, especially high-dimensional ones. The high complexity stems from region queries, a very common operation in density-based algorithms, which bring the complexity of the algorithm to O(n^2), where n is the number of database objects. With the help of an index structure the complexity can be reduced to O(n log n); however, building the index structure is itself inefficient, especially for high-dimensional data sets or large-scale databases. In this paper we propose a new concept named memory effect (ME). ME can be used to shrink the scope of region queries to neighboring objects. Based on ME, we have clearly improved the DBSCAN algorithm, and empirical experiments show the improvement in both effectiveness and efficiency. Finally, we give a theoretical analysis of the MEDBSCAN algorithm and discuss the influence of its parameters.
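To make the quadratic cost concrete, the following is a minimal sketch of plain DBSCAN with brute-force region queries, written for illustration only. It is not the memory-effect variant proposed in the paper, whose details the abstract does not give. The names `dbscan`, `region_query`, `eps`, and `min_pts` are assumptions chosen for this example. Each object triggers one region query, and each unindexed query scans all n objects, so the sketch performs n * n distance computations.

```python
import math

def dbscan(points, eps, min_pts):
    """Plain DBSCAN with brute-force region queries (illustrative sketch).

    Returns (labels, distance_count); label -1 marks noise.
    """
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    distance_count = 0

    def region_query(i):
        # Without an index, every query scans all n objects: O(n) per query.
        nonlocal distance_count
        distance_count += len(points)
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(i)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        seeds = list(neighbors)
        k = 0
        while k < len(seeds):
            j = seeds[k]
            k += 1
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins cluster, not expanded
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(j)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is a core point: keep expanding
    return labels, distance_count

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels, distance_count = dbscan(points, eps=1.5, min_pts=3)
print(labels, distance_count)
```

Because every object is queried exactly once, `distance_count` equals n * n for this sketch. Restricting each query to a small set of neighboring objects, which is what the abstract says ME is used for, would reduce the per-query scan, though how that set is maintained is not specified here.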