Author affiliations: Siemens AG, D-81730 Munich, Germany; Univ Munich, Inst Comp Sci, D-80538 Munich, Germany
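Publication: DATA MINING AND KNOWLEDGE DISCOVERY
Year, volume, issue: 1999, Vol. 3, No. 3
Pages: 263-290
Indexed in:
Subject classification: 08 [Engineering]; 0812 [Engineering: Computer Science and Technology (engineering or science degrees may be awarded)]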
Keywords: clustering algorithms; parallel algorithms; distributed algorithms; scalable data mining; distributed index structures; spatial databases
Abstract: The clustering algorithm DBSCAN relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape as well as to distinguish noise. In this paper, we present PDBSCAN, a parallel version of this algorithm. We use the shared-nothing architecture with multiple computers interconnected through a network. A fundamental component of a shared-nothing system is its distributed data structure. We introduce the dR*-tree, a distributed spatial index structure in which the data is spread among multiple computers and the indexes of the data are replicated on every computer. We implemented our method using a number of workstations connected via Ethernet (10 Mbit). A performance evaluation shows that PDBSCAN offers nearly linear speedup and has excellent scaleup and sizeup behavior.
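Illustrative note (not part of the catalog record or the paper): the density-based notion of clusters mentioned in the abstract can be sketched in a few lines of sequential Python. This is a minimal sketch of plain DBSCAN only, not of PDBSCAN or the dR*-tree. The names `dbscan`, `region_query`, `eps`, and `min_pts` are assumptions chosen for this example, and the neighborhood search is a brute-force scan rather than a spatial index.

```python
from collections import deque
from math import dist

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i], including i.

    Brute-force O(n) scan; an index structure would replace this in practice.
    """
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        # points[i] is a core point: grow a new cluster from it.
        labels[i] = cluster_id
        queue = deque(neighbors)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
        cluster_id += 1
    return labels


# Two small dense groups and one isolated point (made-up data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
# labels == [0, 0, 0, 1, 1, 1, -1]
```

In this sketch almost all of the work is in `region_query`, which is the step a spatial index is meant to accelerate.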