咨询与建议

看过本文的还看了

相关文献

该作者的其他文献

文献详情 >Dynamic data summarization for... 收藏
arXiv

Dynamic data summarization for hierarchical spatial clustering

作     者:Abduaziz, Kayumov Kim, Min Sik Shin, Ji Sun 

作者机构:Department of Computer and Information Security and Convergence Engineering for Intelligent Drones Sejong University Korea Republic of Petabi Inc. Department of Computer and Information Security Sejong University Korea Republic of 

出 版 物:《arXiv》 (arXiv)

年 卷 期:2024年

核心收录:

主  题:Hierarchical clustering 

摘      要:Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) finds meaningful patterns in spatial data by considering density and spatial proximity. As the clustering algorithm is inherently designed for static applications, so have recent studies focused on accelerating the algorithm for static applications using approximate or parallel methods. However, much less attention has been given to dynamic environments, where even a single point insertion or deletion can require recomputing the clustering hierarchy from scratch due to the need of maintaining the minimum spanning tree (MST) over a complete graph. This paper addresses the challenge of enhancing the clustering algorithm for dynamic data. We present an exact algorithm that maintains density information and updates the clustering hierarchy of HDBSCAN during point insertions and deletions. Considering the hardness of adapting the exact algorithm to dynamic data involving modern workloads, we propose an online-offline framework. The online component efficiently summarizes dynamic data using a tree structure, called Bubble-tree, while the offline step performs the static clustering. Experimental results demonstrate that the data summarization adapts well to fully dynamic environments, providing compression quality on par with existing techniques while significantly improving runtime performance of the clustering algorithm in dynamic data workloads. © 2024, CC BY-NC-ND.

读者评论 与其他读者分享你的观点

用户名:未登录
我的评分