Search Results - Inner Mongolia University Library

A clustering algorithm for stream data with LDA-based unsupervised localized dimension reduction

INFORMATION SCIENCES, 2017, vol. 381, pp. 104-123

Authors: Laohakiat, Sirisup; Phimoltares, Suphakant; Lursinsap, Chidchanok (Chulalongkorn Univ, Dept Math & Comp Sci, Bangkok, Thailand)

We present an algorithm for clustering high-dimensional streaming data. The algorithm incorporates dimension reduction into the stream clustering framework. When a new datum arrives, the algorithm performs dimension reduction to find a local projected subspace using an unsupervised method based on LDA (Linear Discriminant Analysis). The resulting local subspace maximally separates the nearby micro-clusters with respect to the incoming point. The incoming point is then assigned to a micro-cluster in the projected space rather than in the full-dimensional space. The experimental results show that the proposed algorithm outperforms its counterpart streaming clustering algorithms. Moreover, when compared with traditional clustering algorithms that require the whole data set, the proposed algorithm shows comparable clustering performance with much less computation time for large data sets. (C) 2016 Elsevier Inc. All rights reserved.

Keywords: stream data clustering Dimension reduction Linear discriminant analysis clustering algorithm Linear discriminant analysis subspace
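As a rough illustration of the projection step described in the abstract above, the following Python sketch computes a two-cluster Fisher discriminant direction from micro-cluster means and covariances, then assigns an incoming point in the projected one-dimensional space. It is not the authors' implementation: the function names, the `ridge` regularizer, and the restriction to two micro-clusters with distinct means are assumptions made for brevity.

```python
import numpy as np


def fisher_direction(mean_a, cov_a, mean_b, cov_b, ridge=1e-6):
    """Unit vector maximizing the Fisher criterion between two clusters.

    Assumes the two means differ; `ridge` is a hypothetical regularizer
    that keeps the within-cluster scatter matrix invertible.
    """
    mean_a = np.asarray(mean_a, dtype=float)
    mean_b = np.asarray(mean_b, dtype=float)
    within = np.asarray(cov_a, dtype=float) + np.asarray(cov_b, dtype=float)
    within = within + ridge * np.eye(mean_a.shape[0])
    # Classic two-class LDA solution: S_w^{-1} (mean_a - mean_b).
    direction = np.linalg.solve(within, mean_a - mean_b)
    return direction / np.linalg.norm(direction)


def assign_projected(point, mean_a, cov_a, mean_b, cov_b):
    """Return 0 if the point is nearer cluster a in the projection, else 1."""
    w = fisher_direction(mean_a, cov_a, mean_b, cov_b)
    z = np.dot(w, np.asarray(point, dtype=float))
    dist_a = abs(z - np.dot(w, mean_a))
    dist_b = abs(z - np.dot(w, mean_b))
    return 0 if dist_a <= dist_b else 1
```

With identity covariances and means at (0, 0) and (4, 0), the direction lies along the first axis, so the second coordinate of an incoming point has no influence on its assignment.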

Incremental clustering of dynamic data streams using connectivity based representative points

DATA & KNOWLEDGE ENGINEERING, 2009, vol. 68, no. 1, pp. 1-27

Authors: Luehr, Sebastian; Lazarescu, Mihai (Curtin Univ Technol, Dept Comp, Bentley, WA 6102, Australia)

We present an incremental graph-based clustering algorithm whose design was motivated by a need to extract and retain meaningful information from data streams produced by applications such as large scale surveillance, network packet inspection and financial transaction monitoring. To this end, the method we propose utilises representative points to both incrementally cluster new data and to selectively retain important cluster information within a knowledge repository. The repository can then be subsequently used to assist in the processing of new data, the archival of critical features for off-line analysis, and in the identification of recurrent patterns. Crown Copyright (C) 2008 Published by Elsevier B.V. All rights reserved.

Keywords: data mining Incremental graph-based clustering stream data clustering Recurrent change Knowledge acquisition

An incremental density-based clustering framework using fuzzy local clustering

INFORMATION SCIENCES, 2021, vol. 547, pp. 404-426

Authors: Laohakiat, Sirisup; Sa-ing, Vera (Srinakharinwirot Univ, Fac Sci, Dept Comp Sci, Bangkok, Thailand)

This paper presents a novel incremental density-based clustering framework using the one-pass scheme, named Fuzzy Incremental Density-based Clustering (FIDC). By employing one-pass clustering, in which each data point is processed once and then discarded, FIDC can process large datasets with less computation time and memory than its density-based clustering counterparts. Fuzzy local clustering is employed in the local cluster assignment process to reduce the clustering inconsistencies that arise from one-pass clustering. To improve clustering performance and simplify parameter selection, a modified valley-seeking algorithm is used to adaptively determine the outlier thresholds for generating the final clusters. FIDC can operate in both traditional and stream data clustering modes. The experimental results show that FIDC outperforms state-of-the-art algorithms in both clustering modes. (C) 2020 Elsevier Inc. All rights reserved.

Keywords: Incremental clustering Density-based clustering Fuzzy clustering stream data clustering

Incremental Cluster Updating Using Gaussian Mixture Model

28th Canadian Conference on Artificial Intelligence (Canadian AI)

Authors: Bigdeli, Elnaz; Mohammadi, Mandi; Raahemi, Bijan; Matwin, Stan (Univ Ottawa, Sch Elect Engn & Comp Sci, Ottawa, ON, Canada; Univ Ottawa, Telfer Sch Management, Knowledge Discovery & Data Min Lab, Ottawa, ON K1H 8M5, Canada; Dalhousie Univ, Dept Comp, Halifax, NS, Canada; Polish Acad Sci, Inst Comp Sci, PL-00901 Warsaw, Poland)

ISBN: (print) 9783319183565; 9783319183558

In this paper, we present a new approach for updating clusters incrementally. The proposed incremental approach preserves comprehensive statistical information about the clusters in the form of Gaussian Mixture Models (GMMs). As each GMM needs the number of Gaussian components as an input parameter, we propose a method that determines the number of components automatically by introducing the concept of core points. In the updating phase, instead of processing each new sample individually, we collect the new incoming samples and cluster them. By employing the concepts of core points and GMMs, we build a number of GMMs for the new samples and label the new GMMs based on their similarity to the existing GMMs. To measure the similarity among GMMs, we introduce a modified version of the Kullback-Leibler divergence as a distance function. For merging the current GMMs and the new GMMs, we propose a new merging mechanism in which the closest components in both GMMs are merged to create a new GMM. Since the GMM structure is a compact representation of clusters, there is no increase in time in either the clustering phase or the updating phase. We measured the accuracy of the clusters using different clustering validity metrics (DB, Dunn, SD and purity), and the results show that our algorithm outperforms other incremental clustering algorithms in terms of the quality of the final clusters.

Keywords: Incremental clustering Gaussian Mixture Model stream data clustering
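The abstract above does not specify the modified Kullback-Leibler distance or the exact merging rule, so the following Python sketch substitutes two standard techniques as stand-ins: the closed-form KL divergence between two Gaussian components (symmetrized by summing both directions) and a moment-matching merge of two weighted components. All function names are hypothetical, and none of this should be read as the authors' method.

```python
import numpy as np


def gaussian_kl(mean_p, cov_p, mean_q, cov_q):
    """Closed-form KL(p || q) between two multivariate Gaussians."""
    mean_p = np.atleast_1d(np.asarray(mean_p, dtype=float))
    mean_q = np.atleast_1d(np.asarray(mean_q, dtype=float))
    cov_p = np.atleast_2d(np.asarray(cov_p, dtype=float))
    cov_q = np.atleast_2d(np.asarray(cov_q, dtype=float))
    k = mean_p.shape[0]
    diff = mean_q - mean_p
    cov_q_inv = np.linalg.inv(cov_q)
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (np.trace(cov_q_inv @ cov_p) + diff @ cov_q_inv @ diff
                  - k + logdet_q - logdet_p)


def symmetric_kl(mean_p, cov_p, mean_q, cov_q):
    """Symmetrized KL: an assumed stand-in for the paper's modified distance."""
    return (gaussian_kl(mean_p, cov_p, mean_q, cov_q)
            + gaussian_kl(mean_q, cov_q, mean_p, cov_p))


def merge_components(w1, mean1, cov1, w2, mean2, cov2):
    """Moment-matching merge of two weighted Gaussian components."""
    mean1 = np.atleast_1d(np.asarray(mean1, dtype=float))
    mean2 = np.atleast_1d(np.asarray(mean2, dtype=float))
    cov1 = np.atleast_2d(np.asarray(cov1, dtype=float))
    cov2 = np.atleast_2d(np.asarray(cov2, dtype=float))
    w = w1 + w2
    mean = (w1 * mean1 + w2 * mean2) / w
    d1, d2 = mean1 - mean, mean2 - mean
    # Merged covariance keeps the first two moments of the two-part mixture.
    cov = (w1 * (cov1 + np.outer(d1, d1)) + w2 * (cov2 + np.outer(d2, d2))) / w
    return w, mean, cov
```

In this sketch, the pair of components with the smallest `symmetric_kl` value would be treated as the "closest" and passed to `merge_components`.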

A Streaming Data Clustering Method Based on Dual Strategies Improved DENCLUE

IEEE ACCESS, 2024, vol. 12, pp. 153709-153726

Authors: Cai, Ting; Lv, Jiazhi; Ye, Zhiwei; Li, Xiang; Zhou, Wen; Kochan, Orest (Hubei Univ Technol, Sch Comp Sci, Wuhan 430068, Peoples R China; Lviv Polytech Natl Univ, Dept Measuring Informat Technol, UK-79013 Lvov, Ukraine)

Streaming data arrives continually and is characterized by high speed, large volume, dynamic evolution and instability. Unlike traditional static data clustering, streaming data clustering algorithms need to consider concept drift, outlier handling, and the identification and updating of dynamic clustering patterns. DENCLUE is one of the classical algorithms; it adopts nonparametric estimation and uses a finite number of samples to infer the distribution of the overall data. However, in the basic DENCLUE algorithm the Kernel Density Estimation (KDE) window width and the density threshold parameter are difficult to choose, so it cannot be directly applied to streaming data clustering. Therefore, in this paper, we propose a dual-strategy improved DENCLUE streaming data clustering method based on KDE optimization and two-stage clustering, which takes into account the concept drift problem in streaming data. First, a density threshold parameter optimization method based on KDE is proposed to address the difficulty of selecting the KDE window width and density threshold in the traditional DENCLUE algorithm. Second, a two-stage clustering and merging method is designed to improve the performance of traditional DENCLUE clustering. The experimental results show that our algorithm outperforms the traditional CluStream and DenStream algorithms on datasets with arbitrary shapes and sizes, and performs well on streaming data clustering.

Keywords: clustering algorithms streams Concept drift clustering methods Partitioning algorithms Kernel Heuristic algorithms Estimation Inference algorithms DENCLUE stream data clustering kernel density estimation two-stage clustering
