Search results - Inner Mongolia University Library

A New Density Peak Clustering Algorithm With Adaptive Clustering Center Based on Differential Privacy

IEEE ACCESS, 2023, vol. 11, pp. 1418-1431

Authors: Chen, Hua; Zhou, Yuan; Mei, Kehui; Wang, Nan; Cai, Guangxing. Affiliation: Hubei Univ Technol, Sch Sci, Wuhan 430068, Peoples R China

A new density peak clustering (DPC) algorithm with adaptive clustering centers based on differential privacy is proposed to solve three problems in clustering analysis: poor adaptability to high-dimensional data, the inability to determine clustering centers automatically, and privacy problems. First, to improve adaptability to high-dimensional data, cosine distance is used to measure the similarity between high-dimensional data. Then, to address the subjectivity of clustering center selection, the weight (i - 1)/i is introduced from the perspective of the ranking graph, and the slope trend of the ranking graph is redefined so that clustering centers are selected adaptively. Finally, to address the privacy problem, Laplacian noise with an appropriate privacy budget is added to the core statistic of the algorithm (the local density) to balance privacy protection against algorithm effectiveness. Experimental results on both synthetic and UCI datasets show that the algorithm not only selects clustering centers automatically but also addresses the privacy problem in clustering analysis and greatly improves the clustering evaluation indices, which demonstrates its effectiveness.

Keywords: Clustering algorithms; Partitioning algorithms; Privacy; Noise measurement; Laplace equations; Differential privacy; Statistical analysis; Cosine distance; differential privacy; DPC algorithm; Laplacian noise; trend of slope change
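
As a rough illustration of the first and third steps described in the abstract above, the Python sketch below computes pairwise cosine distances and adds Laplace noise to a cut-off-kernel local density. The function names, the cut-off kernel, and the sensitivity of 1 are assumptions made for illustration and are not taken from the paper. The (i - 1)/i slope weighting is left out because the abstract does not give enough detail to reproduce it.

```python
import numpy as np

def cosine_distance_matrix(X):
    """Pairwise cosine distance 1 - cos(x_i, x_j) between the rows of X.

    Assumes every row of X is a nonzero vector.
    """
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.clip(norms, 1e-12, None)
    D = np.clip(1.0 - unit @ unit.T, 0.0, 2.0)
    np.fill_diagonal(D, 0.0)  # remove rounding error on the diagonal
    return D

def noisy_local_density(D, d_c, epsilon, sensitivity=1.0, rng=None):
    """Cut-off local density plus Laplace noise of scale sensitivity / epsilon.

    sensitivity=1.0 is an illustrative assumption, not a value from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    rho = (D < d_c).sum(axis=1) - 1.0  # neighbours within d_c, self excluded
    return rho + rng.laplace(0.0, sensitivity / epsilon, size=rho.shape)
```

A smaller privacy budget `epsilon` gives a larger noise scale, so the densities are better protected but less accurate; this is the trade-off the abstract refers to.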

A fast density peaks clustering algorithm with sparse search

INFORMATION SCIENCES, 2021, vol. 554, pp. 61-83

Authors: Xu, Xiao; Ding, Shifei; Wang, Yanru; Wang, Lijuan; Jia, Weikuan. Affiliations: China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou 221116, Jiangsu, Peoples R China; Minist Educ Peoples Republ China, Mine Digitizat Engn Res Ctr, Xuzhou 221116, Jiangsu, Peoples R China; Shandong Normal Univ, Sch Informat Sci & Engn, Jinan 250358, Peoples R China

Given a large unlabeled set of complex data, how to group them into clusters efficiently and effectively remains a challenging problem. The density peaks clustering (DPC) algorithm is an emerging algorithm that identifies cluster centers based on a decision graph. DPC can effectively recognize the clusters without the number of cluster centers being set. However, the similarity between every two data points must be calculated to construct the decision graph, which results in high computational complexity. To overcome this issue, we propose a fast sparse search density peaks clustering (FSDPC) algorithm to enhance DPC, which constructs a decision graph with fewer similarity calculations to identify cluster centers quickly. In FSDPC, we design a novel sparse search strategy to measure the similarity between the nearest neighbors of each data point. Therefore, FSDPC can enhance the efficiency of DPC while maintaining satisfactory results. We also propose a novel random third-party data point method to search for the nearest neighbors, which introduces no additional parameters or high computational complexity. The experimental results on synthetic datasets and real-world datasets indicate that the proposed algorithm consistently outperforms DPC and other state-of-the-art algorithms. (C) 2020 Elsevier Inc. All rights reserved.

Keywords: DPC algorithm; Computational complexity; Sparse search strategy; Fewer distance calculations; Similarity matrix
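
For reference, the baseline DPC procedure that the abstract above sets out to speed up can be sketched in Python as follows. It computes every pairwise distance, which is the quadratic cost the abstract mentions. The cut-off kernel, the ranking by gamma = rho * delta in place of a visually inspected decision graph, and all names are illustrative assumptions. The sparse search and random third-party point strategies of the paper are not reproduced here.

```python
import numpy as np

def dpc(X, d_c, n_centers):
    """Baseline density peaks clustering with a cut-off kernel (sketch)."""
    n = len(X)
    # All pairwise Euclidean distances: the O(n^2) step.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    rho = (D < d_c).sum(axis=1) - 1          # local density, self excluded
    order = np.argsort(-rho, kind="stable")  # indices by decreasing density
    delta = np.zeros(n)                      # distance to nearest denser point
    parent = np.full(n, -1)                  # index of that denser point
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = D[i].max()            # convention for the densest point
            continue
        denser = order[:rank]                # ties in rho broken by this order
        j = denser[np.argmin(D[i, denser])]
        delta[i], parent[i] = D[i, j], j
    gamma = rho * delta
    # Rank in density order so the densest point always wins gamma ties.
    centers = order[np.argsort(-gamma[order], kind="stable")[:n_centers]]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_centers)
    for i in order:                          # parents are labelled before children
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, rho, delta
```

For example, on two well-separated groups of three points each, `dpc(X, d_c=1.5, n_centers=2)` assigns one label per group.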

A robust density peaks clustering algorithm with density-sensitive similarity

KNOWLEDGE-BASED SYSTEMS, 2020, vol. 200, article 106028

Authors: Xu, Xiao; Ding, Shifei; Wang, Lijuan; Wang, Yanru. Affiliations: China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou 221116, Jiangsu, Peoples R China; Minist Educ Peoples Republ China, Mine Digitizat Engn Res Ctr, Xuzhou 221116, Jiangsu, Peoples R China; Xu Zhou Coll Ind Technol, Sch Informat & Elect Engn, Xuzhou 221400, Jiangsu, Peoples R China

The density peaks clustering (DPC) algorithm was proposed to identify cluster centers quickly by drawing a decision graph without any prior knowledge. DPC also obtains clusters of arbitrary shape with few parameters and no iteration. However, DPC has some shortcomings to be addressed before it can be widely applied. First, DPC is not suitable for manifold datasets because these datasets have multiple density peaks in one cluster. Second, the cut-off distance parameter has a great influence on the algorithm, especially on small-scale datasets. Third, the decision-graph method can yield uncertain cluster centers, which leads to wrong clustering. To address these issues, we propose a robust density peaks clustering algorithm with density-sensitive similarity (RDPC-DSS) to find accurate cluster centers on manifold datasets. With density-sensitive similarity, the influence of the parameters on the clustering results is reduced. In addition, a novel density clustering index (DCI) is designed to replace the decision graph and determine the number of cluster centers automatically. Extensive experimental results show that RDPC-DSS outperforms DPC and other state-of-the-art algorithms on manifold datasets. (C) 2020 Elsevier B.V. All rights reserved.

Keywords: DPC algorithm; Density-sensitive similarity; Automatic clustering; Clustering validity index

A feasible density peaks clustering algorithm with a merging strategy

SOFT COMPUTING, 2019, vol. 23, no. 13, pp. 5171-5183

Authors: Xu, Xiao; Ding, Shifei; Xu, Hui; Liao, Hongmei; Xue, Yu. Affiliations: China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou 221116, Jiangsu, Peoples R China; Nanjing Univ Informat Sci & Technol, Sch Comp & Software, Nanjing 210044, Jiangsu, Peoples R China

The density peaks clustering (DPC) algorithm is a novel algorithm that efficiently deals with the complex structure of data sets by finding the density peaks. It needs neither an iterative process nor many parameters. The density-distance is utilized to find the density peaks in the DPC algorithm. Unfortunately, it divides one cluster into multiple clusters if that cluster contains multiple density peaks, and it is ineffective when data sets have relatively high dimensions. To overcome the first problem, we propose an FDPC algorithm based on a novel merging strategy motivated by the support vector machine. First, the strategy utilizes the support vectors to calculate the feedback values between every two clusters after clustering based on DPC. Then, it merges clusters recursively according to the feedback values to obtain accurate clustering results. To address the second limitation, we introduce nonnegative matrix factorization into FDPC to preprocess high-dimensional data sets before clustering. The experimental results on real-world and artificial data sets demonstrate that our algorithm is robust and flexible, recognizes clusters of arbitrary shape effectively regardless of the space dimension, and outperforms DPC.

Keywords: FDPC algorithm; Merging strategy; DPC algorithm; Nonnegative matrix factorization (NMF); Support vector machine (SVM)

DPCG: an efficient density peaks clustering algorithm based on grid

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2018, vol. 9, no. 5, pp. 743-754

Authors: Xu, Xiao; Ding, Shifei; Du, Mingjing; Xue, Yu. Affiliations: China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou 221116, Peoples R China; Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China; Nanjing Univ Informat Sci & Technol, Sch Comp & Software, Nanjing 210044, Jiangsu, Peoples R China

To deal with the complex structure of data sets, the density peaks clustering algorithm (DPC) was proposed in 2014. The density and the delta-distance are utilized to find the clustering centers in the DPC method. It detects outliers efficiently and finds clusters of arbitrary shape. Unfortunately, the distance between all data points must be calculated in the first step, which limits the running speed of the DPC algorithm on large datasets. To address this issue, this paper introduces a novel grid-based approach, called the density peaks clustering algorithm based on grid (DPCG), which overcomes this efficiency problem. When calculating the local density, the idea of the grid is introduced into the DPC algorithm to reduce the computation time. It requires neither calculating all the distances nor many input parameters. Moreover, the DPCG algorithm inherits all the merits of the DPC algorithm. Experimental results on UCI data sets and artificial data show that the DPCG algorithm is flexible and effective.

Keywords: Cluster analysis; DPC algorithm; CLIQUE algorithm; DPCG algorithm; Operational efficiency
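
The grid idea mentioned in the abstract above can be illustrated with a minimal Python sketch that estimates local density by counting the points that share a grid cell, so no pairwise distances are needed for this step. The function name, the `cell_size` parameter, and the use of a single cell's count as the density are assumptions for illustration; the abstract does not specify the paper's actual grid construction.

```python
from collections import Counter

import numpy as np

def grid_density(X, cell_size):
    """Approximate local density as the number of points in the same grid cell.

    cell_size is a hypothetical parameter playing a role similar to the
    cut-off distance in standard DPC.
    """
    # Map each point to the integer coordinates of its grid cell.
    cells = [tuple(int(v) for v in row) for row in np.floor(X / cell_size)]
    counts = Counter(cells)  # one pass over the data, no distance matrix
    return np.array([counts[c] for c in cells])
```

This runs in time linear in the number of points, at the price of a coarser density estimate near cell boundaries.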

Mass-Based Density Peaks Clustering Algorithm

10th IFIP TC 12 International Conference on Intelligent Information Processing (IIP)

Authors: Ling, Ding; Xiao, Xu. Affiliations: Asia Pacific Univ Technol & Innovat, Sch Comp Technol & Gaming Dev, Kuala Lumpur 57000, Malaysia; China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou 221116, Jiangsu, Peoples R China

ISBN: (print) 9783030008284; 9783030008277

The density peaks clustering algorithm (DPC) relies on the local density and relative distance of the dataset to find cluster centers. However, these attributes are calculated simply from Euclidean distance, and DPC is not satisfactory when the density of the dataset is uneven or its dimension is high. In addition, the parameter d_c only considers the global distribution of the dataset, so a small change in d_c has a great influence on the clustering of small-scale datasets. To address these drawbacks, this paper proposes a mass-based density peaks clustering algorithm (MDPC). MDPC introduces a mass-based similarity measure to calculate a new similarity matrix. The K-nearest neighbour information of the data is then obtained from the new similarity matrix, and MDPC redefines the local density based on this K-nearest neighbour information. Experimental results show that MDPC is superior to DPC and performs satisfactorily on datasets with uneven density and higher dimensions, while also avoiding the influence of d_c on small-scale datasets.

Keywords: DPC algorithm; Mass-based similarity measure; Decision graph; Uneven density; Higher dimensions
