Search Results - Inner Mongolia University Library

International Conference of Information Technology and Electrical Engineering (ICITEE)

Author: Zhou, Gongjian (Xiamen Univ, Tan Kah Kee Coll, Zhangzhou, Fujian, Peoples R China)

ISBN: (print) 9781450363525

Applying clustering algorithms effectively to large-scale data is an important research topic in data mining. Based on an in-depth analysis of the Hadoop platform architecture and the canopy-kmeans clustering algorithm, the canopy-kmeans algorithm is optimized and parallelized. The data are grouped and sampled on statistical principles before clustering, which facilitates parallelization and reduces time complexity. The selection of the initial canopy center points is optimized using the minimum-maximum principle, a data outlier average sampling method is used to ensure that samples are extracted uniformly from the original data, and the k-means iterative calculation is optimized. The improved algorithm is designed and implemented in parallel on the MapReduce framework of the Hadoop platform. Experiments show that the improved canopy-kmeans parallel algorithm is effective and converges when clustering massive amounts of numerical data, and that it improves clustering accuracy and timeliness to some degree.

Keywords: Hadoop; MapReduce; cluster analysis; K-means algorithm; canopy-kmeans algorithm; Speedup; Scalability
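
The abstract above mentions choosing initial center points with a minimum-maximum principle before running k-means. The record does not give the paper's exact procedure, so the following is only a minimal single-machine sketch of one common reading of that idea: each new center is the point whose distance to its nearest already-chosen center is largest. The function names `max_min_centers` and `kmeans`, the choice of the first point as the starting center, and the toy data are assumptions for illustration; the grouping, sampling, and MapReduce parallelization described in the abstract are not shown.

```python
import math


def max_min_centers(points, k):
    """Pick k initial centers with a max-min distance rule (illustrative)."""
    # Assumption: start from the first point; the paper may choose differently.
    centers = [points[0]]
    # Distance from every point to its nearest chosen center so far.
    nearest = [math.dist(p, centers[0]) for p in points]
    while len(centers) < k:
        # The next center is the point farthest from all chosen centers.
        idx = max(range(len(points)), key=nearest.__getitem__)
        centers.append(points[idx])
        nearest = [min(d, math.dist(p, points[idx]))
                   for d, p in zip(nearest, points)]
    return centers


def kmeans(points, centers, iterations=10):
    """Plain k-means refinement starting from the given centers."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            # Assign each point to its closest current center.
            j = min(range(len(centers)),
                    key=lambda i: math.dist(p, centers[i]))
            groups[j].append(p)
        # Recompute each center as the mean of its group; keep the old
        # center if a group ends up empty.
        centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
seeds = max_min_centers(points, 2)
final = kmeans(points, seeds)
print(seeds)
print(final)
```

Seeding this way tends to spread the initial centers across the data, which is the usual motivation for replacing random initialization in k-means.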


Abnormal Electricity Detection of Users Based on Improved canopy-kmeans and Isolation Forest algorithms


IEEE ACCESS, 2024, vol. 12, pp. 99110-99121

Authors: Wang, Jianyuan; Li, Xiaoyao (Northeast Elect Power Univ, Key Lab Modern Power Syst Simulat & Control & Rene, Minist Educ, Jilin 132012, Peoples R China)

Existing methods for detecting abnormal electricity consumption have difficulty classifying users with similar consumption patterns. To address this problem, this paper proposes an unsupervised isolation forest model for abnormal electricity consumption detection, based on a canopy-kmeans algorithm improved with weighted density. First, we propose a composite parameter analysis method for user electricity consumption patterns, volatility, trends, and correlations using Irish smart meter data. This method involves joint data cleaning, interpolation, and feature construction. In addition, principal component analysis is introduced to fuse features across layers and reduce the dimensionality of the user electricity consumption data. Next, we introduce the weighted density improved canopy-kmeans clustering algorithm. This algorithm determines the K value and the clustering centers using the maximum weight product method, based on definitions of sample density, average intra-class sample distance, and inter-class distance in the multilayer fusion feature data. Finally, we propose a fusion mechanism that combines the weighted density improved canopy-kmeans algorithm and the isolation forest algorithm to construct a model for detecting abnormal power usage based on multilayer fusion feature data analysis. The results demonstrate that the multilayer fusion feature parameters vary in size and discretization among different user types, which enables classification of users with diverse electricity consumption patterns. Moreover, the anomaly detection model based on multilayer fusion feature data analysis improves accuracy, recall, and F1 score compared with other algorithms.

Keywords: Electricity; Clustering algorithms; Classification algorithms; Anomaly detection; Forestry; Prediction algorithms; Feature extraction; Detection algorithms; Energy consumption; Unsupervised learning; Abnormal detection of electricity consumption by users; canopy-kmeans algorithm; isolation forest algorithm; unsupervised learning; weighted density
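
The abstract above reports results in terms of accuracy, recall, and F1 score. As a reminder of how these metrics are computed for a binary anomaly detector, here is a small sketch using the standard definitions. The function name `detection_metrics` and the label vectors are made up for illustration (1 marks an abnormal user) and are not results from the paper.

```python
def detection_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed anomalies
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


# Hypothetical ground truth and detector output for eight users.
y_true = [0, 0, 1, 1, 1, 0, 0, 1]
y_pred = [0, 1, 1, 1, 0, 0, 0, 1]
accuracy, precision, recall, f1 = detection_metrics(y_true, y_pred)
print(accuracy, precision, recall, f1)
```

Because anomalies are usually rare, recall and F1 are generally more informative than accuracy alone for this kind of detector.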
