Author affiliations: LPAIS Laboratory, Faculty of Sciences, Sidi Mohamed Ben Abdellah University, Fez, Morocco; Laboratory of Microbial Biotechnology and Bioactive Molecules, Science and Technologies Faculty, Sidi Mohamed Ben Abdellah University, Fez, Morocco; Science and Technologies Faculty, University Hassan 1st, Settat, Morocco
Publication: Multimedia Tools and Applications (Multimedia Tools Appl)
Year, volume, issue: 2025, Vol. 84, No. 10
Pages: 7835-7861
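Subject classification: 1205 [Management: Library, Information and Archives Management]; 08 [Engineering]; 0835 [Engineering: Software Engineering]; 0812 [Engineering: Computer Science and Technology (engineering or science degree may be conferred)]
Funding: The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.
Subject: Clustering algorithms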
Abstract: Clustering is a data mining method that belongs to the category of unsupervised learning. Cluster analysis categorizes data into different classes by identifying the internal organization of objects in the data set and their relationships. Many clustering methods are designed with specific assumptions about the underlying data distribution or cluster shapes. If these assumptions do not hold in a particular dataset, the performance of the clustering algorithm may suffer. This paper introduces a novel frequency clustering method called CFI (Clustering based on Frequent Itemsets). This approach opens up a new avenue for research in frequency clustering, departing from conventional distance-based methods. CFI has the potential to reveal compelling patterns or associations among features in the data. The CFI algorithm includes three main steps. First, frequent itemsets are generated. Second, the centroid of each cluster is built using a new measure called FI-distance, which combines the Euclidean distance with a similarity measure for itemsets. Third, each object is assigned to the appropriate cluster based on its membership degree. Various experiments were conducted on synthetic and real-world datasets, using three performance criteria: the Davies-Bouldin score, the silhouette width criterion, and the Calinski-Harabasz score. The CFI method was compared first to state-of-the-art methods and then to automatic clustering methods. The results indicate the superiority of the CFI algorithm compared to the other methods. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
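The first step named in the abstract, generating frequent itemsets, can be illustrated with a minimal sketch. The abstract does not say which mining algorithm or support threshold CFI uses, so the level-wise (Apriori-style) search below, the function name `frequent_itemsets`, the `min_support` parameter, and the toy data are all assumptions made for illustration, not the authors' implementation. The FI-distance and membership-degree steps are not sketched because the abstract does not define them.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets with support >= min_support.

    Support is the fraction of transactions that contain the itemset.
    Hypothetical helper: a level-wise (Apriori-style) search, where a
    k-itemset is counted only if all of its (k-1)-subsets were frequent.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    if n == 0:
        return {}

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        single = frozenset([item])
        s = support(single)
        if s >= min_support:
            current[single] = s

    result = dict(current)
    k = 2
    while current:
        # Candidates: unions of two frequent (k-1)-itemsets that have size k.
        prev = list(current)
        candidates = {a | b for a, b in combinations(prev, 2) if len(a | b) == k}
        current = {}
        for cand in candidates:
            # Prune candidates that have an infrequent (k-1)-subset.
            if all(frozenset(sub) in result for sub in combinations(cand, k - 1)):
                s = support(cand)
                if s >= min_support:
                    current[cand] = s
        result.update(current)
        k += 1
    return result


# Assumed toy data: each object is a set of categorical feature values.
toy = [
    {"color=red", "size=small"},
    {"color=red", "size=small", "shape=round"},
    {"color=blue", "size=large"},
    {"color=red", "size=large"},
]
fi = frequent_itemsets(toy, min_support=0.5)
for itemset, s in sorted(fi.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

Numeric features would first need to be discretized into items such as `size=small`; how CFI does this is not stated in the abstract.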