Author affiliation: Zhejiang Univ Technol, Coll Comp Sci & Technol, Hangzhou 310023, Zhejiang, Peoples R China
Publication: IEEE TRANSACTIONS ON FUZZY SYSTEMS
Year, volume, issue: 2019, Vol. 27, No. 9
Pages: 1726-1737
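Subject classification: 0808 [Engineering - Electrical Engineering]; 08 [Engineering]; 0812 [Engineering - Computer Science and Technology (degrees may be conferred in Engineering or Science)]
Funding: National Natural Science Foundation of China; Natural Science Foundation of Zhejiang Province [LY16F020032]
Keywords: Clustering algorithm; document categorization; fuzzy clustering; semisupervised learning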
Abstract: The pairwise constraint is a type of side information that is widely considered in existing semisupervised clustering approaches. In this paper, we explore a new form of supervision for clustering: we consider the partition results of a number of subsets as additional information to assist clustering. Compared to the pairwise constraint, which only involves the must-link or cannot-link relationship of two objects, the partition of a subset of objects provides information about the group structure of more objects and hence can possibly serve as a more effective form of supervision for clustering. We instantiate the idea of clustering with subset partitions under the fuzzy clustering framework for document categorization. The proposed fuzzy clustering approach is formulated to learn from the partition of subsets and can handle high-dimensional document data. Specifically, the partition results of subsets are collectively transformed into pairwise relationships, based on which a penalty term is constructed and incorporated into a cosine-distance-based fuzzy c-means approach. The experimental results on benchmark data sets demonstrate the effectiveness of the proposed approach for semisupervised document clustering.
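
The abstract's step of turning subset partitions into pairwise relationships can be illustrated with a minimal sketch. It is not the paper's formulation, which the abstract does not give. The function name `partitions_to_pairs` and the input format (each partition as a list of groups of object indices) are assumptions made for illustration, and the penalty term and fuzzy c-means updates are not shown.

```python
from itertools import combinations


def partitions_to_pairs(subset_partitions):
    """Collect must-link and cannot-link pairs from partitions of subsets.

    subset_partitions: a list of partitions. Each partition is a list of
    groups, and each group is a list of object indices (assumed format).
    """
    must_link, cannot_link = set(), set()
    for partition in subset_partitions:
        # Map every object in this subset to the index of its group.
        group_of = {obj: g for g, group in enumerate(partition) for obj in group}
        # Every pair of objects inside the subset yields one relationship.
        for a, b in combinations(sorted(group_of), 2):
            if group_of[a] == group_of[b]:
                must_link.add((a, b))
            else:
                cannot_link.add((a, b))
    return must_link, cannot_link


# Two partitioned subsets over objects 0..5 (toy data).
partitions = [
    [[0, 1], [2]],
    [[3, 4], [5]],
]
must_link, cannot_link = partitions_to_pairs(partitions)
```

A partitioned subset of n objects yields n(n-1)/2 pairwise relationships at once, which is one way to read the abstract's claim that a subset partition carries more group-structure information than a single constraint. This sketch does not resolve conflicts between subsets: a pair that is grouped together in one subset and separated in another would land in both sets.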