In this paper, a new point symmetry-based similarity measurement is first proposed which satisfies the closure and symmetry properties of any distance function. The desirable properties of the new distance are explained in detail. Thereafter, a new clustering algorithm based on the search capability of genetic algorithms is developed, in which the newly developed point symmetry-based distance is used for cluster assignment. Points are allocated to clusters in such a way that the closure property is satisfied. The proposed GA with newly developed point symmetry distance based (GAnPS) clustering algorithm is capable of determining symmetrically shaped clusters of any size or convexity. The effectiveness of the proposed GAnPS clustering technique in identifying the proper partitioning is shown for twenty-one data sets having various characteristics. The performance of GAnPS is compared with that of an existing symmetry-based genetic clustering technique, GAPS, and of three popular and well-known clustering techniques: K-means, expectation maximization, and the average linkage algorithm. In one part of the paper, the utility of the proposed clustering technique is shown for partitioning a remote sensing satellite image. The last part of the paper deals with the development of some automatic clustering techniques using the newly proposed symmetry-based distance.
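The abstract does not define the new distance, so the following is only a minimal sketch of the general idea behind point symmetry-based distances, not the measure proposed in the paper. It assumes a common reflection-based form: a point is reflected through a candidate cluster centre, and the distance is small when the reflected point lies close to actual data points. The function names and the `knear` parameter are hypothetical and chosen for illustration.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def point_symmetry_distance(x, center, data, knear=2):
    """Illustrative reflection-based symmetry distance (an assumed form).

    `data` must be non-empty. `knear` is a hypothetical parameter giving the
    number of nearest neighbours of the reflected point that are averaged.
    """
    # Reflect x through the candidate centre: x* = 2c - x.
    reflected = tuple(2 * c - p for p, c in zip(x, center))
    # Keep the knear smallest distances from the reflected point to the data.
    nearest = sorted(euclidean(reflected, p) for p in data)[:knear]
    d_sym = sum(nearest) / len(nearest)
    # Weight the symmetry term by the ordinary distance to the centre.
    return d_sym * euclidean(x, center)


# Four points placed symmetrically around the origin.
data = [(1, 0), (-1, 0), (0, 1), (0, -1)]
good = point_symmetry_distance((1, 0), (0, 0), data, knear=1)
bad = point_symmetry_distance((1, 0), (5, 0), data, knear=1)
```

In this toy data set, the reflection of (1, 0) through the origin coincides with the data point (-1, 0), so the distance to the true centre is lower than the distance to an off-centre candidate such as (5, 0). Under this assumed form, cluster assignment would pick the centre with the smallest such distance.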
ISBN: 9781424428052 (print)
Categorical data clustering has been gaining significant attention from researchers, because most real-life data sets are categorical in nature. In contrast to the numerical domain, no natural ordering can be found among the elements of a categorical domain. Hence no inherent distance measure, such as the Euclidean distance, can be used to compute the distance between two categorical objects. In this article, a categorical data clustering algorithm based on genetic algorithm and simulated annealing is proposed. The performance of the proposed algorithm is compared with that of several well-known categorical data clustering algorithms and demonstrated for a variety of artificial and real-life categorical data sets.
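To illustrate why categorical objects need a different notion of distance, the sketch below uses the simple matching (Hamming-style) dissimilarity, which counts the attributes on which two objects disagree. This is an assumption made for illustration; the abstract does not state which dissimilarity measure the proposed algorithm uses, and the function name is hypothetical.

```python
def matching_dissimilarity(a, b):
    """Count the attributes on which two categorical objects differ.

    Assumed illustrative measure, not necessarily the one used in the paper.
    """
    if len(a) != len(b):
        raise ValueError("objects must have the same number of attributes")
    # Only equality is tested, so no ordering of category values is needed.
    return sum(1 for u, v in zip(a, b) if u != v)


obj1 = ("red", "small", "round")
obj2 = ("red", "large", "round")
obj3 = ("blue", "large", "square")

d12 = matching_dissimilarity(obj1, obj2)  # differ on one attribute
d13 = matching_dissimilarity(obj1, obj3)  # differ on all three attributes
```

Because the measure relies only on equality tests between attribute values, it applies to domains such as colour or shape where subtraction and ordering are undefined.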