This paper proposes a new approach to the k-means clustering algorithm that incorporates a variety of fuzzy equivalences and aggregation techniques. Central to this approach is the use of fuzzy equivalences, which serve as an alternative to standard distance metrics, with the goal of improving the clustering process. The modified k-means algorithm employs these fuzzy equivalences to better capture the similarity between data points, particularly in situations where the nearest points may not adequately represent closeness within the data set. To assess the clustering results, we employ a variation of the silhouette coefficient tailored to our method. Additionally, we present theoretical insights into the behavior and benefits of using compositions of aggregation functions and fuzzy equivalences in clustering. Experimental validations carried out on diverse data sets indicate that our method can lead to improved clustering outcomes compared to traditional techniques.
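To make the general idea concrete, below is a minimal sketch of a k-means-style loop that ranks candidate centers by an aggregated fuzzy equivalence rather than a raw distance. The per-feature equivalence E(a, b) = 1 - |a - b| on [0, 1]-scaled data, the arithmetic-mean aggregation, and all function names are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def featurewise_equivalence(x, center):
    # Per-feature fuzzy equivalence on data scaled to [0, 1]:
    # E(a, b) = 1 - |a - b| (an assumed choice for illustration).
    return 1.0 - np.abs(x - center)

def aggregated_equivalence(x, center):
    # Aggregate the per-feature equivalences into a single similarity score;
    # the arithmetic mean serves as the aggregation function here.
    return featurewise_equivalence(x, center).mean()

def fuzzy_equivalence_kmeans(X, k, n_iter=50, seed=0):
    # k-means-style alternation: assign each point to the center with the
    # LARGEST aggregated equivalence, then recompute centers as cluster means.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = np.array([
            np.argmax([aggregated_equivalence(x, c) for c in centers])
            for x in X
        ])
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Example on two well-separated blobs scaled into [0, 1].
rng = np.random.default_rng(1)
X = np.clip(np.vstack([rng.normal(0.2, 0.05, (30, 2)),
                       rng.normal(0.8, 0.05, (30, 2))]), 0.0, 1.0)
labels, centers = fuzzy_equivalence_kmeans(X, k=2)
print(labels)
print(centers)
```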
This paper discusses a suitable framework for generalizing the k-nearest neighbor (k-NNR) algorithms to cases where the design labels are not necessarily crisp, i.e., not binary-valued. The proposed framework embeds all crisp k-NNR's into a larger structure of fuzzy k-NNR's. The resultant model enables neighborhood voting to be a continuous function of the local labels at a point to be classified. We emphasize that the decision itself may be crisp even when a fuzzy k-NNR is utilized. The usefulness of this extension of the conventional technique is illustrated by comparing the observed error rates of four classifiers (the hard k-NNR, two fuzzy k-NNR's, and a fuzzy 1-nearest prototype rule (1-NPR)) on three data sets: Anderson's Iris data, and samples from (synthetic) univariate and bivariate normal mixtures. Our conclusions: all four designs yield comparable (usually within 4%) error rates; the fuzzy c-means (FCM) based k-NNR is usually the best design; the FCM/1-NPR is the most efficient and perhaps most useful of the four designs; and, finally, generalized NNR's are an important and useful extension of the conventional ones.
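As an illustration of the kind of rule being generalized, the following is a minimal sketch of a fuzzy k-NN classifier in which training labels are membership vectors rather than crisp class indices, the neighborhood vote is a distance-weighted aggregation of those memberships, and a crisp decision is obtained by a final argmax. The inverse-distance weighting and all names here are illustrative assumptions, not the exact rules compared in the paper.

```python
import numpy as np

def fuzzy_knn_predict(X_train, U_train, x, k=5, eps=1e-9):
    # X_train: (n, d) feature matrix; U_train: (n, c) fuzzy label memberships
    # (each row sums to 1); x: (d,) point to classify.
    d = np.linalg.norm(X_train - x, axis=1)
    nn = np.argsort(d)[:k]                      # indices of the k nearest neighbors
    w = 1.0 / (d[nn] + eps)                     # inverse-distance weights (an assumption)
    u = (w[:, None] * U_train[nn]).sum(axis=0)  # weighted vote over fuzzy labels
    u /= u.sum()                                # fuzzy class memberships for x
    return u, int(np.argmax(u))                 # soft output plus a crisp decision

# Example: two classes with soft (non-crisp) training labels.
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0.0, 1.0, (20, 2)), rng.normal(3.0, 1.0, (20, 2))])
U_train = np.vstack([np.tile([0.9, 0.1], (20, 1)), np.tile([0.1, 0.9], (20, 1))])
memberships, crisp_class = fuzzy_knn_predict(X_train, U_train, np.array([2.5, 2.5]))
print(memberships, crisp_class)
```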