In fuzzy clustering, soft cluster partitions are formed based on the similarity of data points to the respective cluster prototypes. Similarity is defined in terms of simultaneous closeness with regard to all attributes. In some applications the values of many attributes have been measured, but a natural clustering, if it exists, occurs only within a (small) subset of attributes. The remaining dimensions can be considered irrelevant: they can obscure an existing grouping and make it harder to discover the cluster structure. In the worst case, irrelevant attributes can lead to coinciding cluster centers in probabilistic fuzzy clustering. We study this effect in detail, as well as the robustness of different similarity functions and their possible parameterizations against irrelevant input dimensions. Empirical evidence is given for the different properties of the membership functions.
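The effect of irrelevant attributes can be illustrated with a minimal sketch. It assumes the standard probabilistic fuzzy c-means membership formula with squared Euclidean distance and fuzzifier `m`, which is not necessarily one of the similarity functions examined in the paper. The two centers, the test point, and the 20 extra attributes are hypothetical values chosen only for illustration.

```python
def fcm_memberships(point, centers, m=2.0):
    """Standard probabilistic fuzzy c-means membership degrees of one point.

    Assumption: squared Euclidean distance and fuzzifier m > 1.
    """
    d2 = [sum((x - c) ** 2 for x, c in zip(point, center)) for center in centers]
    # A point that coincides with a center belongs fully to that center.
    if any(d == 0.0 for d in d2):
        hits = [1.0 if d == 0.0 else 0.0 for d in d2]
        return [h / sum(hits) for h in hits]
    inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


# One relevant attribute: the point lies close to the first center.
centers_1d = [(0.0,), (1.0,)]
u_relevant = fcm_memberships((0.1,), centers_1d)

# Hypothetical setup: add k irrelevant attributes in which both centers
# agree (value 0.0) while the point is far from both (value 1.0).
k = 20
centers_noisy = [c + (0.0,) * k for c in centers_1d]
point_noisy = (0.1,) + (1.0,) * k
u_noisy = fcm_memberships(point_noisy, centers_noisy)

print(u_relevant, u_noisy)
```

With only the relevant attribute, the point is assigned almost entirely to the nearer center. Once the irrelevant attributes dominate the distance, the two membership degrees move toward 1/2, which is the kind of blurring that can pull cluster centers together.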
Real-life transaction data often miss some occurrences of items that are actually present. As a consequence, some potentially interesting frequent patterns cannot be discovered, since with exact matching the number of supporting transactions may be smaller than the user-specified minimum. In order to allow approximate matching during the mining process, we propose an approach based on transaction editing. Our recursive algorithm relies on a step-by-step elimination of items from the transaction database together with a recursive processing of transaction subsets. This algorithm works without complicated data structures and allows us to find fuzzy frequent patterns easily.
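The idea of approximate matching can be sketched as a support count that tolerates missing items. This is a hypothetical simplification and not the recursive elimination algorithm of the paper: it assumes that each item that would have to be inserted into a transaction multiplies that transaction's contribution by a fixed `penalty`, and that at most `max_insertions` insertions are allowed. Both parameter names and the example transactions are illustrative assumptions.

```python
def fuzzy_support(itemset, transactions, penalty=0.5, max_insertions=1):
    """Approximate support of `itemset` (assumed scheme, see lead-in).

    A transaction that lacks some items of `itemset` still contributes,
    but its weight is multiplied by `penalty` for every missing item.
    Transactions lacking more than `max_insertions` items contribute 0.
    """
    items = set(itemset)
    support = 0.0
    for transaction in transactions:
        missing = len(items - set(transaction))
        if missing <= max_insertions:
            support += penalty ** missing
    return support


transactions = [
    {"a", "b", "c"},
    {"a", "b"},   # "c" may have been lost
    {"a", "c"},   # "b" may have been lost
    {"b", "d"},   # two items missing: ignored when max_insertions=1
]

# Exact matching: only the first transaction supports {a, b, c}.
exact = fuzzy_support({"a", "b", "c"}, transactions, max_insertions=0)

# Approximate matching: the two near-misses each add half a count.
approx = fuzzy_support({"a", "b", "c"}, transactions, penalty=0.5, max_insertions=1)

print(exact, approx)
```

Under this scheme a pattern that falls below the minimum support with exact matching can still be reported as a fuzzy frequent pattern if enough transactions contain it almost completely.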
We study an extension of fuzzy learning vector quantization that draws on ideas from the more sophisticated approaches to fuzzy clustering, enabling us to find fuzzy clusters of ellipsoidal shape and differing size with a competitive learning scheme. This approach may be seen as a kind of online fuzzy clustering, which can have advantages with respect to the execution time of the clustering algorithm. We demonstrate the usefulness of our approach by applying it to document collections, which are, in general, difficult to cluster due to the high number of dimensions and the special distribution characteristics of the data.
In many applications the objects to cluster are described by quantitative as well as qualitative features. A variety of algorithms have been proposed for unsupervised classification when fuzzy partitions and descriptive cluster prototypes are desired. However, most of these methods are designed for data sets whose variables are all measured on the same scale type (only categorical, or only metric). We propose a new fuzzy clustering approach based on a probabilistic distance measure. It avoids a major drawback of present methods, namely their tendency to favor one type of attribute.
A significant number of scientific and economic problems are characterised by a large number of interrelated variables. However, as the number of variables increases, the domain under consideration may grow quickly, so that analyses and reasoning become increasingly difficult. Graphical models make it possible to represent the combined distributions compactly and are suitable for dealing with uncertain and incomplete information. We describe their application to a problem of industrial planning. We also demonstrate how the iterative planning process can be supported by allowing the users to adapt the model using revision and updating operators. Moreover, we discuss the problem of inconsistent inputs.
We explore how techniques that were developed to improve the training process of artificial neural networks can be used to speed up fuzzy clustering. The basic idea of our approach is to regard the difference between two consecutive steps of the alternating optimization scheme of fuzzy clustering as providing a gradient, which may be modified in the same way as the gradient of neural network back-propagation is modified in order to improve training. Our experimental results show that some methods actually lead to a considerable acceleration of the clustering process.
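The basic idea can be sketched for one-dimensional fuzzy c-means. The sketch assumes the standard membership and center updates, and it uses a momentum term as one example of a neural-network training modification; the function names, the momentum factor `beta`, the stopping rule, and the data are hypothetical, and the methods evaluated in the paper may differ. Whether such a modification actually reduces the number of steps depends on the data and the parameters.

```python
def fcm_update(data, centers, m=2.0):
    """One alternating optimization step of fuzzy c-means on 1-D data."""
    weights = []  # weights[j][i] = u_ij ** m for data point j and cluster i
    for x in data:
        # Guard against a zero distance when a center hits a data point.
        d2 = [max((x - c) ** 2, 1e-12) for c in centers]
        inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
        total = sum(inv)
        weights.append([(v / total) ** m for v in inv])
    new_centers = []
    for i in range(len(centers)):
        num = sum(w[i] * x for w, x in zip(weights, data))
        den = sum(w[i] for w in weights)
        new_centers.append(num / den)
    return new_centers


def fcm_with_momentum(data, centers, beta=0.3, tol=1e-9, max_iter=500):
    """Treat the change proposed by one update step as a gradient and
    add a fraction `beta` of the previous change (momentum term)."""
    prev_change = [0.0] * len(centers)
    for step in range(1, max_iter + 1):
        target = fcm_update(data, centers)
        # "Gradient": difference between two consecutive parameter vectors.
        grad = [t - c for t, c in zip(target, centers)]
        change = [g + beta * p for g, p in zip(grad, prev_change)]
        centers = [c + ch for c, ch in zip(centers, change)]
        prev_change = change
        if max(abs(ch) for ch in change) < tol:
            return centers, step
    return centers, max_iter


data = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2]
plain_centers, plain_steps = fcm_with_momentum(data, [0.3, 0.9], beta=0.0)
mom_centers, mom_steps = fcm_with_momentum(data, [0.3, 0.9], beta=0.3)
print(plain_centers, plain_steps, mom_centers, mom_steps)
```

Setting `beta=0.0` recovers the plain alternating optimization scheme, so the two runs can be compared directly in terms of the resulting centers and the number of steps.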
We explore an approach to possibilistic fuzzy clustering that avoids a severe drawback of the conventional approach, namely that the objective function is truly minimized only if all cluster centers are identical. Our approach is based on the idea that this undesired property can be avoided if we introduce a mutual repulsion of the clusters, so that they are forced away from each other. We develop this approach for the possibilistic fuzzy c-means algorithm and the Gustafson-Kessel algorithm.
We propose an approach to handling class information in fuzzy cluster analysis, where a class can consist of several clusters. The approach is based on a penalty term for clusters comprising several classes.
The K2 metric is a well-known evaluation measure (or scoring function) for learning Bayesian networks from data [7]. It is derived by assuming uniform prior distributions on the values of an attribute for each possibl...
Fuzzy cluster analysis is a method for unsupervised clustering. However, class information is sometimes available for the given dataset, i.e., only the number of clusters per class is unknown. This paper discusses how class information can be exploited. Some common approaches are reviewed and a new approach is suggested, which integrates class information into fuzzy cluster analysis.