Since 1998, a graphical representation used in visual clustering, called the reordered dissimilarity image or cluster heat map, has appeared in more than 4000 biological or biomedical publications. These images are typically used to visually estimate the number of clusters in a dataset, which is the most important input to most clustering algorithms, including the popular fuzzy c-means and crisp k-means. This paper presents a new formulation of a matrix reordering algorithm, coVAT, which is the only known method for providing visual clustering information on all four types of cluster structure in rectangular relational data. Finite rectangular relational data are an m × n array R of relational values between m row objects Or and n column objects Oc. R presents four clustering problems: clusters in Or, in Oc, in the union Or ∪ Oc, and coclusters containing some objects from each of Or and Oc. coVAT1 is a clustering tendency algorithm that provides visual estimates of the number of clusters to seek in each of these problems by displaying reordered dissimilarity images. We provide several examples where coVAT1 fails to do its job. These examples justify the introduction of coVAT2, a modification of coVAT1 based on a different reordering scheme. We offer several examples to illustrate that coVAT2 may detect coclusters in R when coVAT1 does not. Furthermore, coVAT2 is not limited to relational data R. The R matrix can also take the form of feature data, such as gene microarray data where each data element is a real number: positive values indicate upregulation, and negative values indicate downregulation. We show examples of coVAT2 on microarray data that indicate coVAT2 shows cluster tendency in these data. © 2012 Wiley Periodicals, Inc.
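To make the idea of a reordered dissimilarity image concrete, here is a minimal sketch of a VAT-style reordering for a square dissimilarity matrix. It is not coVAT1 or coVAT2: the abstract does not spell out their reordering schemes, so the rectangular case is not reproduced. The function names `vat_order` and `reorder` and the toy data are illustrative assumptions.

```python
def vat_order(D):
    """Return a VAT-style ordering of the objects of a square dissimilarity matrix D.

    Sketch only: start at one endpoint of the largest dissimilarity, then
    repeatedly append the unselected object closest to any selected object.
    """
    n = len(D)
    pairs = ((r, c) for r in range(n) for c in range(n))
    start, _ = max(pairs, key=lambda rc: D[rc[0]][rc[1]])
    order = [start]
    remaining = set(range(n)) - {start}
    while remaining:
        # Nearest unselected object to the already ordered set.
        nxt = min(remaining, key=lambda c: min(D[r][c] for r in order))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def reorder(D, order):
    """Apply the same permutation to the rows and columns of D."""
    return [[D[r][c] for c in order] for r in order]


# Toy data (assumed): four points on a line forming two tight pairs.
points = [0, 10, 1, 11]
D = [[abs(p - q) for q in points] for p in points]
order = vat_order(D)
image = reorder(D, order)
```

When `image` is displayed as a grayscale picture, objects from the same cluster end up adjacent, so each cluster shows as a dark block along the diagonal. Counting those blocks gives the visual estimate of the number of clusters.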
ISBN: 9781479959969 (print)
Agglutinative languages, such as Hungarian, use inflection to modify the meaning of words. Inflection is a string transformation that describes how a word is converted into its inflected form. The transformation can be described by a transformational string. Words can be classified by their transformational string, so inflection can be treated as a classification problem. Linear separability of clusters is important for creating an efficient and accurate classification method. This paper reviews a linear programming based method for testing linear separability. The method was analyzed on generated datasets, and the measurements showed that the time cost of the algorithm grows polynomially with the number of points. The accusative case of Hungarian was used to create a dataset of 56,000 samples. The words were represented in vector space by alphabetical and phonetic encoding, each with left and right alignment, so four different representations of words were used during the tests. Our test results showed that there are cluster pairs that are not linearly separable in both of the representations.
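The alphabetical encoding with left and right alignment can be sketched as follows. This is only an illustration under stated assumptions: the abstract gives no encoding details, so the alphabet string, the padding value `PAD`, the vector length, and the function name `encode` are all hypothetical, and Hungarian digraphs such as "sz" are treated as two separate letters.

```python
# Hypothetical single-character alphabet; digraphs are not handled specially.
ALPHABET = "aábcdeéfghiíjklmnoóöőpqrstuúüűvwxyz"
PAD = 0  # assumed filler value for unused vector positions


def encode(word, length, align="left"):
    """Map a word to a fixed-length list of 1-based letter indices."""
    codes = [ALPHABET.index(ch) + 1 for ch in word.lower()]
    if len(codes) > length:
        raise ValueError("word is longer than the vector length")
    padding = [PAD] * (length - len(codes))
    if align == "left":
        return codes + padding   # letters first, padding at the end
    if align == "right":
        return padding + codes   # padding first, letters at the end
    raise ValueError("align must be 'left' or 'right'")


left = encode("ház", 6, "left")
right = encode("ház", 6, "right")
```

Right alignment places word endings at fixed vector positions, which may matter for suffix-based inflection such as the accusative. Left alignment does the same for word beginnings.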