Recent advances in microarray technology permit the expression levels of a large set of genes to be monitored simultaneously across a number of time points. Extracting knowledge from such a large volume of microarray gene expression data requires computational analysis. Clustering is an important data mining tool for analyzing such data by grouping similar genes into clusters, and researchers have proposed a number of clustering algorithms for this purpose. This article attempts to improve the performance of fuzzy clustering by combining it with a support vector machine (SVM) classifier. A recently proposed real-coded variable string length genetic algorithm based clustering technique and an iterated version of fuzzy C-means clustering are utilized for this purpose. The performance of the proposed clustering scheme is compared with that of some well-known existing clustering algorithms and their SVM-boosted versions on one simulated and six real-life gene expression data sets. A statistical significance test based on analysis of variance (ANOVA), followed by an a posteriori Tukey-Kramer multiple comparison test, has been conducted to establish the statistical significance of the superior performance of the proposed clustering scheme. Moreover, the biological significance of the clustering solutions has been established. (C) 2009 Elsevier Ltd. All rights reserved.
A popular approach to landcover classification in remotely sensed satellite images is to cluster the pixels in the spectral domain into several fuzzy partitions. It has been observed that the performance of clustering algorithms deteriorates as the overlap in the data sets increases. Motivated by this observation, this article describes a two-stage fuzzy clustering algorithm that utilizes the concept of points having significant membership to multiple classes. The points situated in the overlapped regions of different clusters are first identified and excluded from consideration while clustering. Thereafter, these points are given class labels by a support vector machine (SVM) classifier that is trained on the remaining points. The well-known fuzzy C-means algorithm and some recently proposed genetic clustering schemes are utilized in the process. The effectiveness of the two-stage clustering technique has been demonstrated on IRS remote sensing satellite images of the cities of Bombay and Calcutta and compared with other well-known clustering techniques. A statistical significance test has also been carried out to establish the statistical significance of the clustering results.
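The first stage described above, identifying points with significant membership to multiple classes, can be illustrated with a minimal Python sketch. It assumes the standard fuzzy C-means membership formula with fixed cluster centers; the function names and the `gap` threshold are hypothetical and are not taken from the abstract. The second stage (training an SVM on the confident points and labelling the overlap points) is not shown.

```python
import math

def fcm_memberships(points, centers, m=2.0):
    """Return the fuzzy C-means membership of every point in every cluster."""
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # A point lying exactly on a center belongs fully to that center.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(row)
            row = [u / total for u in row]
        else:
            # u_ij is proportional to d_ij ** (-2 / (m - 1)), normalised per point.
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            row = [v / total for v in inv]
        memberships.append(row)
    return memberships

def split_by_ambiguity(memberships, gap=0.3):
    """Split point indices into confident points and overlap points.

    A point is treated as lying in an overlapped region when its two
    largest memberships differ by less than `gap` (an assumed threshold).
    """
    confident, ambiguous = [], []
    for i, row in enumerate(memberships):
        top = sorted(row, reverse=True)
        (ambiguous if top[0] - top[1] < gap else confident).append(i)
    return confident, ambiguous

# Toy data: two tight groups plus one point midway between the centers.
points = [(0.0, 0.0), (0.2, 0.1), (4.0, 4.0), (3.9, 4.2), (2.0, 2.0)]
centers = [(0.0, 0.0), (4.0, 4.0)]
u = fcm_memberships(points, centers)
confident, ambiguous = split_by_ambiguity(u)
```

In this toy example only the midway point is flagged as ambiguous; in the two-stage scheme such points would be withheld from clustering and labelled afterwards by the classifier.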
The problem of classifying an image into different homogeneous regions is viewed as the task of clustering the pixels in the intensity space. In particular, satellite images contain landcover types, some of which cover significantly large areas while others (e.g., bridges and roads) occupy much smaller regions. Automatically detecting regions or clusters of such widely varying sizes is a challenging task. In this paper, a newly developed real-coded variable string length genetic fuzzy clustering technique with a new point symmetry distance is used for this purpose. The proposed algorithm is capable of automatically determining the number of segments present in an image. Pixels are assigned to clusters based on the point symmetry based distance rather than the Euclidean distance. The cluster centers are encoded in the chromosomes, and a newly developed fuzzy point symmetry distance based cluster validity index, the FSym-index, is used as a measure of the validity of the corresponding partition. This validity index is able to correctly indicate the presence of clusters of different sizes and shapes as long as they are internally symmetrical. The space and time complexities of the proposed algorithm are also derived. The effectiveness of the proposed technique is demonstrated first in identifying two small objects against a large background in an artificially generated image, and then in identifying different landcover regions in remote sensing imagery. Results are compared, both qualitatively and quantitatively, with those obtained using the well-known fuzzy C-means algorithm.
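The abstract does not give the formula for its point symmetry distance, so the Python sketch below only illustrates the general idea under an assumed formulation: reflect the point through the candidate center, measure how close the reflection lies to actual data points (averaged over the `knear` nearest neighbours), and scale by the Euclidean distance to the center. The function name, the `knear` parameter and its default value are assumptions made for illustration.

```python
import math

def point_symmetry_distance(x, center, data, knear=2):
    """Assumed reflection-based symmetry distance of x with respect to center."""
    # Mirror image of x through the center: 2 * c - x.
    reflected = tuple(2 * c - xi for xi, c in zip(x, center))
    # Mean distance from the mirror image to its knear nearest data points.
    nearest = sorted(math.dist(reflected, p) for p in data)[:knear]
    d_sym = sum(nearest) / len(nearest)
    # Scale by the Euclidean distance so that far-away centers are penalised.
    return d_sym * math.dist(x, center)

# Four points placed symmetrically around the origin.
data = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
d_true = point_symmetry_distance((1.0, 0.0), (0.0, 0.0), data)
d_wrong = point_symmetry_distance((1.0, 0.0), (3.0, 0.0), data)
```

Here `d_true` is much smaller than `d_wrong` because the data are symmetric about the origin but not about (3, 0), which is the property a symmetry-based assignment rule relies on.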
ISBN: (print) 9780769526355
This article presents an efficient two-stage clustering method for microarray gene expression time series data. The algorithm is based on the identification of genes having significant membership to multiple classes. A recently proposed variable string length genetic scheme and an iterated version of the well-known fuzzy C-means algorithm are utilized as the underlying clustering techniques. To demonstrate its effectiveness, the performance of the two-stage clustering technique has been compared, on some publicly available gene expression data, with that of the hierarchical clustering algorithms that are widely used for clustering gene expression data.
An evolutionary approach for designing a ligand molecule that can bind to the active site of a target protein is described in this article. An earlier attempt in this regard assumed a fixed tree structure of the ligan...
The problem of classifying an image into different homogeneous regions is viewed as the task of clustering the pixels in the intensity space. Real-coded variable string length genetic fuzzy clustering with automatic evolution of clusters is used here for this purpose. The cluster centers are encoded in the chromosomes, and the Xie-Beni index is used as a measure of the validity of the corresponding partition. The effectiveness of the proposed technique is demonstrated for classifying different landcover regions in remote sensing imagery. Results are compared with those obtained using the well-known fuzzy C-means algorithm.
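As an illustration of how a set of encoded cluster centers might be scored, the following Python sketch computes the Xie-Beni index in its usual form: fuzzy within-cluster compactness divided by the number of points times the smallest squared distance between two centers, with lower values indicating a better partition. The function names and the toy data are hypothetical, and the sketch assumes at least two centers.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def xie_beni(points, centers, memberships, m=2.0):
    """Xie-Beni validity index; memberships[j][i] is point j's degree in cluster i."""
    # Fuzzy compactness: membership-weighted squared distances to the centers.
    compactness = sum(
        (memberships[j][i] ** m) * sq_dist(points[j], centers[i])
        for j in range(len(points))
        for i in range(len(centers))
    )
    # Separation: smallest squared distance between any two distinct centers.
    separation = min(
        sq_dist(centers[i], centers[k])
        for i in range(len(centers))
        for k in range(i + 1, len(centers))
    )
    return compactness / (len(points) * separation)

points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
labels = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
xb_good = xie_beni(points, [(0.5, 0.0), (10.5, 0.0)], labels)
xb_bad = xie_beni(points, [(5.0, 0.0), (6.0, 0.0)], labels)
```

In a genetic clustering setting, a quantity such as the reciprocal of this index could serve as the fitness of a chromosome, so that well-placed centers (`xb_good`) are preferred over poorly placed ones (`xb_bad`).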
An analogy is established between a genetic algorithm based pattern classification scheme (where hyperplanes found through searching are used to approximate the class boundaries) and a multilayer perceptron (MLP) based classifier. Based on this, a method for determining the MLP architecture automatically is described. It is shown that the architecture would need at most two hidden layers, the neurons of which are responsible for generating hyperplanes and regions. The neurons in the second hidden layer and the output layer perform the AND and OR functions, respectively. The methodology also includes a post-processing step that automatically removes any redundant neurons in the hidden and output layers. An extensive comparative study is presented, for different data sets, of the performance of the MLP derived using the proposed method against that of several conventional MLPs.
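The layered structure described above (hyperplane neurons, then AND neurons forming regions, then an OR neuron forming a class) can be sketched in Python with hard-threshold units. The weights below are set by hand for a toy two-dimensional problem; they are not produced by a genetic algorithm, and all names are hypothetical.

```python
def step(value):
    """Hard-threshold activation: 1 if the input is positive, else 0."""
    return 1 if value > 0 else 0

def hyperplane_layer(x, hyperplanes):
    """First hidden layer: one neuron per hyperplane, reporting the side x lies on."""
    return [step(sum(w * xi for w, xi in zip(weights, x)) + bias)
            for weights, bias in hyperplanes]

def and_layer(sides, regions):
    """Second hidden layer: each neuron fires only inside one region.

    Each region is a list of signs: +1 requires the hyperplane neuron to be 1,
    -1 requires it to be 0, and 0 ignores that hyperplane.
    """
    outputs = []
    for signs in regions:
        required_ones = sum(1 for s in signs if s > 0)
        total = sum(s * h for s, h in zip(signs, sides))
        outputs.append(step(total - required_ones + 0.5))
    return outputs

def or_neuron(region_outputs):
    """Output neuron: fires if x falls in any region belonging to the class."""
    return step(sum(region_outputs) - 0.5)

# Toy class: the first and third quadrants, bounded by the lines x = 0 and y = 0.
hyperplanes = [((1.0, 0.0), 0.0), ((0.0, 1.0), 0.0)]
regions = [(+1, +1), (-1, -1)]

def classify(x):
    return or_neuron(and_layer(hyperplane_layer(x, hyperplanes), regions))
```

With these settings `classify((1, 1))` and `classify((-1, -1))` return 1, while points in the other two quadrants return 0, showing how two hidden layers suffice to represent a class made up of disjoint regions.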