ISBN: 9781450300643 (print)
In this paper, ant colony optimization (ACO) is used within the k-means algorithm to improve image segmentation. The learning mechanism of the algorithm is formulated using the ACO meta-heuristic. Because the pheromone governs how the ants explore candidate solutions, preliminary experiments on pheromone updating are reported. Two methods for defining and updating pheromone values are proposed and tested: one that uses spatial coordinate distances and one that does not. ACO improves the k-means algorithm by making it less dependent on its initial parameters.
Most business decisions are based on cost and benefit considerations. Data mining techniques that allow businesses to incorporate financial considerations are therefore more meaningful to decision makers. The decision-theoretic framework has been helpful in providing a better understanding of classification models. This study describes a semi-supervised decision-theoretic rough set model, based on an extension of the decision-theoretic model proposed by Yao. The proposal is used to model financial cost/benefit scenarios for a promotional campaign in a real-world retail store.
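To make the cost/benefit idea concrete, the following minimal sketch shows how the thresholds of Yao's basic decision-theoretic rough set model follow from a loss table. It is not the semi-supervised extension described in the abstract. The function names and the example loss values are hypothetical, chosen only for illustration.

```python
def dtrs_thresholds(l_pp, l_bp, l_np, l_pn, l_bn, l_nn):
    """Thresholds (alpha, beta) from a loss table.

    l_xp: loss of taking action x (p=accept, b=defer, n=reject)
          when the object belongs to the target class.
    l_xn: loss of the same action when it does not belong.
    Assumes l_pp <= l_bp < l_np and l_nn <= l_bn < l_pn.
    """
    alpha = (l_pn - l_bn) / ((l_pn - l_bn) + (l_bp - l_pp))
    beta = (l_bn - l_nn) / ((l_bn - l_nn) + (l_np - l_bp))
    return alpha, beta


def three_way_decision(prob, alpha, beta):
    """Map a conditional probability to a region (assumes beta < alpha)."""
    if prob >= alpha:
        return "positive"
    if prob <= beta:
        return "negative"
    return "boundary"


# Hypothetical losses: a wrong accept or reject costs 4, a deferral costs 1.
alpha, beta = dtrs_thresholds(l_pp=0, l_bp=1, l_np=4, l_pn=4, l_bn=1, l_nn=0)
decisions = [three_way_decision(p, alpha, beta) for p in (0.9, 0.5, 0.1)]
```

With these illustrative losses, a customer is targeted only when the estimated response probability is at least `alpha`, is excluded when it is at most `beta`, and is deferred otherwise.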
Authors: Cao, Fuyuan; Liang, Jiye; Jiang, Guang
Affiliations: Shanxi Univ, Sch Comp & Informat Technol, Taiyuan 030006, Shanxi, Peoples R China; Minist Educ, Key Lab Computat Intelligence & Chinese Informat, Taiyuan 030006, Peoples R China; Chinese Acad Sci, Key Lab Intelligent Informat Proc, Inst Comp Technol, Beijing 100190, Peoples R China
As a simple clustering method, the traditional k-means algorithm has been widely discussed and applied in pattern recognition and machine learning. However, the k-means algorithm cannot guarantee a unique clustering result because the initial cluster centers are chosen randomly. In this paper, the cohesion degree of the neighborhood of an object and the coupling degree between neighborhoods of objects are defined based on the neighborhood-based rough set model. Furthermore, a new initialization method is proposed, and its time complexity is analyzed. We study the influence of the three norms on clustering, and compare the clustering results of k-means under the three different initialization methods. The experimental results illustrate the effectiveness of the proposed method. (C) 2009 Elsevier Ltd. All rights reserved.
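The dependence on the initial centers can be seen in a small sketch of plain Lloyd-style k-means. This is a generic illustration, not the neighborhood-based initialization proposed in the paper. The function name `kmeans` and the four-point data set are assumptions made for the example.

```python
import math


def kmeans(points, centers, max_iter=100):
    """Plain k-means; `centers` is the caller-supplied initialization."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members))
                )
            else:
                new_centers.append(old)  # leave an empty cluster's center as is
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Four corners of a 4 x 1 rectangle (illustrative data).
points = [(0, 0), (0, 1), (4, 0), (4, 1)]
_, labels_a = kmeans(points, [(0, 0), (4, 0)])  # seeds on opposite short sides
_, labels_b = kmeans(points, [(0, 0), (0, 1)])  # seeds on the same short side
```

The two runs converge to different partitions of the same data: the first groups the points into left and right pairs, while the second stays in a poorer local optimum that groups them into bottom and top pairs.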
Surface water contamination from agricultural and urban runoff and from wastewater discharges by industrial and municipal activities is of major concern to people worldwide. Classical models can be insufficient to visualise the results because the water quality variables used to describe dynamic pollution sources are complex, multivariable, and nonlinearly related. Artificial intelligence techniques that can analyse multivariate water quality data through sophisticated visualisation offer an alternative to current models. In this study, the Kohonen self-organising feature map (SOM) neural network was first applied to analyse the complex nonlinear relationships among multivariable surface water quality variables, using the component planes of the variables to determine the complex behaviour of water quality parameters. The dependencies between water quality variables were extracted and interpreted using the pattern analysis visualised in the component planes. For further investigation, the k-means clustering algorithm was used to partition the maps, and the Davies-Bouldin clustering index was used to determine the optimal number of clusters, leading to seven groups or clusters corresponding to water quality variables. The results reveal that the concentrations of Na, K, Cl, NH4-N, NO2-N, o-PO4, component planes of organic matter (pV), and dissolved oxygen (DO) were significantly affected by seasonal changes, and that the SOM technique is an efficient tool with which to analyse and determine the complex behaviour of multidimensional surface water quality data. These results suggest that this technique could also be applied to other environmentally sensitive areas such as air and groundwater pollution.
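As a rough illustration of how the Davies-Bouldin index ranks candidate partitions, the sketch below computes it for two clusterings of the same four points. The function name `davies_bouldin` and the data are assumptions for the example. The study applies the index to partitions of a trained SOM, which is not reproduced here.

```python
import math


def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (lists of point tuples).

    Lower values indicate compact, well-separated clusters.
    Assumes at least two non-empty clusters with distinct centroids.
    """
    centroids = [tuple(sum(x) / len(c) for x in zip(*c)) for c in clusters]
    # Scatter: mean distance of a cluster's members to its centroid.
    scatter = [
        sum(math.dist(p, m) for p in c) / len(c)
        for c, m in zip(clusters, centroids)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k)
            if j != i
        )
    return total / k


tight = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]  # compact, far apart
loose = [[(0, 0), (10, 0)], [(0, 2), (10, 2)]]  # stretched, close together
db_tight = davies_bouldin(tight)
db_loose = davies_bouldin(loose)
```

To choose the number of clusters, one would typically run k-means for several values of k and keep the partition with the smallest index.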
Document clustering, or unsupervised document classification, is an automated process of grouping documents with similar content. A typical technique uses a similarity function to compare documents. In the literature, many similarity functions, such as the dot product or cosine measure, have been proposed for this comparison. In this thesis, we evaluate the effects a similarity function may have on clustering. We start by representing a document and a query as vectors in a high-dimensional space whose dimensions correspond to keywords. We then use an appropriate distance measure in k-means to compute the similarity between the document vector and the query vector and form clusters. Based on these clusters, we decide on the best distance metric for the document set used. Next, we compute the time complexities of the different similarity functions for the same model and document set, based on the number of iterations and the number of clusters.
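The difference between the two similarity functions named above can be sketched with a few keyword-count vectors. The vectors and variable names below are made up for illustration and are not taken from the thesis.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Dot product normalized by vector lengths; 0.0 if either vector is zero."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


# Keyword counts over a three-term vocabulary (hypothetical).
query = (1, 1, 0)
doc_short = (1, 1, 0)    # same topic as the query, short document
doc_long = (10, 10, 0)   # same topic, ten times longer
doc_other = (0, 3, 3)    # shares only one keyword with the query

dot_ranks_other_higher = dot(query, doc_other) > dot(query, doc_short)
cos_ranks_short_higher = (
    cosine_similarity(query, doc_short) > cosine_similarity(query, doc_other)
)
```

The dot product rewards raw term counts, so the partly related document outscores the short on-topic one. The cosine measure removes the effect of document length and reverses that ranking.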
Information about local protein sequence motifs is very important to the analysis of biologically significant conserved regions of protein sequences. These conserved regions can potentially determine the diverse conformations and activities of proteins. In this work, recurring sequence motifs of proteins are explored with an improved k-means clustering algorithm on a new dataset. The structural similarity of these recurring sequence clusters is studied in order to evaluate the relationship between sequence motifs and their structures. To the best of our knowledge, the dataset used in our research is the most up-to-date among similar studies of sequence motifs. A new greedy initialization method for the k-means algorithm is proposed to improve traditional k-means clustering techniques. The new initialization method tries to choose suitable initial points, which are well separated and have the potential to form high-quality clusters. Our experiments indicate that the improved k-means algorithm satisfactorily increases the percentage of sequence segments belonging to clusters with high structural similarity. Careful comparison of the sequence motifs obtained by the improved and traditional algorithms also suggests that the improved k-means clustering algorithm may discover some relatively weak and subtle sequence motifs that are undetectable by traditional k-means algorithms. Many biochemical tests reported in the literature show that these sequence motifs are biologically meaningful. Experimental results also indicate that the improved k-means algorithm generates more detailed sequence motifs representing common structures than previous research. Furthermore, these motifs are universally conserved sequence patterns across protein families, overcoming some weak points of other popular sequence motifs. The satisfactory result of the experiment suggests that this new k-means algorithm may be applied to other areas of bioinformatics resea
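The abstract does not spell out the greedy procedure, so the sketch below uses farthest-first seeding as a stand-in to show what "well separated" initial points can look like. It is not the authors' method. The function name `farthest_first_init` and the sample points are assumptions for the example.

```python
import math


def farthest_first_init(points, k):
    """Greedy seeding: start from the first point, then repeatedly add the
    point whose distance to its nearest already-chosen seed is largest."""
    seeds = [points[0]]
    while len(seeds) < k:
        next_seed = max(
            points, key=lambda p: min(math.dist(p, s) for s in seeds)
        )
        seeds.append(next_seed)
    return seeds


# Three loose groups of 2-D points (illustrative data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0)]
seeds = farthest_first_init(points, 3)
```

Each seed lands in a different group, which is the property a separation-based initialization aims for. A practical variant would also have to guard against picking outliers as seeds.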
ISBN: 9781424429271 (print)
Optimization currently plays an important role in problems related to engineering, management, commerce, and other fields. Recent trends in optimization point towards genetic algorithms and evolutionary approaches. Various genetic algorithms have been proposed, designed, and implemented for single-objective as well as multiobjective problems. GAS3 [2006] (Genetic Algorithm with Species and Sexual Selection), proposed by *** and ***, is a distributed quasi-steady-state real-coded genetic algorithm. In this work, we modify the GAS3 algorithm by introducing a reclustering module after its simple distance-based parameterless clustering (species formation). GAS3kM (Modifying Genetic Algorithm with Species and Sexual Selection by using k-means algorithm) uses the k-means clustering algorithm for reclustering. Experimental results show that GAS3kM outperformed GAS3 when tested on unimodal and multimodal test functions.
The k-means clustering algorithm is commonly used for palette design. If an adequate initial palette is selected, a good-quality reconstruction of a compressed colour image can be achieved. The major problem is its high computational cost. To accelerate the k-means clustering algorithm, two test conditions are employed in the proposed algorithm. The experimental results show that the proposed algorithm significantly cuts the computational cost of the k-means clustering algorithm without incurring any extra distortion.
ISBN: 9783540884231 (print)
Conventional clustering algorithms categorize an object into precisely one cluster. In many applications, the membership of some objects in a cluster can be ambiguous. Therefore, the ability to specify membership in multiple clusters can be useful in real-world applications. Fuzzy clustering makes it possible to specify the degree to which a given object belongs to a cluster. In rough set representations, an object may belong to more than one cluster, which is more flexible than conventional crisp clusters and less verbose than fuzzy clusters. The unsupervised nature of fuzzy and rough algorithms means that the level of precision depends on the choice of parameters. This paper describes how one can vary the precision of rough set clustering and studies its effect on synthetic and real-world data sets.
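One common way to obtain such rough clusters is a distance-ratio test in the style of rough k-means, sketched below for a single assignment pass. The abstract does not confirm that the paper uses this exact rule, so the function name `rough_assign`, the `threshold` parameter, and the data are assumptions.

```python
import math


def rough_assign(points, centers, threshold):
    """One assignment pass producing lower and upper approximations.

    A point joins the upper approximation of every cluster whose center is
    at most `threshold` times as far away as the nearest center. It joins a
    lower approximation only when exactly one cluster qualifies.
    Assumes threshold >= 1.
    """
    k = len(centers)
    lower = [[] for _ in range(k)]
    upper = [[] for _ in range(k)]
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        nearest = min(dists)
        close = [j for j in range(k) if dists[j] <= threshold * nearest]
        for j in close:
            upper[j].append(p)
        if len(close) == 1:
            lower[close[0]].append(p)  # unambiguous member
    return lower, upper


centers = [(0, 0), (10, 0)]
points = [(1, 0), (9, 0), (5, 0), (4, 0)]
lower_strict, upper_strict = rough_assign(points, centers, threshold=1.0)
lower_loose, upper_loose = rough_assign(points, centers, threshold=1.6)
```

Raising the threshold moves the point `(4, 0)` out of the first cluster's lower approximation and into the boundary region shared by both clusters, which is one way the precision of the clustering can be varied.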
ISBN: 9781424423354 (print)
This paper proposes a new similarity measure for content-based image retrieval (CBIR) systems. The similarity measure is based on the multidimensional generalization of the Wald-Wolfowitz (MWW) runs test and the k-means clustering algorithm. Performance comparisons between the proposed method and the current CBIR method based on the MWW runs test show that the proposed method provides higher performance for the same computational time.