ISBN: 0819440620 (print)
Clustering algorithms are useful whenever one needs to classify an excessive amount of information into a set of manageable and meaningful subsets. Using an analogy from vector analysis, a clustering algorithm can be said to divide state space into discrete "chunks" such that each vector lies within exactly one chunk. These vectors are best thought of as sets of features. A canonical vector is chosen for each region of state space to represent all vectors located within that region. This paper presents a survey of clustering algorithms, paying particular attention to those that require the least a priori knowledge about the domain being clustered. In the current work, an algorithm is compelling to the extent that it minimizes assumptions about the distribution of the vectors being classified.
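The "chunks" and canonical vectors described in this abstract can be illustrated with a minimal sketch. The abstract names no specific algorithm, so the nearest-representative rule, the Euclidean distance, and the names `nearest_representative` and `partition` are all assumptions chosen for illustration; the representatives are given by hand rather than learned.

```python
import math

def nearest_representative(vector, representatives):
    """Return the index of the representative closest to `vector`.

    Assumption: closeness is measured with Euclidean distance.
    """
    return min(
        range(len(representatives)),
        key=lambda i: math.dist(vector, representatives[i]),
    )

def partition(vectors, representatives):
    """Place every vector in exactly one region, keyed by representative index."""
    regions = {i: [] for i in range(len(representatives))}
    for v in vectors:
        regions[nearest_representative(v, representatives)].append(v)
    return regions

# Hypothetical data: two hand-picked canonical vectors and four feature vectors.
representatives = [(0.0, 0.0), (10.0, 10.0)]
vectors = [(1.0, 2.0), (9.0, 8.0), (0.5, -1.0), (11.0, 12.0)]
regions = partition(vectors, representatives)
```

Here each feature vector ends up in one region only, and the representative of that region stands in for it. An actual clustering algorithm would also have to choose the representatives, which is where assumptions about the data distribution usually enter.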
Among power system corrective controls, defensive islanding is considered the last resort to secure the system from severe cascading contingencies. The primary motive of defensive islanding is to limit the affe...
Unsupervised segmentation is a key step towards automatic analysis and understanding of MR images. A number of techniques based on multi-dimensional data classification have been applied to this problem. Since most un...
Distance measures play a vital role in clustering algorithms. Selecting the right distance measure for a given dataset is a challenging problem. In this paper, the effect of six distance measures on three clustering a...
User-generated content from fora, weblogs and other social networks is a very fast-growing data source for which various information extraction algorithms can provide convenient data access. Hierarchical clustering...
In data mining, clustering is the most popular, powerful and commonly used unsupervised learning technique. It groups similar data objects into clusters based on some similarity measure. Clustering algorithms ca...
ISBN: 9781604234961 (print)
A symmetric dataset is defined as an n x n dataset that is equal to its own transpose. In data mining algorithms that employ a vertical data structure, symmetric datasets are used, for example, in clustering gene expression patterns and in density-based clustering algorithms. In the former, the symmetric dataset is the pre-computed pairwise similarity of genes, while in the latter it is the pre-computed pairwise Euclidean distance of objects. We found that when n is large, arranging a symmetric dataset as n x n is impractical for vertical algorithms because a large number of vertical bit sequences must be generated. In this paper, we propose an alternative arrangement of symmetric datasets for vertical clustering algorithms. The dataset is arranged as n' x m, where n' > m, instead of n x n, while the number of elements remains the same. In other words, the cardinality of the dataset is extended, but the dimension is narrowed. More importantly, the results should be similar to those obtained when the symmetric dataset is arranged as n x n. The experimental results show that the proposed arrangement greatly reduces the time needed to generate and load the vertical bit vectors.
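The idea of trading dimension for cardinality while keeping the element count fixed can be sketched as follows. The abstract does not say how elements are mapped into the n' x m layout, so the row-major flattening used here is purely an assumption, and `is_symmetric` and `rearrange` are hypothetical helper names, not the paper's method.

```python
def is_symmetric(matrix):
    """Check that a square matrix equals its own transpose."""
    n = len(matrix)
    return all(len(row) == n for row in matrix) and all(
        matrix[i][j] == matrix[j][i] for i in range(n) for j in range(i)
    )

def rearrange(matrix, m):
    """Re-lay an n x n matrix as n' x m rows with the same number of elements.

    Assumption: elements are taken in row-major order; m must divide n * n.
    """
    flat = [x for row in matrix for x in row]
    if len(flat) % m != 0:
        raise ValueError("m must divide the total number of elements")
    return [flat[k:k + m] for k in range(0, len(flat), m)]

# Hypothetical 4 x 4 pairwise-distance matrix.
distances = [
    [0, 3, 4, 5],
    [3, 0, 6, 7],
    [4, 6, 0, 8],
    [5, 7, 8, 0],
]
narrow = rearrange(distances, 2)  # 4 x 4 becomes 8 x 2
```

With fewer columns, a vertical representation would need fewer bit sequences per layout, which is the motivation the abstract gives; how pairwise lookups are recovered from the narrower layout is not described in the abstract and is not modeled here.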
The successful provision of context-aware services requires a balance between the extent of personalization desired and the user's need for privacy. Two major elements play a signif...
Data mining is a business-effective technology that enhances customer experience and eases decision-making along the digital transformation journey. The main goal of this research paper is ...
Word Sense Disambiguation in text remains a difficult problem, as the best supervised methods require laborious and costly manual preparation of training data. Thus, this work focuses on the evaluation of a few selected c...