ISBN: 9780898716535 (print)
Consensus clustering is the problem of reconciling clustering information about the same data set coming from different sources or from different runs of the same algorithm. Cast as an optimization problem, consensus clustering is known as median partition, and has been shown to be NP-complete. A number of heuristics have been proposed as approximate solutions, some with performance guarantees. In practice, the problem is apparently easy to approximate, but guidance is needed as to which heuristic to use depending on the number of elements and clusterings given. We have implemented a number of heuristics for the consensus clustering problem, and here we compare their performance, independent of data size, in terms of efficacy and efficiency, on both simulated and real data sets. We find that, based on the underlying algorithms and their behavior in practice, the heuristics can be categorized into two distinct groups, with ramifications as to which one to use in a given situation, and that a hybrid solution is the best bet in general. We have also developed a refined consensus clustering heuristic for the occasions when the given clusterings may be too disparate, and their consensus may not be representative of any one of them, and we show that in practice the refined consensus clusterings can be much superior to the general consensus clustering.
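As a minimal illustration of the median-partition objective, the sketch below measures the disagreement between two clusterings by counting the element pairs that are grouped together in one but apart in the other (the symmetric-difference distance on pairs). It then applies one of the simplest possible heuristics: return the input clustering with the smallest total distance to all inputs. The function names and the label-list encoding are assumptions made for this example; this is not the set of heuristics evaluated in the paper.

```python
from itertools import combinations


def pair_distance(a, b):
    """Count element pairs on which two clusterings disagree.

    Each clustering is a list of cluster labels, one label per element.
    """
    if len(a) != len(b):
        raise ValueError("clusterings must cover the same elements")
    disagreements = 0
    for i, j in combinations(range(len(a)), 2):
        same_in_a = a[i] == a[j]
        same_in_b = b[i] == b[j]
        if same_in_a != same_in_b:
            disagreements += 1
    return disagreements


def total_distance(candidate, clusterings):
    """Median-partition objective: summed distance to every input clustering."""
    return sum(pair_distance(candidate, c) for c in clusterings)


def best_of_k(clusterings):
    """Simple heuristic: pick the input clustering that minimizes the objective."""
    return min(clusterings, key=lambda c: total_distance(c, clusterings))


# Three hypothetical clusterings of the same five elements.
inputs = [
    [0, 0, 1, 1, 2],
    [0, 0, 1, 1, 1],
    [0, 1, 1, 1, 1],
]
consensus = best_of_k(inputs)
```

Restricting the search to the input clusterings keeps the sketch short; a true median partition may be a clustering that does not appear among the inputs.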
ISBN: 9781509018895 (print)
Although many algorithms have been proposed for the camera-based detection of road features (such as road markings, curbstones and road borders), truly contextual or relational information between the detections is rarely used. This is all the more surprising since a lot of potential remains unused for rejecting outliers and for compensating for detection failures, multiple detections, misclassification or fragmentation. The aim of this paper is to present an approach that is suitable for such a task in both online and offline applications as a post-processing step after the actual detection and classification step. This is achieved by adapting a perception-based line-clustering algorithm that groups the pre-classified road features based on their relations and assigns them a final class. The grouped features are then fused to form continuous lines instead of individual dashes or fragmented lines. The evaluation on a 10 km drive in both rural and urban environments, as well as an online test on a short highway driving sequence, shows that this approach can improve the performance of road feature detection at a low computational cost.
Computational morphological analysis comprises the development of measures (indicators) that describe different form attributes of a neuron and provides additional parameters for classification algorithms. Our work ad...
ISBN: 9781457721977 (print)
This article uses analytical methods to assess reductions in total costs of telematic systems that can result from common infrastructure utilization. Analytical methods based on clustering and K-minimum spanning tree can be adopted for finding clusters or sets which maximize reductions in total system costs due to infrastructure sharing between telematic systems. Efficient integration of telematic systems through infrastructure sharing can positively influence telematic service interoperability while reducing costs. Results show the measure of synergy for each K-value, as well as total cost savings of up to 2%.
ISBN: 0898715687 (print)
Recent works in unsupervised learning have emphasized the need to understand a new trend in algorithmic design, which is to influence the clustering via weights on the instance points. In this paper, we handle clustering as a constrained minimization of a Bregman divergence. Theoretical results show benefits resembling those of boosting algorithms, and bring new modified weighted versions of clustering algorithms such as k-means, expectation-maximization (EM) and k-harmonic means. Experiments display the quality of the results obtained, and corroborate the advantages that subtle data reweightings may bring to clustering.
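To make the role of instance weights concrete, the sketch below shows how weights enter a k-means style centroid update when the divergence is the squared Euclidean distance, one member of the Bregman family. With that divergence, the center that minimizes the weighted cost of a cluster is the weighted mean of its points. The function names and the toy data are assumptions for illustration; the paper's own reweighting scheme is not reproduced here.

```python
def weighted_centroid(points, weights):
    """Weighted mean of a cluster: the minimizer of the weighted squared-error cost."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(points[0])
    return [
        sum(w * p[d] for p, w in zip(points, weights)) / total
        for d in range(dim)
    ]


def weighted_cost(points, weights, center):
    """Weighted sum of squared Euclidean distances from each point to the center."""
    return sum(
        w * sum((x - c) ** 2 for x, c in zip(p, center))
        for p, w in zip(points, weights)
    )


# Hypothetical cluster in which the third point carries twice the weight.
points = [[0.0, 0.0], [2.0, 0.0], [4.0, 4.0]]
weights = [1.0, 1.0, 2.0]
center = weighted_centroid(points, weights)
```

Raising the weight of a point pulls the center toward it, which is how a reweighting scheme can steer the resulting clustering.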
ISBN: 9783642172977 (print)
Partitional algorithms form an extremely popular class of clustering algorithms. Primarily, these algorithms can be classified into two sub-categories: a) k-means based algorithms that presume the knowledge of a suitable k, and b) algorithms such as Leader, which take a distance threshold value, tau, as an input. In this work, we make the following contributions. We 1) propose a novel technique, EPIC, which is based on both the number of clusters, k, and the distance threshold, tau; 2) demonstrate that the proposed algorithm achieves better performance than the standard k-means algorithm; and 3) present a generic scheme for integrating EPIC into different classification algorithms to reduce their training time complexity.
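For readers unfamiliar with threshold-based partitioning, the following is a minimal sketch of a Leader-style pass, assuming the common variant in which each point joins the first existing leader within distance tau and otherwise becomes a new leader. It illustrates category b) only; it is not EPIC, and the function name and sample points are hypothetical.

```python
import math


def leader_clustering(points, tau):
    """Single pass over the data: assign each point to the first leader within tau."""
    leaders = []
    labels = []
    for p in points:
        for idx, leader in enumerate(leaders):
            if math.dist(p, leader) <= tau:
                labels.append(idx)
                break
        else:
            # No leader is close enough, so this point starts a new cluster.
            leaders.append(p)
            labels.append(len(leaders) - 1)
    return leaders, labels


# Hypothetical 2-D points forming two well-separated groups.
points = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0), (5.2, 4.9), (0.1, 0.3)]
leaders, labels = leader_clustering(points, tau=1.0)
```

The number of clusters is not chosen in advance: it follows from tau and from the order in which the points are scanned.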
ISBN: 9781424418206 (print)
We present a novel framework that applies a meta-learning approach to clustering algorithms. Given a dataset, our meta-learning approach provides a ranking for the candidate algorithms that could be used with that dataset. This ranking could, among other things, support non-expert users in the algorithm selection task. In order to evaluate the framework proposed, we implement a prototype that employs regression support vector machines as the meta-learner. Our case study is developed in the context of cancer gene expression microarray datasets.
ISBN: 9781424414567 (print)
A tool that conveys the content of a given video is becoming a necessity in the current Web scenario, where the presence of videos is increasing day after day. Dynamic summarization techniques can be used for this purpose, as they build a video abstract by selecting and sequencing short video clips extracted from the original video. Needless to say, the selection process is critical. In this paper we focus our attention on clustering algorithms to provide such selection, and we investigate the effects of their use in the Web scenario. Clustering algorithms are very effective in producing static video summaries, but few works consider them for video abstract production. For this reason, we set up an experimental scenario where we investigate their performance considering different categories of video, different abstract lengths and different low-level video analyses. Results show that clustering techniques can be useful only for some categories of videos and only if the selection process is based on video scene characteristics. Furthermore, the investigation also shows that to provide a customized service (in which the user can freely decide the abstract time length), only fast clustering algorithms should be used.
ISBN: 9783540877318 (print)
Since little prior knowledge about remote sensing images can be obtained before performing recognition tasks, various unsupervised classification methods have been applied to this problem. Choosing an appropriate clustering method is therefore critical to achieving good results. However, there is no standard criterion for deciding which clustering method is more suitable or more effective. In this paper, we conduct a comparative study of three clustering methods: C-Means, Finite Mixture Model clustering, and Affinity Propagation. The advantages and disadvantages of each method are evaluated through experiments and classification results.
ISBN: 9780769534404 (print)
One aspect of a clustering algorithm that should be considered when choosing an appropriate algorithm for an unsupervised learning task is stability. A clustering algorithm is stable (on a dataset) if, when run on a (sub)sample of the dataset, it produces the same clustering as when run on the whole dataset. In this paper, we report the results of an empirical study on the stability of two clustering algorithms, namely k-Means and normalized spectral clustering, along with some analysis of those results that is useful for practitioners who deal with scalability and researchers who employ stability as a tool for model selection.
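The stability notion described above can be sketched as a small experiment: cluster the whole dataset, cluster a subsample, and measure how often the two results agree on whether a pair of sampled points belongs together. The code below is a toy version under stated assumptions: a tiny 1-D Lloyd-style k-means with a deterministic start, and a pair-agreement (Rand-style) score standing in for "the same clustering". All function names are hypothetical, and the study's actual protocol is not reproduced.

```python
from itertools import combinations


def kmeans_1d(values, k, iterations=20):
    """Tiny Lloyd-style k-means on 1-D data with a deterministic start."""
    ordered = sorted(values)
    # Assumption: seed the centers at evenly spaced positions of the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: move each non-empty center to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels


def pair_agreement(a, b):
    """Fraction of element pairs that both labelings treat the same way."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def stability(values, sample_idx, k):
    """Compare the full-data clustering, restricted to the sample, with the sample's own clustering."""
    full = kmeans_1d(values, k)
    sub = kmeans_1d([values[i] for i in sample_idx], k)
    full_on_sample = [full[i] for i in sample_idx]
    return pair_agreement(full_on_sample, sub)


# Hypothetical data with two clear groups, and a five-point subsample.
data = [1.0, 1.2, 0.8, 1.1, 9.0, 9.3, 8.7, 9.1]
score = stability(data, sample_idx=[0, 2, 4, 5, 7], k=2)
```

A score of 1 means the subsample clustering reproduces the full-data clustering on the sampled points; in practice the score would be averaged over many random subsamples.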