This paper studies the multivehicle task assignment problem in which several dispersed vehicles need to visit a set of target locations in a time-invariant drift field while trying to minimize the total travel time. Using optimal control theory, we first design a path planning algorithm that minimizes the time for each vehicle to travel between two given locations in the drift field. The path planning algorithm provides the cost matrix for the target assignment and generates routes once the target locations are assigned to a vehicle. Then, we propose several clustering strategies to assign the targets, and we use two metrics to determine the visiting sequence of the targets clustered to each vehicle. The cost matrix, which specifies the minimum time for a vehicle to travel between any two target locations, is obtained using the path planning algorithm and is in general asymmetric because of the time-invariant currents of the drift field. We show that one of the clustering strategies obtains a min-cost arborescence of the asymmetric target-vehicle graph, where the weight of a directed edge between two vertices is the minimum travel time from one vertex to the other respecting the orientation. Using tools from graph theory, a lower bound on the optimal solution is found, which can be used to measure the proximity of a solution to the optimal. Furthermore, by integrating the target clustering strategies with the target visiting metrics, we obtain several task assignment algorithms. Among them, two algorithms guarantee that all the target locations will be visited within a computable maximal travel time, which is at most twice the optimal when the cost matrix is symmetric. Finally, numerical simulations show that the algorithms quickly lead to solutions that are close to the optimal.
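As a rough illustration of the graph-theoretic lower bound mentioned above, the following Python sketch (not the authors' implementation; the travel-time matrix is made-up data) builds a directed graph from an asymmetric cost matrix and extracts a min-cost spanning arborescence with networkx:

```python
# Hedged sketch: min-cost arborescence on an asymmetric travel-time graph
# (illustrative data, not the paper's implementation).
import networkx as nx
import numpy as np

# travel_time[i][j]: minimum time to go from location i to location j.
# Asymmetric because the drift helps in one direction and hinders in the other.
travel_time = np.array([
    [0.0, 2.0, 5.0, 4.0],
    [3.0, 0.0, 2.5, 6.0],
    [6.0, 3.5, 0.0, 1.5],
    [4.5, 7.0, 2.0, 0.0],
])

G = nx.DiGraph()
n = travel_time.shape[0]
for i in range(n):
    for j in range(n):
        if i != j:
            G.add_edge(i, j, weight=travel_time[i, j])

# A spanning arborescence reaches every node through exactly one incoming
# edge; its total weight is the kind of graph-theoretic lower bound the
# paper uses to gauge how far a task assignment is from the optimum.
arb = nx.minimum_spanning_arborescence(G)
lower_bound = sum(G[u][v]["weight"] for u, v in arb.edges())
print(f"arborescence weight (lower-bound style quantity): {lower_bound:.2f}")
```

The asymmetry of the matrix is what makes a directed arborescence, rather than an undirected spanning tree, the natural object here.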
The performance of clustering algorithms is evaluated with the help of accuracy metrics. There is a great diversity of clustering algorithms, which are key components of many data analysis and exploration systems. However, only a few metrics exist for measuring the accuracy of overlapping and multi-resolution clustering algorithms on large datasets. In this paper, we first discuss existing metrics, how they satisfy a set of formal constraints, and how they can be applied to specific cases. Then, we propose several optimizations and extensions of these metrics. More specifically, we introduce a new indexing technique that reduces both the runtime and the memory complexity of the Mean F1 score evaluation. Our technique can be applied to large datasets and is faster on a single CPU than state-of-the-art implementations running on high-performance servers. In addition, we propose several extensions of the discussed metrics that improve their effectiveness and their satisfaction of formal constraints without affecting their efficiency. All the metrics discussed in this paper are implemented in C++ and are available for free as open-source packages that can be used either as stand-alone tools or as part of a benchmarking system to compare various clustering algorithms.
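As a point of reference for the Mean F1 score discussed above, a naive implementation without the paper's indexing optimization might look like the following sketch; the symmetrized averaging used here is a common convention and is an assumption rather than the paper's exact definition:

```python
# Hedged sketch of a Mean F1 score between two (possibly overlapping)
# clusterings, each given as a list of sets of member ids. This naive
# version is O(|C1| * |C2|); the paper's indexing technique exists
# precisely to avoid this quadratic comparison.
def f1(a: set, b: set) -> float:
    inter = len(a & b)
    if inter == 0:
        return 0.0
    precision = inter / len(b)
    recall = inter / len(a)
    return 2 * precision * recall / (precision + recall)

def avg_best_f1(cs1, cs2) -> float:
    # For each cluster in cs1, take its best-matching cluster in cs2.
    return sum(max(f1(c1, c2) for c2 in cs2) for c1 in cs1) / len(cs1)

def mean_f1(cs1, cs2) -> float:
    # Symmetrized: average of the best-match F1 in both directions.
    return 0.5 * (avg_best_f1(cs1, cs2) + avg_best_f1(cs2, cs1))

ground_truth = [{1, 2, 3}, {3, 4, 5, 6}]      # overlapping clusters
candidate    = [{1, 2}, {3, 4, 5}, {6, 7}]
print(mean_f1(ground_truth, candidate))
```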
Most clustering algorithms treat all data objects as good objects when dividing a data set into clusters; only a few account for noise/outliers to some extent. As a result, those algorithms are not capable of producing efficient clusters, since noise affects the location of the cluster centroids. The task of outlier identification is to find small groups of data objects that are exceptional when compared with the rest of the data. Such objects are neither required nor acceptable when dividing a data set into clusters, since clusters refer to groups of similar data and outliers do not belong to any such group, yet they can be important in other applications. In this paper we argue that efficient clusters can only be produced by identifying outliers and separating them from the data set into one cluster before applying any clustering algorithm, and we propose a density-based algorithm for outlier identification. Before any clustering algorithm is applied, the proposed algorithm is run on the data set to identify outliers and separate them from the original data. The proposed algorithm is combined with fuzzy clustering algorithms (FCM, PCM, and PFCM). Numerical examples and tests show that the fuzzy algorithms give better results after applying the proposed algorithm than without it.
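The following sketch illustrates the general idea of separating outliers by a density criterion before clustering; it is a generic k-nearest-neighbor-distance filter for illustration, not the specific algorithm proposed in the paper:

```python
# Hedged sketch of a density-based outlier pre-filter: points in sparse
# neighborhoods are split off into a separate "outlier cluster" before any
# clustering algorithm is run on the remaining data.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def split_outliers(X, k=5, factor=2.0):
    """Return (inliers, outliers) based on mean distance to k neighbors."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dist, _ = nn.kneighbors(X)                 # column 0 is the point itself
    density_score = dist[:, 1:].mean(axis=1)   # larger => sparser region
    threshold = factor * np.median(density_score)
    mask = density_score <= threshold
    return X[mask], X[~mask]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)),     # dense cluster
               rng.normal(8, 1, (100, 2)),     # second dense cluster
               rng.uniform(-10, 20, (5, 2))])  # scattered noise points
inliers, outliers = split_outliers(X)
# FCM / PCM / PFCM would then be applied to `inliers` only, with
# `outliers` kept aside as their own cluster.
print(len(inliers), "inliers,", len(outliers), "outliers")
```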
ISBN (print): 9781509035670
The main goal of clustering algorithms is to organize a given set of data patterns into groups (clusters), and their main strategy is to group patterns based on their similarity. However, some clustering algorithms also require, as an input parameter, the number of clusters the induced clustering should have, or a threshold value used to limit the number of induced clusters. Both the number of clusters and the threshold value are often unknown, yet it is well known that clustering results can be very sensitive to them. This work presents a method for empirically estimating both values. The method is based on multiple runs of sequential clustering algorithms using increasing threshold values. Results from experiments conducted on several data domains from two repositories, UCI and Keel, as well as a few artificially created datasets, are presented, and a comparative analysis is carried out as evidence of the good estimates of both values given by the method.
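A minimal sketch of the underlying idea, assuming a leader-style sequential clustering and a simple plateau-based selection rule (both assumptions for illustration, not necessarily the paper's exact procedure), could look as follows:

```python
# Hedged sketch: estimate a threshold and a cluster count by running a
# one-pass sequential (leader-style) clustering over increasing thresholds
# and picking the middle of the longest stable run of cluster counts.
import numpy as np

def leader_clustering(X, threshold):
    """One-pass sequential clustering: return the number of induced clusters."""
    leaders = []
    for x in X:
        if not leaders or min(np.linalg.norm(x - l) for l in leaders) > threshold:
            leaders.append(x)            # start a new cluster
    return len(leaders)

def estimate_threshold(X, thresholds):
    counts = [leader_clustering(X, t) for t in thresholds]
    best_start, best_len, start = 0, 1, 0
    for i in range(1, len(counts) + 1):
        if i == len(counts) or counts[i] != counts[start]:
            if i - start > best_len:
                best_start, best_len = start, i - start
            start = i
    mid = best_start + best_len // 2
    return thresholds[mid], counts[mid]

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.3, (50, 2)) for c in ((0, 0), (3, 3), (6, 0))])
t, k = estimate_threshold(X, np.linspace(0.2, 4.0, 40))
print(f"estimated threshold ~ {t:.2f}, estimated clusters ~ {k}")
```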
The number of Web documents is enormous. Text clustering places documents that have the most words in common into the same cluster, so that a web search engine can structure the large result set returned for a given query. In this article, we study three kinds of clustering algorithms: prototype-based, density-based, and hierarchical. We compare two typical algorithms, K-medoids and DBSCAN. The results show that K-medoids is sensitive to the initial center points and that DBSCAN performs better.
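A small sketch of such a comparison, assuming scikit-learn for DBSCAN, the separate scikit-learn-extra package for K-medoids, and a toy TF-IDF corpus in place of real web documents:

```python
# Hedged sketch comparing K-medoids and DBSCAN on TF-IDF vectors of a toy
# corpus (KMedoids comes from the scikit-learn-extra add-on package).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import DBSCAN
from sklearn.metrics import silhouette_score
from sklearn_extra.cluster import KMedoids

docs = [
    "web search engine result clustering",
    "search engine index web pages",
    "fuzzy clustering of image pixels",
    "image segmentation with fuzzy c means",
    "power load forecasting in distribution networks",
    "electric load forecasting case study",
]
X = TfidfVectorizer().fit_transform(docs).toarray()

km = KMedoids(n_clusters=3, random_state=0).fit(X)
db = DBSCAN(eps=1.0, min_samples=2).fit(X)

print("K-medoids labels:", km.labels_)
print("DBSCAN labels:   ", db.labels_)   # -1 marks points treated as noise
print("K-medoids silhouette:", silhouette_score(X, km.labels_))
```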
Color quantization is an important operation with many applications in graphics and image processing. Most quantization methods are essentially based on data clustering algorithms. Recent studies have demonstrated the effectiveness of the hard c-means (k-means) clustering algorithm in this domain. Other studies reported similar findings pertaining to the fuzzy c-means algorithm. Interestingly, none of these studies directly compared the two types of c-means algorithms. In this study, we implement fast and exact variants of the hard and fuzzy c-means algorithms with several initialization schemes and then compare the resulting quantizers on a diverse set of images. The results demonstrate that fuzzy c-means is significantly slower than hard c-means, and that with respect to output quality the former algorithm is neither objectively nor subjectively superior to the latter.
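For context, a minimal hard c-means (k-means) color quantizer can be sketched as follows; this uses scikit-learn and Pillow, with "photo.jpg" as a placeholder file name, and is not the fast and exact variant implemented in the paper:

```python
# Hedged sketch of k-means color quantization: cluster the RGB pixels and
# replace each pixel with its cluster center, reducing the palette size.
import numpy as np
from PIL import Image
from sklearn.cluster import KMeans

n_colors = 16
img = np.asarray(Image.open("photo.jpg").convert("RGB"), dtype=np.float64)
pixels = img.reshape(-1, 3)

km = KMeans(n_clusters=n_colors, n_init=10, random_state=0).fit(pixels)
palette = km.cluster_centers_                       # the quantized palette
quantized = palette[km.labels_].reshape(img.shape)  # map pixels to centers

Image.fromarray(quantized.astype(np.uint8)).save("photo_quantized.png")
```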
Inspired by the recent successes of boosting algorithms, a trend in unsupervised learning has begun to emphasize the need to explore the design of weighted clustering algorithms. We handle clustering as a constrained minimization of a Bregman divergence. Theoretical results show benefits resembling those of boosting algorithms and yield new weighted versions of clustering algorithms such as k-means, expectation-maximization (EM), and k-harmonic means. Experiments demonstrate the quality of the results obtained and corroborate the advantages that subtle data reweightings may indeed bring to clustering.
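To make the notion of a weighted clustering algorithm concrete, here is a generic weighted k-means sketch in which per-point weights enter the centroid update as a weighted mean; the reweighting scheme itself, which is the paper's contribution, is deliberately omitted:

```python
# Hedged sketch of weighted k-means: per-point weights w_i make heavier
# points pull the centers more strongly. With uniform weights this reduces
# to ordinary k-means.
import numpy as np

def weighted_kmeans(X, w, k, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Weighted centroid update.
        for j in range(k):
            m = labels == j
            if m.any():
                centers[j] = np.average(X[m], axis=0, weights=w[m])
    return centers, labels

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 1, (80, 2)), rng.normal(5, 1, (80, 2))])
w = np.ones(len(X))          # uniform weights: plain k-means behavior
centers, labels = weighted_kmeans(X, w, k=2)
print(centers)
```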
Load forecasting is one of the critical activities in electric power system planning. This paper presents clustering algorithms and their use in load forecasting through a case study in Zagreb, Croatia. Load data acquisition is not always conducted systematically in distribution networks, and some data often has to be extrapolated. For such methods to work, additional computation and grouping algorithms have to be combined with classical trend forecasting methods. Furthermore, the paper emphasizes load forecasting in areas with no load history.
Several software clustering algorithms have been proposed in the literature, each with its own strengths and weaknesses. Most of these algorithms have been applied to particular software systems with considerable success. However, the question of how to select a software clustering algorithm that is best suited for a specific software system remains unanswered. In this paper, we introduce a method for the selection of a software clustering algorithm for specific needs. The proposed method is based on a newly introduced formal description template for software clustering algorithms. Using the same template, we also introduce a method for software clustering algorithm improvement.
This paper presents a new approach to fuzzy clustering, which provides the basis for the development of maximum entropy clustering algorithms (MECA). The derivation of the proposed algorithms is based on an objective function incorporating the partition entropy and the average distortion between the prototypes and the feature vectors. This formulation allows the gradual transition from a maximum uncertainty or minimum selectivity phase to a minimum uncertainty or maximum selectivity phase during the clustering process. The application of the proposed algorithms in image compression based on vector quantization provides the basis for evaluating their computational efficiency and comparing the quality of the resulting codebook design with that provided by competing techniques.
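For orientation, an objective of the kind described, combining average distortion with partition entropy, often takes a form such as the following; this is a generic sketch, not necessarily the paper's exact formulation:

```latex
% Hedged sketch of a generic maximum-entropy clustering objective: a
% weighted sum of the average distortion between feature vectors x_i and
% prototypes v_j and the partition entropy of the memberships u_{ij}.
% Increasing \alpha shifts the balance from a maximum-uncertainty
% (minimum-selectivity) phase toward a minimum-uncertainty
% (maximum-selectivity) phase.
\[
  J(U, V) \;=\; \alpha \sum_{i=1}^{N} \sum_{j=1}^{c} u_{ij}\, \lVert x_i - v_j \rVert^2
  \;+\; (1 - \alpha) \sum_{i=1}^{N} \sum_{j=1}^{c} u_{ij} \ln u_{ij},
  \qquad \sum_{j=1}^{c} u_{ij} = 1 .
\]
```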