Building on the recently published point symmetry distance (PSD) measure, this paper presents a novel PSD measure, the symmetry similarity level (SSL) operator, for the k-means algorithm. Our proposed modified point symmetry-based k-means (MPSk) algorithm is more robust than the previous PSk algorithm by Su and Chou. Not only is the proposed MPSk algorithm suitable for symmetrical intra-clusters, as the PSk algorithm is, but it is also suitable for symmetrical inter-clusters. In addition, two speedup strategies are presented to reduce the running time of the proposed MPSk algorithm. Experimental results demonstrate a significant execution-time improvement and the extension to symmetrical inter-clusters when the proposed MPSk algorithm is compared with the previous PSk algorithm. (c) 2005 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
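For orientation, the point symmetry distance that this abstract builds on can be sketched as follows. This is a minimal illustration of the commonly cited Su and Chou form, in which a point is close to a center when some other data point is nearly its mirror image through that center. The function name and the tuple-based data layout are assumptions made for the example, and the SSL operator itself is not reproduced because the abstract does not define it.

```python
import math


def _norm(v):
    """Euclidean norm of a vector given as a sequence of numbers."""
    return math.sqrt(sum(a * a for a in v))


def point_symmetry_distance(x, center, points):
    """Point symmetry distance of x with respect to center.

    Hypothetical helper: for every other point p it evaluates
    ||(x - c) + (p - c)|| / (||x - c|| + ||p - c||) and returns the minimum.
    A value near 0 means some p is almost the reflection of x through c.
    """
    dx = [a - c for a, c in zip(x, center)]
    norm_dx = _norm(dx)
    best = math.inf
    for p in points:
        if tuple(p) == tuple(x):
            continue  # the point itself is excluded from the search
        dp = [a - c for a, c in zip(p, center)]
        denom = norm_dx + _norm(dp)
        if denom == 0.0:
            continue  # both points coincide with the center
        num = _norm([a + b for a, b in zip(dx, dp)])
        best = min(best, num / denom)
    return best
```

With the points (1, 0), (-1, 0) and (0, 2) and the center at the origin, the first point has a perfect mirror image and gets distance 0, while (0, 2) has none and gets a larger value.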
ISBN: 1845641787 (print)
Many problems in the credit market involve data that must be classified into distinct groups. This paper introduces a financial k-means algorithm which, based on historical financial ratios, applies cluster analysis to the listed enterprises in Zhejiang province. We analyze indicators related to financial attributes and choose nine financial indicators. To obtain a better valuation of the listed companies, we use trial and error and choose four as the number of clusters. Testing shows that cluster 2 and cluster 3 together contain 71 companies, or 87% of the total. These are all creditworthy companies, which is consistent with the good economic situation of Zhejiang province. Cluster 4 contains nine companies, or 11%, that are judged to be high-risk businesses. Banks should therefore require a mortgage or guarantee when lending to these customers.
To improve the traffic efficiency of official vehicles in the road network, a backpressure routing control strategy for multi-commodity flow (official traffic flow) is proposed that uses environmental data from the official vehicle network. First, the road network formed by the vehicle-mounted wireless network nodes of official service vehicles is used to collect information on road conditions and on the vehicles themselves. To make route control more responsive in real time and more forward-looking, an official service vehicle flow forecasting method is introduced to construct a virtual official service vehicle queue. A backpressure routing method for multi-commodity flow (official service vehicle flow) is proposed, and an official service vehicle control strategy is designed to improve the self-adaptive routing of the k-means algorithm. In addition, the weight of the backpressure strategy is adjusted according to traffic pressure conditions, and the adaptability of the backpressure routing algorithm is improved by using optimized parameters. Finally, simulation results show that the proposed method can effectively control traffic and improve traffic smoothness.
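The basic backpressure rule that such a strategy starts from can be sketched as below. This is the textbook form only, under the assumption that each node keeps one queue length per commodity: a link serves the commodity with the largest positive queue differential between its two ends. The function name, the dictionary layout and the example commodities are hypothetical, and the forecast-based virtual queues and modified weights mentioned in the abstract are not reproduced.

```python
def backpressure_choice(queues, link):
    """Pick the commodity to serve on a directed link (a, b).

    queues maps node -> {commodity: queue length}. The weight of a commodity
    is the differential queues[a][c] - queues[b][c]; the commodity with the
    largest positive weight is served. Returns (None, 0) when no commodity
    has a positive differential, meaning the link stays idle.
    """
    a, b = link
    best_commodity, best_weight = None, 0
    for commodity, backlog in queues[a].items():
        weight = backlog - queues[b].get(commodity, 0)
        if weight > best_weight:
            best_commodity, best_weight = commodity, weight
    return best_commodity, best_weight


# Illustrative queue lengths at two intersections, not data from the paper.
example_queues = {
    "A": {"north": 5, "east": 2},
    "B": {"north": 1, "east": 4},
}
```

Calling `backpressure_choice(example_queues, ("A", "B"))` selects the `north` flow, whose backlog at A exceeds that at B by 4, while the reverse link serves `east`.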
Fast, high-quality document clustering is a crucial task in organizing information and search engine results, enhancing web crawling, and supporting information retrieval or filtering. Recent studies have shown that the most commonly used partition-based clustering algorithm, the k-means algorithm, is more suitable for large datasets. However, the k-means algorithm may converge to a local optimum. In this paper we propose a novel Harmony k-means algorithm (HkA) that performs document clustering based on the Harmony Search (HS) optimization method. It is proved by means of finite Markov chain theory that the HkA converges to the global optimum. To demonstrate the effectiveness and speed of the HkA, we have applied it to some standard datasets. We also compare the HkA with other meta-heuristic and model-based document clustering approaches. Experimental results reveal that the HkA converges to the best known optimum faster than the other methods and that the quality of its clusters is comparable.
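The local-optimum behaviour mentioned in this abstract is easy to reproduce with plain Lloyd-style k-means. The sketch below is an illustration only, with hypothetical function names and a toy data set chosen for the example. It is not the HkA and contains no Harmony Search step.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd_kmeans(points, centers, max_iter=100):
    """Plain Lloyd iteration: assign to the nearest center, then recompute means.

    Returns the final centers and the sum of squared distances (the cost).
    The outcome depends on the initial centers passed in.
    """
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
        if new_centers == centers:
            break  # assignments are stable, so the iteration has converged
        centers = new_centers
    cost = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, cost


# Four corners of a 4-by-1 rectangle: the natural split is left versus right.
corners = [(0, 0), (0, 1), (4, 0), (4, 1)]
_, good_cost = lloyd_kmeans(corners, [(0, 0.5), (4, 0.5)])
_, bad_cost = lloyd_kmeans(corners, [(2, 0), (2, 1)])
```

Starting from centers on the left and right edges, the iteration stops at cost 1.0. Starting from centers on the bottom and top edges, it is already stable at cost 16.0, a local optimum that plain k-means cannot leave. This is the weakness that global search heuristics such as the one in the abstract aim to address.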
ISBN: 9781728143231; 9781728143224 (print)
This paper describes the k-means clustering algorithm and discusses its key problems and the optimization methods used to address them. The algorithm is used to analyze the fault data of heavy-truck vehicles in different driving environments, enabling information integration, exploratory classification and rule analysis of vehicle faults, and providing strong data support for the overall scientific evaluation and prediction of vehicles.
Recently, Su and Chou presented an efficient point symmetry-based k-means algorithm. Extending their algorithm, this paper presents a novel line symmetry-based k-means algorithm for clustering data sets with the line symmetry property. Experimental results on some real data sets show that the performance of the proposed line symmetry-based k-means algorithm is rather encouraging. (c) 2005 Elsevier B.V. All rights reserved.
The k-means algorithm is widely used to find correlations between data in different application domains. However, given the massive amount of data stored, known as Big Data, the need for high-speed processing to analyze data has become even more critical, especially for real-time applications. One solution adopted to increase processing speed is the use of parallel implementations on FPGA, which have proved to be more efficient than sequential systems. Hence, this paper proposes a fully parallel implementation of the k-means algorithm on FPGA to optimize the system's processing time, thus enabling real-time applications. Unlike most implementations proposed in the literature, even parallel ones, this proposal has no sequential steps, which are a limiting factor for processing speed. Results for processing time (or throughput) and FPGA area occupancy (or hardware resources) were analyzed for different parameters, reaching throughputs higher than 53 million data points processed per second. Comparisons to the state of the art are also presented, showing a speedup over a partially serial implementation.
Clustering of objects according to shape is of key importance in many scientific fields. In this paper we focus on the case where the shape of an object is represented by a configuration matrix of landmarks. It is well known that this shape space has a finite-dimensional (non-Euclidean) Riemannian manifold structure, which makes it difficult to work with. Papers about clustering on this space are scarce in the literature. The foundation of the k-means algorithm is the fact that the sample mean is the value that minimizes the sum of squared Euclidean distances from each point to the centroid of the cluster to which it belongs. Our idea is therefore to integrate Procrustes-type distances and the Procrustes mean into the k-means algorithm to adapt it to the shape analysis context. As far as we know, there have been just two attempts in this direction. In this paper we propose to adapt the classical Lloyd k-means algorithm to the context of Shape Analysis, focusing on the three-dimensional case. We present a study comparing its performance with the Hartigan-Wong k-means algorithm, which was previously adapted to the field of Statistical Shape Analysis. We demonstrate the better performance of the Lloyd version and, finally, we propose to add a trimming procedure. We apply both to a 3D database obtained from an anthropometric survey of the Spanish female population conducted in Spain in 2006. The algorithms presented in this paper are available in the Anthropometry R package, whose most current version is always available from the Comprehensive R Archive Network.
The proposed stochastic k-means algorithm (SkA) associates a vector with a cluster according to a probability distribution, which depends on the distance between the vector and the cluster gravity centre. It is less dependent than the k-means algorithm (kMA) on the initial centre choice. It can reach local minima closer to the global minimum of a distortion measure than the kMA. It has been applied to vector quantization of speech signals. (C) 2001 Elsevier Science B.V. All rights reserved.
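A probabilistic assignment step of this kind might look like the sketch below. The abstract says only that the probability depends on the distance between the vector and the cluster centre, so the Gibbs-style weighting exp(-d^2 / T), the `temperature` parameter and both function names are assumptions made for illustration, not the distribution used in the paper.

```python
import math
import random


def assignment_probabilities(x, centers, temperature=1.0):
    """Probability of assigning x to each center (assumed Gibbs form).

    Closer centers get higher probability. As temperature approaches 0 the
    rule approaches the hard nearest-center assignment of ordinary k-means.
    """
    d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centers]
    smallest = min(d2)  # subtracted for numerical stability of exp()
    weights = [math.exp(-(d - smallest) / temperature) for d in d2]
    total = sum(weights)
    return [w / total for w in weights]


def stochastic_assign(x, centers, temperature=1.0, rng=random):
    """Draw a cluster index for x according to the assignment probabilities."""
    probs = assignment_probabilities(x, centers, temperature)
    return rng.choices(range(len(centers)), weights=probs, k=1)[0]
```

Because a vector can occasionally be assigned to a centre that is not its nearest, the iteration can move away from a poor initial configuration, which matches the abstract's claim of reduced dependence on the initial centre choice.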
The research within the multicriteria classification field is mainly focused on the assignment of actions to pre-defined classes. Nevertheless the building of multicriteria categories remains a theoretical question still not studied in detail. To tackle this problem, we propose an extension of the well-known k-means algorithm to the multicriteria framework. This extension relies on the definition of a multicriteria distance based on the preference structure defined by the decision maker. Thus, two alternatives will be similar if they are preferred, indifferent and incomparable to more or less the same actions. Armed with this multicriteria distance, we will be able to partition the set of alternatives into classes that are meaningful from a multicriteria perspective. Finally, the examples of the country risk problem and the diagnosis of firms will be treated to illustrate the applicability of this method. (C) 2003 Elsevier B.V. All rights reserved.