Identifying high-value customers and providing quality services are essential for the transformation and advancement of power grid enterprises. This paper highlights the strategies necessary for achieving this transfo...
Hierarchical clustering techniques build a tree-like structure, called a dendrogram, from the data points, which can be used to find the most closely related data objects. This paper presents a novel hierarchical clustering technique that uses intuitionistic fuzzy sets to deal with the uncertainty present in the data. Instead of the traditional Hamming or Euclidean distance, it employs a probabilistic Euclidean distance measure between data points, yielding a clustering approach that we term the 'Probabilistic Intuitionistic Fuzzy Hierarchical Clustering (PIFHC) algorithm'. The proposed PIFHC algorithm uses probabilistic weights derived from the data to measure the distances between data points. Clustering results over UCI datasets show that PIFHC gives better cluster accuracies than its existing counterparts, improving clustering accuracy by 1%-3.5% over other fuzzy hierarchical clustering algorithms for most of the datasets. We further provide experimental results with the real-world car dataset and the Listeria monocytogenes dataset for mouse susceptibility to demonstrate the practical efficacy of the proposed algorithm. On the Listeria dataset as well, PIFHC records a 1.7% improvement over the state-of-the-art methods. The dendrograms formed by PIFHC exhibit a high cophenetic correlation coefficient, an improvement of 0.75% over others. We provide various AGNES methods to update the distance between merged clusters in the proposed PIFHC algorithm. (C) 2022 Elsevier B.V. All rights reserved.
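The cophenetic correlation coefficient mentioned in this abstract can be illustrated with a minimal sketch. It is not the PIFHC algorithm: it assumes plain Euclidean distance in place of the probabilistic intuitionistic fuzzy distance, and it assumes SciPy's standard linkage rules stand in for the AGNES update methods. The function name `cophenetic_scores` is hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import pdist


def cophenetic_scores(points, methods=("single", "complete", "average")):
    """Return {linkage method: cophenetic correlation coefficient}.

    Assumption: plain Euclidean distance, not the PIFHC distance measure.
    """
    distances = pdist(points, metric="euclidean")  # condensed pairwise distances
    scores = {}
    for method in methods:
        tree = linkage(distances, method=method)    # agglomerative merge tree
        coefficient, _ = cophenet(tree, distances)  # dendrogram vs. original distances
        scores[method] = float(coefficient)
    return scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic, well-separated groups of 2-D points (illustrative data only).
    data = np.vstack([rng.normal(0.0, 1.0, (20, 2)), rng.normal(8.0, 1.0, (20, 2))])
    for name, value in cophenetic_scores(data).items():
        print(f"{name}: {value:.3f}")
```

A coefficient closer to 1 means the dendrogram's merge heights preserve the original pairwise distances more faithfully, which is the sense in which the abstract compares dendrogram quality.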
ISBN: (print) 9781509066315
This paper focuses on the performance analysis of linkage-based hierarchical agglomerative clustering algorithms for sequence clustering using the Kolmogorov-Smirnov distance. Data sequences are assumed to be generated from unknown continuous distributions. The goal is to group the data sequences whose underlying generative distributions belong to one cluster, without a priori knowledge of either the underlying distributions or the number of clusters. Upper bounds on the clustering error probability are derived. These bounds establish that the error probability decays exponentially fast as the sequence length goes to infinity, and the resulting error exponent bound has a simple form. Tighter upper bounds on the error probability of the single-linkage and complete-linkage algorithms are derived by taking advantage of the simplified metric updating in these two special cases. Simulation results are provided to validate the analysis.
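The setting described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's procedure: it assumes the pairwise distance is the two-sample Kolmogorov-Smirnov statistic from SciPy, and it assumes a hypothetical distance `threshold` as the stopping rule, since the abstract does not say how the number of clusters is decided. The names `ks_distance_matrix` and `cluster_sequences` are illustrative.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from scipy.stats import ks_2samp


def ks_distance_matrix(sequences):
    """Symmetric matrix of two-sample Kolmogorov-Smirnov statistics."""
    n = len(sequences)
    matrix = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            statistic = ks_2samp(sequences[i], sequences[j]).statistic
            matrix[i, j] = matrix[j, i] = statistic
    return matrix


def cluster_sequences(sequences, threshold=0.3, method="single"):
    """Linkage-based clustering of sequences; `threshold` is an assumed cut-off."""
    condensed = squareform(ks_distance_matrix(sequences), checks=False)
    tree = linkage(condensed, method=method)  # "single" or "complete" linkage
    return fcluster(tree, t=threshold, criterion="distance")


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Three sequences from N(0, 1) and three from N(5, 1), 500 samples each.
    sequences = [rng.normal(0.0, 1.0, 500) for _ in range(3)]
    sequences += [rng.normal(5.0, 1.0, 500) for _ in range(3)]
    print(cluster_sequences(sequences))
```

Sequences drawn from the same distribution have a small KS statistic once they are long enough, so they merge early, which matches the abstract's point that the error probability shrinks as the sequence length grows.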
Survival of the Fittest is a principle that selects the superior and eliminates the inferior in nature. This principle has been used in many fields, especially in optimization problem-solving. Clustering, in the data mining community, aims to discover unknown representations or patterns hidden in datasets. A hierarchical clustering algorithm (HCA) is a method of cluster analysis that searches for the optimal distribution of clusters through a hierarchical structure. Strategies for hierarchical clustering generally fall into two types: agglomerative, with a bottom-up procedure, and divisive, with a top-down procedure. However, most clustering approaches have two disadvantages: the use of distance-based measurement and the difficulty of integrating clusters. In this paper, we propose an optimal probabilistic estimation (OPE) approach that exploits the Survival of the Fittest principle. We devise an HCA based on OPE, called OPE-HCA, which combines optimization with probability and agglomerative HCA. Experimental results show that OPE-HCA can search for and discover patterns at different description levels and can also outperform many clustering algorithms according to NMI and clustering accuracy measures.
First, the attached data are pre-processed and, after abnormal records are removed, a bar chart is drawn for a preliminary analysis, which suggests that lead-barium glass weathers more easily than high-potassium glass. A chi-square test then shows that surface weathering of the glass cultural relics is related to the glass type, but has no obvious relationship with the decoration or color of the relics. Second, a one-way ANOVA model is established, and the differences in the various chemical components before and after weathering are analyzed for the two types of glass. The chemical components that pass the difference test show statistically significant patterns. Multiple linear regression equations are then constructed to predict the chemical composition before weathering. The 14 chemical components of the glass are treated as 14 indicators, and principal component analysis is used to calculate the contribution rate and cumulative contribution rate of the principal components, from which two principal components are determined. The principal components are then used to cluster the indicators, and a hierarchical clustering algorithm is used to generate dendrograms. According to the elbow principle, the number of subcategories is 3 for high-potassium glass and 4 for lead-barium glass, which divides the glass by chemical composition. A scatter plot of the cluster groups is then drawn, and the subclass results are found to be highly reasonable. Finally, a stepwise regression model is established to analyze the classification sensitivity and identify several highly sensitive indicators, which can be used for more targeted protection and restoration of unearthed cultural relics.
ISBN: (print) 9781479985623
The purpose of a data clustering algorithm is to form clusters (groups) of data points such that intra-cluster similarity is high and inter-cluster similarity is low. There are different types of clustering methods, such as hierarchical, partitioning, grid-based, and density-based. Hierarchical clustering is a method of cluster analysis that seeks to build a hierarchy of clusters. A hierarchical clustering method can be thought of as a set of ordinary (flat) clustering methods organized in a tree structure. These methods construct the clusters by recursively partitioning the objects in either a top-down or a bottom-up fashion. In this paper we present a new hierarchical clustering algorithm using Euclidean distance. To validate this method, we performed experiments with low-dimensional artificial datasets and a high-dimensional fMRI dataset. Finally, the results of our method are compared with those of some existing clustering methods.
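The bottom-up procedure described in this abstract can be sketched in a few lines. This is a naive textbook illustration, not the new algorithm the paper proposes: it assumes Euclidean distance with single linkage (closest pair of members), and the function name `agglomerate` is hypothetical.

```python
import numpy as np


def agglomerate(points, n_clusters):
    """Naive bottom-up clustering: merge the two closest clusters repeatedly.

    Assumptions: Euclidean distance, single linkage, and n_clusters >= 1.
    Returns the final clusters (lists of point indices) and the merge history.
    """
    points = np.asarray(points, dtype=float)
    clusters = [[i] for i in range(len(points))]  # start with one cluster per point
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # All member-to-member differences between clusters a and b.
                diffs = points[clusters[a]][:, None, :] - points[clusters[b]][None, :, :]
                dist = np.sqrt((diffs ** 2).sum(axis=-1)).min()
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        dist, a, b = best
        merges.append((clusters[a], clusters[b], float(dist)))  # one dendrogram level
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters, merges


if __name__ == "__main__":
    sample = [[0, 0], [0, 1], [10, 10], [10, 11]]
    final_clusters, history = agglomerate(sample, n_clusters=2)
    print(final_clusters)
```

Stopping at a different `n_clusters` corresponds to cutting the tree at a different level, which is the sense in which a hierarchy contains a set of flat clusterings.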
ISBN: (print) 9781315685892; 9781138028005
Hierarchical clustering algorithms suffer from low computational efficiency and from error accumulation during the iterative clustering process. To deal with these problems, we propose an improved hierarchical clustering algorithm based on GAAC (Group-average Agglomerative Clustering) and apply it to Chinese text clustering. Our experimental results show that the improved algorithm greatly improves both computational efficiency and the quality of the clustering results.
Clustering is an essential analytical tool across a wide range of scientific fields, including biology, chemistry, astronomy, and pattern recognition. This paper introduces a novel clustering algorithm as a competitiv...
ISBN: (print) 9781479985630
The purpose of a data clustering algorithm is to form clusters (groups) of data points such that intra-cluster similarity is high and inter-cluster similarity is low. There are different types of clustering methods, such as hierarchical, partitioning, grid-based, and density-based. Hierarchical clustering is a method of cluster analysis that seeks to build a hierarchy of clusters. A hierarchical clustering method can be thought of as a set of ordinary (flat) clustering methods organized in a tree structure. These methods construct the clusters by recursively partitioning the objects in either a top-down or a bottom-up fashion. In this paper we present a new hierarchical clustering algorithm using Euclidean distance. To validate this method, we performed experiments with low-dimensional artificial datasets and a high-dimensional fMRI dataset. Finally, the results of our method are compared with those of some existing clustering methods.
As a highly ductile concrete, engineered cementitious composite (ECC) can be used as pavement to form a lightweight composite bridge deck system. However, the structural damage introduced by fatigue loading in operation might degrade structural performance. In this paper, piezoelectric sensors and a hierarchical clustering algorithm are used to identify structural damage in steel-ECC composite decks. First, three steel-ECC composite decks were tested under four-point loading, and the electrical impedance signals were measured. The root mean square deviation (RMSD) was extracted to quantify the severity and location of the structural damage. The frequency interval was then divided into nine sub-frequency ranges for sensitivity analysis. On this basis, a hierarchical clustering algorithm was introduced to analyze the impedance signals and identify damage in the steel-ECC composite deck. The results show that the development of structural damage can be continuously monitored using the impedance methodology and the hierarchical clustering algorithm, even with small unlabeled datasets.
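The RMSD index and the nine sub-frequency ranges mentioned in this abstract can be illustrated with a short sketch. The abstract does not give the formula, so this assumes a commonly used normalised form, the square root of the summed squared deviation from a baseline signature divided by the summed squared baseline. It also assumes equal-width bands. The names `rmsd` and `sub_band_rmsd` are hypothetical.

```python
import numpy as np


def rmsd(baseline, measured):
    """Normalised root mean square deviation between two impedance signatures.

    Assumed form: sqrt(sum((measured - baseline)^2) / sum(baseline^2)).
    """
    baseline = np.asarray(baseline, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return float(np.sqrt(np.sum((measured - baseline) ** 2) / np.sum(baseline ** 2)))


def sub_band_rmsd(baseline, measured, n_bands=9):
    """RMSD per sub-frequency range; equal-width bands are an assumption."""
    base_bands = np.array_split(np.asarray(baseline, dtype=float), n_bands)
    meas_bands = np.array_split(np.asarray(measured, dtype=float), n_bands)
    return [rmsd(b, m) for b, m in zip(base_bands, meas_bands)]


if __name__ == "__main__":
    healthy = np.ones(18)    # illustrative baseline signature, not measured data
    damaged = healthy * 1.1  # uniform 10% shift in every frequency point
    print(rmsd(healthy, damaged))
    print(sub_band_rmsd(healthy, damaged))
```

The per-band values form a feature vector for each measurement, which is one plausible input to a hierarchical clustering step such as the one the abstract describes.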