Automatic document categorization plays a key role in the development of future interfaces for Web-based search. Clustering algorithms are considered a technology that is capable of mastering this "ad-hoc"...
ISBN: (print) 9789881821034
The paper concerns an open problem in the area of Content-Based Image Retrieval (CBIR) and presents an original method for noisy image data sets that applies an artificial immune system model. Appropriate feature extraction methods, together with a suitable similarity criterion, contribute to retrieving images from a noisy data set precisely. The results show some improvement in the noise tolerance of content-based image retrieval on a database of various images.
A study of three Lagrangian particle clustering methods has been conducted with application to the problem of predicting rotorcraft brownout conditions. A significant issue in such particle modeling simulations is the...
At present, problems concerning data value still exist in the mining of massive data. This paper therefore analyzes the obstacles that hamper data extraction and studies the mining of massive data by means of clustering algorithms. In data extraction, the first step is to preprocess the data, which means classifying and summarizing data of the same type, and the second step is to extract valuable information from the data using clustering algorithms and put it to good use. Finally, clustering is not the only method for data mining. To achieve the best data mining effect in practice, it needs to be combined with other algorithms.
We propose a simple tool to help the energy management of a large building stock by defining clusters of buildings with the same function, setting alert thresholds for each cluster, and easily recognizing outliers. The objective is to enable a building management system to be used for detection of abnormal energy use. We start by reviewing energy performance indicators and how they feed into data visualization (DataViz) tools for a large building stock, especially for university campuses. After a brief presentation of the University of Turin's building stock, which is our case study, we perform an explorative analysis based on the Multidimensional Detective approach by Inselberg, using the Scatter Plot Matrix and the Parallel Coordinates methods. The k-means clustering algorithm is then applied to the same dataset to test the hypotheses made during the explorative analysis. Our results show that DataViz techniques provide quick and user-friendly solutions for the energy management of a large stock of buildings. In particular, they help identify clusters of buildings and outliers and set alert thresholds for various Energy Efficiency Indices.
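The clustering and thresholding idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single scalar energy indicator per building, a plain one-dimensional k-means, and a hypothetical alert rule of "cluster mean plus two standard deviations", none of which is stated in the abstract. The function names and the sample values are invented for illustration.

```python
from statistics import mean, pstdev


def kmeans_1d(values, k, iterations=100):
    """Lloyd's algorithm on scalar values; returns (centers, labels)."""
    ordered = sorted(values)
    # Deterministic initialization: k evenly spaced order statistics.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assign each value to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # Keep the old center if a cluster ends up empty.
            new_centers.append(mean(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def flag_outliers(values, labels, k, n_sigma=2.0):
    """Flag values above mean + n_sigma * std of their own cluster (assumed rule)."""
    thresholds = {}
    for c in range(k):
        members = [v for v, lab in zip(values, labels) if lab == c]
        if members:
            thresholds[c] = mean(members) + n_sigma * pstdev(members)
    return [v > thresholds[lab] for v, lab in zip(values, labels)]


# Hypothetical energy-use values for two building types.
energy = [50, 52, 48, 51, 49, 50, 51, 49, 50, 70, 150, 152, 148, 151, 149, 150]
centers, labels = kmeans_1d(energy, k=2)
alerts = flag_outliers(energy, labels, k=2)
```

With very small clusters a mean-plus-sigma rule cannot flag anything, so a percentile-based threshold may be a better fit in practice.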
A method to extract the retina characteristic points for the purpose of medical diagnosis of the human eye is presented in this research. The proposed method helps to make the primary decision about the illness faster and can be used on mobile devices. The algorithm is mostly based on the characteristic points (the so-called minutiae). These structures are commonly used in biometric applications for fingerprint-based people recognition. In the conducted research, this trait was used to differentiate healthy eyes from unhealthy ones. The methods were evaluated by appropriately implemented algorithms, showing promising results. Each solution was created with an object-oriented programming language. The accuracy of the classification (healthy versus samples with pathological changes) was evaluated using four algorithms: k-Nearest Neighbors, k-Means and Support Vector Machines (SVM) with linear and third-degree polynomial kernels, as well as our own approach based on counting the number of minutiae. Performance requirements were also checked, and it was verified that the computing power of modern mobile devices is sufficient to implement the proposed solution. The highest accuracy was 96.45% and was obtained with the third-degree polynomial SVM. This was a novel approach. For comparative purposes, we also implemented currently used solutions for image analysis: deep learning (DL) and Convolutional Neural Networks (CNNs). Both medical and computer science backgrounds are presented in the work, with the main methodology components including image segmentation using the Gaussian Matched Filter, binarization by Local Entropy Thresholding and classification with the previously mentioned approaches.
Cracks are one of the most common types of imperfections found in concrete pavement, and they have a significant influence on structural strength. The purpose of this study is to investigate the performance differences of various spatial clustering algorithms for pavement crack segmentation and to provide a reference for current pavement maintenance work. This is done by comparing and analyzing their performance on complex crack photos in different settings. The NMI and RI indices were selected to evaluate the compared methods. The experiment also includes a detailed analysis and comparison on noisy photographs. According to the experimental results, the segmentation effect of these clustering algorithms is significantly worse after adding Gaussian noise; based on the NMI value, the mean-shift clustering algorithm has the best de-noising effect, whereas the performance of some clustering algorithms decreases significantly after adding noise.
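For readers unfamiliar with the two indices, the following is a minimal sketch of how RI (Rand index) and NMI (normalized mutual information) can be computed from a reference labeling and a clustering result. It is an illustration, not the study's code. The abstract does not say which NMI normalization was used; the geometric mean of the two entropies is assumed here, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log, sqrt


def rand_index(labels_true, labels_pred):
    """Fraction of sample pairs on which the two labelings agree."""
    if len(labels_true) < 2:
        raise ValueError("need at least two samples")
    pairs = list(combinations(range(len(labels_true)), 2))
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def entropy(labels):
    """Shannon entropy (natural log) of a labeling."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def nmi(labels_true, labels_pred):
    """Mutual information divided by the geometric mean of the entropies."""
    n = len(labels_true)
    joint = Counter(zip(labels_true, labels_pred))
    count_t = Counter(labels_true)
    count_p = Counter(labels_pred)
    mi = sum(
        (c / n) * log(c * n / (count_t[t] * count_p[p]))
        for (t, p), c in joint.items()
    )
    h_t, h_p = entropy(labels_true), entropy(labels_pred)
    if h_t == 0 or h_p == 0:
        # Degenerate case: at least one labeling has a single cluster.
        return 1.0 if h_t == h_p else 0.0
    return mi / sqrt(h_t * h_p)


# Hypothetical pixel labels: a relabeled but identical partition scores 1.0.
truth = [0, 0, 1, 1]
same_partition = [1, 1, 0, 0]
unrelated = [0, 1, 0, 1]
```

Both indices depend only on the partition, not on the label names, which is why they suit comparing a segmentation against ground truth.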
The article presents immediate access to over fifty fundamental clustering algorithms. Additionally, access is provided to the clustering benchmark datasets previously published as the "Fundamental Clustering Problems Suite" (FCPS). The software library is named "FCPS", available in R on CRAN and accessible within Python. The input and output of clustering algorithms are standardized so that users can run a cluster analysis swiftly. By combining mirrored-density plots (MD plots) with statistical testing, FCPS provides a tool to investigate the cluster tendency quickly before the cluster analysis itself. Common clustering challenges can be generated with an arbitrary sample size. Additionally, FCPS gathers 26 indicators intended to estimate the number of clusters and provides an appropriate implementation of the clustering accuracy for more than two clusters. (C) 2020 The Author(s). Published by Elsevier B.V.
As the Internet of Medical Things emerges in the field of medicine, the volume of medical data is expanding rapidly, along with its variety. As such, clustering is an important procedure for mining this vast data. Many swarm intelligence clustering algorithms, such as particle swarm optimization (PSO), firefly, cuckoo, and bat, have been designed, and they can be parallelized to the benefit of mass data computation. However, few studies focus on a systematic analysis of the time complexities and of the effect of instances (data size), attributes (dimensionality), number of clusters, and agents of these algorithms. In this paper, we perform a comparative study of the PSO, firefly, cuckoo, and bat algorithms based on both synthetic and real medical data sets. Finally, we conclude which algorithms are effective for medical data mining. In addition, we recommend the more suitable recently developed algorithms for the different medical data to achieve optimal clustering.
Advanced data analytics are increasingly being employed in healthcare research to improve patient classification and personalize medicinal therapies. In this paper, we focus on the critical problem of clustering electronic health record (EHR) data to enable appropriate patient categorization. In the era of personalized medicine, optimizing patient classification is critical to healthcare analytics. This research presents a comparative assessment of different clustering algorithms for EHR data, with the goal of improving the efficacy and productivity of patient clustering methods. Our study uses the Fuzzy Technique for Order of Preference by Similarity to Ideal Solution (Fuzzy TOPSIS) as a Multi-Criteria Decision-Making (MCDM) strategy and includes an in-depth assessment of eight clustering algorithms: K-Means, DBSCAN, Hierarchical clustering, Mean Shift, Affinity Propagation, Spectral clustering, Gaussian Mixture Models (GMM), and Self-Organizing Maps. The evaluation criteria used in this research are Cluster Quality Metrics, Scalability, Robustness to Noise, Cluster Shape and Density, Interpretability, Cluster Number, Dimensionality, and Consistency and Stability. These criteria and alternatives were chosen after a thorough assessment of the literature and consultation with domain experts. All participating specialists actively engaged in the decision-making process, bringing unique insights into the best clustering algorithms for healthcare data. The results of this study illustrate each algorithm's strengths and weaknesses in the setting of patient stratification, providing insight into their performance across multiple dimensions. The fuzzy TOPSIS MCDM strategy is a reliable instrument for synthesizing expert opinions and methodically evaluating the identified clustering alternatives. This study advances healthcare analytics by giving practitioners and researchers informative perspectives on the selection of
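To make the ranking step concrete, here is a minimal sketch of classical (crisp) TOPSIS. This is deliberately a simplification: the study uses Fuzzy TOPSIS, which works on fuzzy expert ratings, whereas this sketch takes plain numeric scores. The function name, the weights, and the example scores are hypothetical and do not come from the paper. It assumes that no criterion column is all zeros and that the alternatives are not all identical.

```python
from math import sqrt


def topsis(matrix, weights, benefit):
    """Rank alternatives (rows) against criteria (columns) with crisp TOPSIS.

    matrix  : list of rows, one score per criterion
    weights : criterion weights (assumed to sum to 1)
    benefit : True where larger is better, False where smaller is better
    Returns the closeness coefficient of each alternative, in [0, 1].
    """
    n_crit = len(weights)
    # Vector-normalize each column, then apply the weights.
    norms = [sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix
    ]
    # Ideal and anti-ideal values per criterion.
    cols = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in weighted:
        d_pos = sqrt(sum((v - i) ** 2 for v, i in zip(row, ideal)))
        d_neg = sqrt(sum((v - a) ** 2 for v, a in zip(row, anti)))
        # Closer to the ideal and farther from the anti-ideal scores higher.
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical scores for two alternatives on two benefit criteria.
example = topsis([[1.0, 1.0], [2.0, 2.0]], [0.5, 0.5], [True, True])
```

An alternative that is best on every criterion coincides with the ideal point and scores 1, while one that is worst on every criterion scores 0.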