The advancement of data mining technology offers a way to examine and analyse medical databases. Microarray data help in analysing gene expressions, and clustering helps in categorizing the data into organized groups. Grouping similar gene expressions paves the way for effective analysis and makes it possible to work out the relationships between the expressions. Recognizing the benefits of clustering, this work presents a clustering algorithm that combines the generalized hierarchical fuzzy C means (GHFCM) and grey wolf optimization (GWO) algorithms. The GWO algorithm is used to select the initial clustering point, and the GHFCM algorithm is employed to cluster the microarray gene data. The performance of the proposed clustering algorithm is tested with respect to precision, recall, F-measure and time consumption, and the results are compared with existing approaches. The proposed work performs satisfactorily, with better F-measure rates and minimal time consumption.
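The clustering step described above can be illustrated with the standard fuzzy C-means updates. This is only a minimal sketch of plain FCM, not the GHFCM variant from the abstract, and it takes the initial centers as an argument where the paper would use GWO to choose them. The function names and the fuzzifier default `m=2.0` are assumptions made for illustration.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Standard FCM membership update: u_i = 1 / sum_k (d_i / d_k)^(2/(m-1))."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with one or more centers: share full
            # membership among those centers to avoid dividing by zero.
            zeros = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(zeros)
            memberships.append([z / total for z in zeros])
            continue
        memberships.append(
            [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]
        )
    return memberships


def fcm_centers(points, memberships, m=2.0):
    """Standard FCM center update: membership-weighted mean of the points."""
    dim = len(points[0])
    centers = []
    for i in range(len(memberships[0])):
        weights = [row[i] ** m for row in memberships]
        total = sum(weights)
        centers.append(
            [sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dim)]
        )
    return centers


def fuzzy_c_means(points, initial_centers, m=2.0, iterations=50):
    """Alternate the two updates for a fixed number of iterations."""
    centers = [list(c) for c in initial_centers]
    for _ in range(iterations):
        memberships = fcm_memberships(points, centers, m)
        centers = fcm_centers(points, memberships, m)
    return centers, fcm_memberships(points, centers, m)
```

Because FCM converges to a local optimum that depends on where it starts, the choice of `initial_centers` matters, which is the role the abstract assigns to GWO.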
Cell formation plays an important role in the design of cellular manufacturing systems. Among the many methods used to solve the cell formation problem, the similarity coefficient method is the most widely used owing to its high flexibility and low computational requirement. However, generalized similarity coefficients ignore material flow, which is closely related to operation sequence and repeated operations. Moreover, most similarity coefficient-based clustering algorithms focus on the number of inter-cell movements but disregard differences in movement effort. To overcome these limitations, this study improves the generalized similarity coefficient method to form part families. In addition, a new clustering algorithm is presented to assign machines to cells with minimum intensity of inter-cell material movement, which depends on the frequency, production volume and difficulty level of inter-cell movement. Experimental results demonstrate that the proposed method has superior sensitivity and effectiveness for solving the cell formation problem.
The Likert scale is the most widely used psychometric scale for obtaining feedback. Its major disadvantage is the information distortion and information loss that arise from its ordinal nature and closed format. Real-world responses are mostly inconsistent, imprecise and indeterminate, depending on the customers' emotions. To capture the responses realistically, the concept of neutrosophy (the study of neutralities and indeterminacy) is used. An indeterminate Likert scale based on neutrosophy is introduced in this paper. Clustering according to customer feedback is an effective way of classifying customers and targeting them accordingly. A clustering algorithm for feedback obtained using indeterminate Likert scaling is proposed in this paper. When dealing with real-world scenarios, indeterminate Likert scaling is better at capturing the responses accurately.
Spectral clustering is widely used in data mining, machine learning and other fields. It can identify the arbitrary shape of a sample space and converge to the global optimal solution. Compared with the traditional k-means algorithm, the spectral clustering algorithm has stronger adaptability to data and better clustering results. However, the computation of the algorithm is quite expensive. In this paper, an efficient parallel spectral clustering algorithm on multi-core processors in the Julia language is proposed, and we refer to it as juPSC. The Julia language is a high-performance, open-source programming language. The juPSC is composed of three procedures: (1) calculating the affinity matrix, (2) calculating the eigenvectors, and (3) conducting k-means clustering. Procedures (1) and (3) are computed by the efficient parallel algorithm, and the COO format is used to compress the affinity matrix. Two groups of experiments are conducted to verify the accuracy and efficiency of the juPSC. Experimental results indicate that (1) the juPSC achieves speedups of approximately 14x to 18x on a 24-core CPU and that (2) the serial version of the juPSC is faster than the Python version of scikit-learn. Moreover, the structure and functions of the juPSC are designed with modularity in mind, which makes it convenient to combine and further optimize with other parallel computing platforms. (C) 2020 Elsevier Inc. All rights reserved.
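Procedure (1) and the COO compression mentioned above can be sketched as follows. This is a serial Python illustration, not the juPSC code, which is written in Julia and parallelized. The Gaussian kernel, the `sigma` and `threshold` parameters, and the function names are all assumptions chosen for the example.

```python
import math


def gaussian_affinity_coo(points, sigma=1.0, threshold=1e-3):
    """Build a sparse affinity matrix as three parallel lists (COO format).

    Entry (i, j) holds exp(-||x_i - x_j||^2 / (2 * sigma^2)); entries below
    `threshold` and the diagonal are dropped, so only significant
    similarities are stored.
    """
    rows, cols, vals = [], [], []
    n = len(points)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            sq_dist = math.dist(points[i], points[j]) ** 2
            weight = math.exp(-sq_dist / (2.0 * sigma ** 2))
            if weight >= threshold:
                rows.append(i)
                cols.append(j)
                vals.append(weight)
    return rows, cols, vals


def coo_matvec(rows, cols, vals, x, n):
    """Multiply the COO matrix by a dense vector x of length n."""
    y = [0.0] * n
    for i, j, w in zip(rows, cols, vals):
        y[i] += w * x[j]
    return y
```

Storing only the (row, column, value) triples keeps memory proportional to the number of retained similarities rather than to the square of the sample count. A matrix-vector product like `coo_matvec` is the basic operation an iterative eigensolver would need in procedure (2).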
The rapid deployment of lithium-ion batteries in clean energy and electric vehicle applications will also increase the volume of retired batteries in the coming years. Retired Li-ion batteries can have residual capacities of up to 70-80% of the nominal capacity of a new battery, which could be lucrative for a second-life battery market while also creating environmental and economic benefits. Presently, retired batteries are first screened to select usable batteries, and then a suitable secondary application is chosen according to the battery performance. Here, a complete process for grouping used batteries is proposed, including safety checking, performance evaluation, data processing, and clustering of batteries. Also, a novel clustering algorithm for retired batteries based on traversal optimization is proposed. The new method does not require defining the number of clusters or the cluster centers beforehand, and it is immune to outliers. It can be used for both small and large sample sizes, as the optimization parameters used do not require iteration. The Davies-Bouldin Index of the proposed algorithm shows that the differences between clusters are greatest while the differences between samples within a single cluster are smallest, which indicates the effectiveness of the algorithm.
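The Davies-Bouldin Index used above can be computed directly from its standard definition. The sketch below is a generic illustration with Euclidean distances, not the evaluation code from the paper, and the function name and input layout (a list of clusters, each a list of points) are assumptions. Lower values indicate compact, well-separated clusters.

```python
import math


def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (each a list of points).

    For each cluster i, take the worst-case ratio (s_i + s_j) / d(c_i, c_j)
    over the other clusters j, where s is the mean distance of a cluster's
    points to its centroid and d is the distance between centroids. The index
    is the average of those worst-case ratios.
    """
    centroids = [[sum(col) / len(c) for col in zip(*c)] for c in clusters]
    scatters = [
        sum(math.dist(p, centroid) for p in c) / len(c)
        for c, centroid in zip(clusters, centroids)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatters[i] + scatters[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k)
            if j != i
        )
    return total / k
```

For example, two clusters whose points lie 0.5 from their centroids, with centroids 10 apart, give an index of (0.5 + 0.5) / 10 = 0.1. Spreading the points out or moving the clusters closer together raises the value.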
Environment perception is the foundation of the intelligent driving system and is a prerequisite for achieving path planning and vehicle control. Within it, obstacle detection is the key to environment perception. To solve the problems of the traditional obstacle detection algorithm, in which adjacent obstacles are difficult to distinguish and distant obstacles are easily split, this study first designed a 3D point cloud data filtering algorithm, removed vehicle body points and noise points from the point cloud data, and designed a point cloud down-sampling method. Then a ground segmentation method based on the Ray Ground Filter algorithm was designed to solve the under-segmentation problem in ground segmentation while ensuring real-time operation. Furthermore, an improved DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering algorithm was proposed, and the L-shaped fitting method was used to complete the 3D bounding box fitting of the point cloud. This solves two problems caused by fixed parameter thresholds: adjacent obstacles at close distances are difficult to distinguish, and obstacles at long distances are easily split into multiple obstacles. The real-time performance of the algorithm was thus improved. Finally, a real vehicle test was conducted, and the test results show that the proposed obstacle detection algorithm improves accuracy by 6.1% and real-time performance by 13.2% compared with the traditional algorithm.
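For reference, the baseline that the improved algorithm starts from can be sketched as below. This is textbook DBSCAN with a single fixed `eps` and `min_pts`, which are exactly the fixed thresholds the abstract identifies as the source of the problem. It does not reproduce the paper's improvement, and the function names are assumptions.

```python
import math


def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx] (including itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) <= eps]


def dbscan(points, eps, min_pts):
    """Textbook DBSCAN. Returns one label per point; -1 marks noise."""
    noise = -1
    labels = [None] * len(points)  # None means "not visited yet"
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = noise  # may later be claimed as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster_id  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep growing the cluster
    return labels
```

With one global `eps`, a value small enough to separate nearby obstacles is too small for sparse returns from distant ones, and a value large enough for distant obstacles merges close neighbors. This is the trade-off the abstract describes.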
Unmanned aerial vehicle (UAV) networks are a very vibrant research area nowadays. They have many military and civil applications. Limited bandwidth, high mobility and secure communication are the three main problems of micro UAVs. In this paper, we address these problems by means of secure clustering, and a security clustering algorithm based on an integrated trust value for UAV networks is proposed. First, an improved k-means++ algorithm is presented to determine the optimal number of clusters from the network bandwidth parameter, which ensures the optimal use of network bandwidth. Second, we consider variables representing the link expiration time to improve node clustering, and use the integrated trust value to rapidly detect malicious nodes and establish a head list. Node clustering reduces the impact of high mobility, and the head list enhances the security of the clustering algorithm. Finally, we combine the remaining energy ratio, the relative mobility, and the relative degrees of the nodes to select the best cluster head. Simulation results show that the proposed clustering algorithm incurs a smaller computational load and achieves higher network security.
A linguistic neutrosophic number (LNN) describes evaluation information by three linguistic variables indicating truth-membership, indeterminacy-membership and falsity-membership respectively, which makes it an effective tool for representing uncertainty. The partitioned Maclaurin symmetric mean (PMSM) operator can reflect the interrelationships among criteria when criteria in the same partition are interrelated but criteria in different partitions are not. In this paper, we extend the PMSM operator to LNNs, define the linguistic neutrosophic partitioned Maclaurin symmetric mean operator and the linguistic neutrosophic weighted partitioned Maclaurin symmetric mean (LNWPMSM) operator, and discuss the properties and theorems of the proposed operators. We then propose a clustering algorithm for linguistic neutrosophic sets, based on a similarity measure, to obtain objective and reasonable partitions among criteria. Based on the LNWPMSM operator and the objective partition structure of the criteria, a novel multi-criteria group decision-making method is developed for the linguistic neutrosophic environment. Finally, a practical example is presented to illustrate the applicability of the proposed method, and a comparative analysis shows its advantages over existing methods.
In the era of big data, clustering based on multi-source data fusion has become a hot topic in the data mining field. Existing studies mainly focus on fusion models and algorithms for data sets in the same domain, but few studies consider imbalanced data sets from different domains. Furthermore, studies on imbalanced data sets mostly focus on classification and less on clustering problems. Therefore, we propose a novel clustering algorithm for mining fused location data. This algorithm can deal with imbalanced data sets with large density differences, find clusters generated by the minority class data, and reduce the time complexity of the clustering process. Since current evaluation indices are not suitable for evaluating clustering results on imbalanced data sets, we present a new comprehensive evaluation metric for judging clustering validity. Urban hotspot mining is used as an example, and the effectiveness of the proposed method is validated using GPS trajectory data from the transport domain and check-in data from a social network. The experimental results demonstrate that the proposed algorithm outperforms state-of-the-art clustering algorithms and can simultaneously discover urban hotspots formed by the majority and minority class data. (c) 2021 Elsevier Inc. All rights reserved.
Metagenomic data is a novel and valuable source for personalized medicine approaches to improve human health. Data visualization is a crucial technique in data analysis for exploring and finding patterns in data. In particular, metagenomic data often have very high dimensionality, so humans face big challenges in understanding them. In this study, we introduce a visualization method based on the Mean-shift algorithm, which enables us to observe high-dimensional data via images that exhibit the features clustered by the method. These generated synthetic images are then fed into a convolutional neural network for disease prediction tasks. The proposed method shows promising results when evaluated on four metagenomic bacterial species abundance datasets related to four diseases: Liver Cirrhosis, Colorectal Cancer, Obesity, and Type 2 Diabetes.
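The clustering step that the visualization builds on can be sketched with a basic Mean-shift procedure. This is a generic flat-kernel version for illustration only. It does not reproduce how the study maps clustered features to images, and the function names, `bandwidth`, and `merge_radius` are assumptions.

```python
import math


def mean_shift_modes(points, bandwidth, iterations=100, tol=1e-6):
    """Shift every point to the mean of its neighbours until it stops moving.

    Uses a flat kernel: all data points within `bandwidth` of the current
    position contribute equally to the next position.
    """
    modes = []
    for p in points:
        x = list(p)
        for _ in range(iterations):
            neighbors = [q for q in points if math.dist(x, q) <= bandwidth]
            new_x = [sum(col) / len(neighbors) for col in zip(*neighbors)]
            shift = math.dist(x, new_x)
            x = new_x
            if shift < tol:
                break
        modes.append(x)
    return modes


def group_modes(modes, merge_radius):
    """Assign the same label to points whose modes lie within merge_radius."""
    centers, labels = [], []
    for mode in modes:
        for idx, center in enumerate(centers):
            if math.dist(mode, center) <= merge_radius:
                labels.append(idx)
                break
        else:
            centers.append(mode)
            labels.append(len(centers) - 1)
    return centers, labels
```

Unlike k-means, this procedure does not take the number of clusters as input; the count follows from the bandwidth, which controls how far each point looks for neighbours.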