In practice, clustering algorithms usually suffer from the complex structure of the dataset, including its data distribution and dimensionality. Meanwhile, the number of clusters, which is required as an input, is usually unavailable. In this paper, we propose a novel data clustering algorithm that uses heuristic rules based on a k-nearest neighbors chain and does not require the number of clusters as an input parameter. Inspired by the PageRank algorithm, we first use a random walk model to measure the importance of data points. Then, on the basis of the important data points, we build a K-Nearest Neighbors Chain (KNNC) to order the k-nearest neighbors by distance and propose two heuristic rules to find the proper number of clusters and the initial clusters. The first heuristic rule is the gap of the KNNC, which reflects the degree of separation of clusters with convex shapes, and the second is the nearest neighbor gap of the KNNC, which reflects the inner compactness of a cluster. Comprehensive comparison results on synthetic and real datasets indicate that the proposed clustering algorithm can find the proper number of clusters and achieve comparable or even better performance than popular clustering algorithms.
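The PageRank-inspired importance step can be illustrated with a minimal sketch. It assumes a directed k-nearest-neighbor graph and a standard damped random walk; the function names, the damping factor of 0.85, and the graph construction are illustrative assumptions, not details taken from the paper.

```python
import math

def knn_graph(points, k):
    """For each point, return the indices of its k nearest neighbours."""
    neighbours = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbours.append([j for _, j in dists[:k]])
    return neighbours

def random_walk_importance(neighbours, damping=0.85, iters=100):
    """PageRank-style scores: a point is important if many points walk to it."""
    n = len(neighbours)
    score = [1.0 / n] * n
    for _ in range(iters):
        nxt = [(1.0 - damping) / n] * n          # teleport term
        for i, outs in enumerate(neighbours):
            share = damping * score[i] / len(outs)
            for j in outs:                       # spread score along out-links
                nxt[j] += share
        score = nxt
    return score

# Four points in a tight square plus one distant outlier (hypothetical data).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = random_walk_importance(knn_graph(points, k=2))
```

In this toy dataset no point lists the outlier among its nearest neighbours, so the outlier receives only the teleport share and ends up with the lowest score.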
Hyperspectral imaging technology is used to sort varieties of seeds. However, the overall performance of prediction models decreases when they are used to test the same variety of seeds from different years or seasons. Prediction accuracy is susceptible to the influence of time and thus depends on the training set used to build the model. In this study, a model updating procedure that applies a clustering algorithm to hyperspectral imaging data for the classification of maize seeds was proposed to maintain the accuracy and robustness of the model. A total of 2000 seeds of four typical maize varieties grown in China in three different years were used for classification based on a least-squares support vector machine classifier. After determining and applying the model parameters, the updated model achieved an overall accuracy rate of 98.3%, which is higher than the 84.6% accuracy obtained using the non-updated model. The accuracy rate of the updated model was 94.8% when testing with the Kennard-Stone algorithm, which is commonly used for selecting datasets. The proposed model updating method can successfully update seed data for cross-year model building and thus improve the overall accuracy of predicting maize seeds harvested in different seasons.
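For readers unfamiliar with the Kennard-Stone baseline mentioned above, the following is a minimal sketch of the standard selection procedure: seed with the two most distant samples, then repeatedly add the sample whose nearest already-selected sample is farthest away. The function name and the toy data are illustrative assumptions; the study's actual spectra and settings are not reproduced here.

```python
import math

def kennard_stone(samples, n_select):
    """Return indices of n_select samples chosen by the Kennard-Stone rule."""
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and len(samples)")
    d = [[math.dist(a, b) for b in samples] for a in samples]
    # Seed with the two samples that are farthest apart.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: d[ij[0]][ij[1]],
    )
    selected = [i0, j0]
    while len(selected) < n_select:
        remaining = [i for i in range(n) if i not in selected]
        # Pick the sample whose closest selected sample is farthest away.
        nxt = max(remaining, key=lambda i: min(d[i][s] for s in selected))
        selected.append(nxt)
    return selected

# Hypothetical one-dimensional "spectra".
chosen = kennard_stone([(0,), (1,), (2,), (9,), (10,)], 3)
```

The selection spreads over the sample space: the two extremes are taken first, followed by the sample least covered by them.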
This paper proposes a distributed group mobility adaptive (DGMA) clustering algorithm for mobile ad hoc networks (MANETs) on the basis of a revised group mobility metric, linear distance based spatial dependency (LDSD), which is derived from the linear distance of a node's movement instead of its instantaneous speed and direction. In particular, it is suitable for group mobility patterns in which group partitioning and merging are prevalent behaviors of mobile groups. The proposed clustering scheme aims to form more stable clusters by prolonging cluster lifetime and reducing the number of clustering iterations, even in highly dynamic environments. Simulation results show that the performance of the proposed framework is superior to two widely referenced clustering approaches, the Lowest-ID clustering scheme and the mobility based clustering algorithm MOBIC, in terms of average clusterhead lifetime, average resident time, average number of clusterhead changes, and average number of cluster reaffiliations. (C) 2008 Elsevier B.V. All rights reserved.
Cell formation plays an important role in the design of cellular manufacturing systems. Among many methods utilized to solve the cell formation problem, the similarity coefficient method is the most widely used owing to its high flexibility and low computational requirement. However, generalized similarity coefficients ignore material flow, which is closely related to operation sequence and repeated operations. Moreover, most similarity coefficient-based clustering algorithms focus on the number of inter-cell movements but disregard distinction of the movement effort. To overcome these limitations, this study improves the generalized similarity coefficient method to form part families. In addition, a new clustering algorithm is presented to assign machines to cells with minimum intensity of material inter-cell movement, which depends on the frequency, production volume and difficulty level of inter-cell movement. Experimental results demonstrate that the proposed method has superior sensitivity and effectiveness for solving the cell formation problem.
Magnetic Resonance Imaging (MRI) is a medical imaging modality that is commonly employed for the analysis of different diseases. However, these images come with several problems, such as noise and other imaging artifacts added during the acquisition process, which make segmentation challenging. In medical imaging, a well-known clustering approach, Fuzzy C-Means (FCM), is widely used for segmentation. The FCM algorithm is fast on noise-free images; however, it does not consider the spatial context of the image, so its performance suffers when images are corrupted by noise and other imaging artifacts. In this paper, a weighted spatial Fuzzy C-Means (wsFCM) segmentation method is proposed that considers the spatial information of the image. Moreover, a spatial function is developed and integrated into the membership function. To evaluate this function, a neighborhood window is established around a pixel, and larger weights are assigned to those pixels that have greater correlation with the central pixel in the local neighborhood. With this spatial function integrated, the modified membership function strengthens the original membership function in handling noise and intensity inhomogeneity, and it is able to preserve structural information such as edges. A comprehensive set of experiments is performed on publicly accessible simulated and real standard brain MRI datasets. The performance of the proposed method has been compared with existing state-of-the-art methods. The results show that the proposed method performs better and is more robust in handling noise and intensity inhomogeneity than existing works. (C) 2019 Production and hosting by Elsevier B.V. on behalf of Faculty of Computers and Information, Cairo University.
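As background for the membership function that wsFCM modifies, here is a minimal sketch of the standard (non-spatial) FCM membership update for a single pixel intensity. The spatial weighting proposed in the paper is not reproduced; the function name, the scalar intensities, and the fuzzifier value m = 2 are illustrative assumptions.

```python
def fcm_memberships(value, centers, m=2.0):
    """Standard FCM update: u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1))."""
    dists = [abs(value - c) for c in centers]
    if any(d == 0.0 for d in dists):
        # The pixel coincides with a centre: give it full membership there.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]

# A pixel of intensity 1 with cluster centres at 0 and 3 (hypothetical values).
u = fcm_memberships(1.0, [0.0, 3.0])
```

Because this update depends only on the pixel's own intensity, a single noisy pixel can flip its cluster assignment, which is the weakness that spatially weighted variants aim to address.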
In fishery aquaculture, water quality directly determines the economic benefits of aquatic products, and dissolved oxygen is an important factor affecting water quality. To accurately grasp the trends of variation in dissolved oxygen, a dissolved oxygen concentration forecasting model based on an enhanced clustering algorithm and Adam with a radial basis function neural network (ECA-Adam-RBFNN) is proposed. An enhanced clustering algorithm (ECA) combining K-means with ant colony optimization is introduced in place of random selection to determine the center positions of the neural network hidden layer units. If the number of center points is too high, the neural network will overfit, whereas if it is too low, sudden changes will appear in the results. Once the hidden layer centers have been determined, the radial basis function (RBF) width is calculated from the maximum center distance and the number of center points to avoid the two extreme cases of RBFs that are too peaked or too flat. The recursive least squares (RLS) algorithm is introduced to obtain the connection weights from the hidden layer to the output layer. The Adam algorithm is introduced to iteratively differentiate the objective function to adjust the center values, weights and width while adaptively varying the learning rates for these three types of parameters. Finally, the improved forecasting algorithm is applied for the prediction of the dissolved oxygen concentration in fishery aquaculture. The experimental results show that under identical conditions, compared with a long short-term memory (LSTM) network, a backpropagation neural network (BPNN), a traditional RBF neural network, a support vector regression (SVR) model, an autoregressive integrated moving average (ARIMA) model, K-MLPNN (K-means multilayer perceptron neural networks), and an SC-K-means-RBF model, the improved algorithm achieves significant reductions in the mean absolute error (MAE), mean absolute percentage error (MAPE) and root mean square
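The width rule described above (width from the maximum center distance and the number of centers) can be sketched with one widely used heuristic, sigma = d_max / sqrt(2M). Whether the paper uses exactly this formula is an assumption; the function names and example centers are hypothetical.

```python
import math

def rbf_width(centers):
    """Common heuristic: sigma = d_max / sqrt(2 * M) for M centres."""
    m = len(centers)
    d_max = max(math.dist(a, b) for a in centers for b in centers)
    return d_max / math.sqrt(2 * m)

def rbf_activation(x, center, width):
    """Gaussian RBF response of one hidden unit."""
    return math.exp(-math.dist(x, center) ** 2 / (2 * width ** 2))

# Two hypothetical hidden-layer centres that are 5 units apart.
centers = [(0.0, 0.0), (3.0, 4.0)]
sigma = rbf_width(centers)
```

Tying the width to the spread of the centers keeps neighboring units overlapping moderately: more centers over the same span give narrower units, fewer give wider ones.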
Compared to hesitant fuzzy sets and intuitionistic fuzzy sets, dual hesitant fuzzy sets can model problems in the real world more comprehensively. Dual hesitant fuzzy sets explicitly show a set of membership degrees and a set of non-membership degrees, which also imply a set of important data: hesitant degrees. The traditional definition of distance between dual hesitant fuzzy sets only considers membership degree and non-membership degree, but hesitant degree should also be taken into account. To this end, using these three important data sets (membership degree, non-membership degree and hesitant degree), we first propose a variety of new distance measurements (the generalized normalized distance, generalized normalized Hausdorff distance and generalized normalized hybrid distance) for dual hesitant fuzzy sets in this paper, based on which the corresponding similarity measurements can be obtained. In these distance definitions, membership degree, non-membership degree and hesitant degree are of equal importance. Second, we propose a clustering algorithm that uses these distances in dual hesitant fuzzy information systems. Finally, a numerical example is used to illustrate the performance and effectiveness of the clustering algorithm. The clustering results in a dual hesitant fuzzy information system are compared using the distance measurements mentioned in the paper, which verifies the utility and advantage of our proposed distances. Our work provides a new way to improve the performance of clustering algorithms in dual hesitant fuzzy information systems.
Articulateness and plasticity are two essential attributes that make a graph an efficient model for real-life problems. Nowadays, attributed graphs have received a lot of attention because of their usability and effectiveness. In this study, a novel k-Medoid based clustering algorithm, which focuses simultaneously on both structural and contextual aspects using the Signal and weighted Jaccard similarities, is introduced. Two real-life datasets, Political Blogs and the DBLP bibliography, are employed to evaluate and compare the proposed algorithm with state-of-the-art clustering algorithms. The results show the superiority of the proposed algorithm in terms of cluster quality metrics.
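The weighted Jaccard similarity used for the contextual (attribute) side can be sketched as follows, using the usual definition: the sum of element-wise minima divided by the sum of element-wise maxima. The dictionary representation of node attributes and the example weights are illustrative assumptions, and the Signal similarity is not covered here.

```python
def weighted_jaccard(a, b):
    """Weighted Jaccard: sum of minima over sum of maxima of attribute weights."""
    keys = set(a) | set(b)
    num = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    den = sum(max(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    # Two empty attribute vectors are treated as identical.
    return num / den if den > 0 else 1.0

# Hypothetical attribute weights for two nodes (e.g. keyword counts).
node_a = {"ml": 2, "db": 1}
node_b = {"ml": 1, "ir": 1}
similarity = weighted_jaccard(node_a, node_b)
```

With binary weights this reduces to the ordinary Jaccard index, so it can be applied to both count-valued and set-valued attributes.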
Keeping digital road maps up to date is of critical importance, because the quality of many road-dependent services relies on it, but traditional measurement methods are still time-consuming and expensive. With GPS technology and wireless communication technology maturing, positioning data from floating cars have become a new data source for updating road maps. The paper presents a novel incremental clustering algorithm for automatically extracting the topology of the road network from floating car data. A trajectory is first selected as a road Link, and then the remaining trajectories are added in turn until all tracks are processed. The algorithm determines whether to merge a newly added trajectory into an existing Link or to split it off as a new Link by judging the spatial relationship between the trajectory and the existing Link. A partial curve matching method based on the Fréchet distance is employed to measure the partial similarity between a Trajectory and a Link, and the time complexity of the proposed algorithm is reduced. Experiments show that the algorithm can quickly extract the geometric shape and topology of the road network with lightweight floating car data.
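To make the similarity measure concrete, here is a minimal sketch of the discrete Fréchet distance between two polylines, computed by dynamic programming. This is the full-curve discrete variant, not the partial curve matching method the paper describes; the function name and the sample coordinates are illustrative assumptions.

```python
import math

def discrete_frechet(p, q):
    """Discrete Frechet distance between polylines p and q (lists of points)."""
    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]  # ca[i][j]: coupling cost up to (i, j)
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Advance on p, on q, or on both; keep the cheapest history.
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]

# A hypothetical trajectory running parallel to a Link, one unit away.
trajectory = [(0, 0), (1, 0), (2, 0)]
link = [(0, 1), (1, 1), (2, 1)]
distance = discrete_frechet(trajectory, link)
```

Unlike a point-to-point average, this measure respects the order of points along both curves, which matters when two roads cross or run in opposite directions.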
The cooperative relay network exploits the space diversity gain by allowing cooperation among users to improve transmission quality. It is an important issue to identify the cluster-head (or relay node) and the members that are to cooperate with it. The cluster-head consumes more battery power than an ordinary node since it has extra responsibilities, i.e., ensuring the cooperation of its members' transmissions; consequently, the cluster-head has a lower throughput than the average. Since users join or leave the clusters from time to time, the network topology changes and the network may not be stable. How to balance fairness among users against network stability is a very interesting topic. This paper proposes an adaptive weighted clustering algorithm (AWCA), in which weight factors are introduced to adaptively control both stability and fairness according to the number of arriving users. It is shown that when the number of arriving users is large, AWCA has a lifetime longer than that of FWCA and similar to that of SWCA, and that when the number of arriving users is small, AWCA provides fairness higher than that of SWCA and close to that of FWCA.