Federated clustering lets multiple data owners collaborate in discovering patterns from distributed data without violating privacy requirements. The federated versions of traditional clustering algorithms proposed so ...
In this paper, new measures, called clustering performance measures (CPMs), are proposed for assessing the reliability of a clustering algorithm. These CPMs are defined using a validation measure, which determines how well the algorithm works with a given set of parameter values, and a repeatability measure, which is used for studying the stability of the clustering solutions and can estimate the correct number of clusters in a dataset. The proposed CPMs can be used to evaluate clustering algorithms that have a structure bias toward certain types of data distribution as well as those that have no structure bias. Additionally, we propose a novel cluster validity index, the V_I index, which is able to handle non-spherical clusters. Five clustering algorithms are evaluated on different types of real-world and synthetic data. The first dataset type is a communications signal dataset representing one modulation scheme under a variety of noise conditions, the second comprises two breast cancer datasets, and the third comprises different synthetic datasets with arbitrarily shaped clusters. Additionally, comparisons with other methods for estimating the number of clusters indicate the applicability and reliability of the proposed V_I cluster validity index and repeatability measure for correct estimation of the number of clusters.
Increasing the lifespan of a group of distributed wireless sensors is one of the major challenges in research. This is especially important for distributed wireless sensor nodes used in harsh environments, since it is not feasible to replace or recharge their batteries. The popular low-energy adaptive clustering hierarchy (LEACH) algorithm uses the "computation and communication energy model" to increase the lifespan of distributed wireless sensor nodes. As an improvement, we show here that a combination of three clustering algorithms performs better than the LEACH algorithm. The combination comprises the k-means++, k-means, and gap statistic algorithms. These three algorithms are used selectively in the following manner: the k-means++ algorithm initializes the centers for the k-means algorithm, the k-means algorithm computes the optimal centers of the clusters, and the gap statistic algorithm selects the optimal number of clusters in a distributed wireless sensor network. Our simulation shows that the combination of clustering algorithms increases the lifespan of the wireless sensor nodes by 15% compared with the LEACH algorithm. This paper reports the details of the clustering algorithms selected for the combination approach and, based on the simulation results, compares the performance of the combination approach with that of the LEACH algorithm.
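The division of labour among the three algorithms can be illustrated with a minimal pure-Python sketch. It is not the paper's simulation: it does not model LEACH or any energy consumption, and the function names, the restart count, the uniform bounding-box reference distribution, and the synthetic 2-D "node positions" are all assumptions made for illustration. It shows only k-means++ seeding, Lloyd's k-means iterations, and a gap-statistic rule for choosing the number of clusters.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_init(points, k, rng):
    """k-means++ seeding: draw each new center with probability
    proportional to its squared distance from the nearest chosen center."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(weights) == 0:  # every point already coincides with a center
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def inertia(points, centers):
    """Within-cluster sum of squared distances to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def kmeans(points, k, rng, iters=50, restarts=3):
    """Lloyd's k-means with k-means++ seeding; keeps the best of a few restarts."""
    best = None
    for _ in range(restarts):
        centers = kmeanspp_init(points, k, rng)
        for _ in range(iters):
            clusters = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
                clusters[nearest].append(p)
            # Move each center to its cluster mean; keep it if the cluster is empty.
            updated = [
                tuple(sum(col) / len(c) for col in zip(*c)) if c else centers[i]
                for i, c in enumerate(clusters)
            ]
            if updated == centers:
                break
            centers = updated
        if best is None or inertia(points, centers) < inertia(points, best):
            best = centers
    return best


def log_dispersion(points, k, rng):
    """log(W_k), guarded against a zero dispersion."""
    return math.log(max(inertia(points, kmeans(points, k, rng)), 1e-12))


def gap_statistic(points, k_max, rng, n_refs=10):
    """Smallest k with Gap(k) >= Gap(k+1) - s_(k+1), using reference
    data drawn uniformly from the bounding box of the points."""
    dims = list(zip(*points))
    lo, hi = [min(d) for d in dims], [max(d) for d in dims]
    gaps, errs = [], []
    for k in range(1, k_max + 1):
        ref_logs = []
        for _ in range(n_refs):
            ref = [tuple(rng.uniform(a, b) for a, b in zip(lo, hi)) for _ in points]
            ref_logs.append(log_dispersion(ref, k, rng))
        mean = sum(ref_logs) / n_refs
        sd = math.sqrt(sum((v - mean) ** 2 for v in ref_logs) / n_refs)
        gaps.append(mean - log_dispersion(points, k, rng))
        errs.append(sd * math.sqrt(1 + 1 / n_refs))
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - errs[k]:
            return k
    return k_max


# Hypothetical node positions: two groups of 40 nodes around x = 0 and x = 10.
rng = random.Random(0)
nodes = [(rng.gauss(cx, 0.5), rng.gauss(0.0, 0.5)) for cx in (0.0, 10.0) for _ in range(40)]
best_k = gap_statistic(nodes, k_max=4, rng=rng)
centers = sorted(kmeans(nodes, best_k, rng))
print(best_k, centers)
```

In a sensor-network setting the resulting centers would presumably guide the choice of cluster heads, but that mapping is not specified in the abstract and is left out of the sketch.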
Accurate perception of the performance degradation of a fuel cell is very important for detecting its health ***, inconsistent operating conditions of fuel cell vehicles in the test result in errors in the *** order to obtain a more credible degradation rate, this study proposes a novel method to classify the experimental data collected under different working conditions into similar operating conditions by using dimensionality reduction and clustering ***, the experimental data collected from fuel cell vehicles belong to high-dimensional *** projecting high-dimensional data into a three-dimensional feature vector space via principal component analysis (PCA). The dimension-reduced three-dimensional feature vectors are input into a clustering algorithm, such as K-means or density-based spatial clustering of applications with noise (DBSCAN). According to the clustering results, the fuel cell voltage data with similar operating conditions can be ***, the selected voltage data can be used to precisely represent the true performance degradation of an on-board fuel cell *** results show that the voltage selected using the K-means algorithm declines the fastest, followed by that selected using the DBSCAN algorithm and finally the original data, which indicates that the performance of the fuel cell actually declines faster than the original data indicate. Early intervention can prolong its life to the greatest extent.
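The pipeline described here (reduce to three dimensions with PCA, then cluster to group similar operating conditions) can be sketched in pure Python as follows. This is an illustrative sketch under stated assumptions, not the study's implementation: PCA is done by power iteration with deflation, DBSCAN is a textbook version, and the five-feature synthetic data, the `eps` and `min_pts` values, and all function names are hypothetical.

```python
import math
import random


def pca_project(rows, n_components=3, iters=200):
    """Project rows onto their top principal components,
    found by power iteration with deflation on the covariance matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    rng = random.Random(0)
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate for v.
        eigval = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d))
        components.append(v)
        # Deflate so the next pass converges to the next component.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(d)] for a in range(d)]
    return [tuple(sum(r[j] * c[j] for j in range(d)) for c in components)
            for r in centered]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # core point: keep expanding
                queue.extend(reachable)
    return labels


# Hypothetical data: 5 features per sample, two operating regimes of 40 samples each.
rng = random.Random(1)
regime_a = [[rng.gauss(0.0, 0.5) for _ in range(5)] for _ in range(40)]
regime_b = [[rng.gauss(8.0 if j < 2 else 0.0, 0.5) for j in range(5)] for _ in range(40)]
features = pca_project(regime_a + regime_b)
labels = dbscan(features, eps=2.0, min_pts=4)
print(sorted(set(labels)))
```

Samples that share a label would then be treated as having similar operating conditions, and their voltage readings compared over time; how the study selects among clusters is not stated in the abstract.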
The emergence of the World Wide Web during the past few years has provided a medium for communicating information faster and to more people than before. The technologies used allow for the development of personalised information systems that adapt to users' needs. So far, the complexity of designing and implementing Virtual Environments has restricted their use to locally executed, stand-alone applications. In this paper we propose an architecture that permits and facilitates the dynamic, on-the-fly creation of Virtual Environments on the Web that adapt to users' preferences and profiles. We focus on the algorithms available for creating an efficient virtual environment generation engine. We illustrate the proposed architecture with examples from a case study of a Virtual Museum.
In the field of online learning, the development of learning objects (LOs) has increased. LOs promote reusing and referencing educational content in various learning environments. However, despite this progress, ...
Advanced Persistent Threat (APT) attacks have become one of the most complex types of attack. They target sensitive information. Many cybersecurity systems have been developed to detect APT attacks from network data traffic and requests. However, these systems still need to be improved to identify such attacks effectively, owing to their complexity and slow-moving nature. An APT gains access to an organization through an active directory, by gaining remote access, or even by targeting the Domain Name Server (DNS). Many machine learning (ML) techniques have been implemented to detect APT attacks using tools available on the market. However, there are still limitations in terms of accuracy, efficiency, and effectiveness, especially the lack of labeled data for training ML methods. This paper proposes a framework to detect APT attacks using the most applicable clustering algorithms, such as APRIORI, K-means, and Hunt's algorithm. To evaluate and compare the performance of the proposed framework, several experiments are conducted on a public dataset. The experimental results show that the Support Vector Machine with Radial Basis Function (SVM-RBF) achieves the highest accuracy, reaching about 99.2%. This result confirms the effectiveness of the developed framework for detecting attacks from network data traffic.
"Incremental Learning (IL)" is a niche area of "Machine Learning." It is essential to keep learning incremental for ever-increasing data from all domains for effectual decisions, predic...
The major evolution of the semantic web has become exchanging data between applications in all domains of activities. Based on this vision, different applications in recent days, e.g. in the fields of community web po...
Pattern Recognition and Data Mining pose several problems in which, by their inherent nature, it is considered that an object can belong to more than one class; that is, clusters can overlap each other. OClustR and DCl...