ISBN: 9781467376983 (print)
The problem of object classification using the SVM algorithm is considered. Ways of forming the training set for the SVM algorithm are analyzed; they account, in various ways, for the classification decisions obtained from fuzzy clustering algorithms. The possibility of forming the training set with an ensemble of fuzzy clustering algorithms, based on similarity matrices of the cluster label vectors, is shown.
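One possible reading of "similarity matrices of the cluster label vectors" is a co-association matrix, where entry (i, j) is the fraction of ensemble members that place objects i and j in the same cluster. The sketch below illustrates only that reading and is not the authors' method: it uses crisp (hardened) labels instead of fuzzy memberships, and the function names, the agreement threshold of 1.0, and the toy label vectors are all hypothetical.

```python
def co_association(label_vectors):
    """Entry (i, j) is the fraction of clusterings that put i and j together."""
    n = len(label_vectors[0])
    counts = [[0] * n for _ in range(n)]
    for labels in label_vectors:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(label_vectors)
    return [[c / m for c in row] for row in counts]


def consensus_groups(matrix, threshold=1.0):
    """Connected components of the graph linking i and j when similarity >= threshold."""
    n = len(matrix)
    seen = [False] * n
    groups = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, members = [start], []
        while stack:
            i = stack.pop()
            members.append(i)
            for j in range(n):
                if not seen[j] and matrix[i][j] >= threshold:
                    seen[j] = True
                    stack.append(j)
        groups.append(sorted(members))
    return groups


# Hypothetical ensemble: three clusterings of six objects.
# Label names differ between members; only co-membership matters.
label_vectors = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
]
similarity = co_association(label_vectors)
groups = consensus_groups(similarity)

# Assumption: only objects on which every ensemble member agrees
# (groups with more than one object) enter the SVM training set.
stable = [g for g in groups if len(g) > 1]
training_set = {obj: label for label, members in enumerate(stable) for obj in members}
```

In this toy case object 2 is grouped inconsistently across the ensemble, so it is left out of the training set, while the remaining objects receive consensus labels that could be passed to any SVM implementation.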
ISBN: 9781424445608 (print)
As one of the most widely investigated topology control mechanisms for wireless sensor networks (WSNs), clustering provides energy-efficient communication by reducing transmission overhead and enhancing transmission reliability. In previous noncooperative game formulations, each sensor node (SN) behaves individually, which leads to an uneven distribution of residual energy across SNs and hastens network partition. To balance the energy consumption of SNs and increase network lifetime and stability, a cooperative game-theoretic model of clustering algorithms is provided for assigning feasible allocations of energy cost. Moreover, from the outcome of this model, we propose and analyze a cooperative clustering approach for global optimization of sensing data transmission capacity and energy efficiency. The key idea is that SNs should trade off individual cost against network-wide cost. In the algorithm, we develop conditions for forming coalitions that consider residual energy, transmission distance, and the number of SNs in a cluster, adapting to various WSNs. Furthermore, we quantitatively evaluate and compare existing clustering algorithms with our approach with respect to network lifetime, data transmission capacity, and energy efficiency. In simulation, compared with other approaches, our scheme prolongs network lifetime and improves data transmission capacity by up to 5.8% and 35.9%, respectively.
ISBN: 9783319463490; 9783319463483 (print)
Several privacy measures have been proposed in the privacy-preserving data mining literature. However, these measures assume either a centralized data source or that no insider will try to infer information. This paper presents distributed privacy measures that take into account collusion attacks and point-level breaches in distributed data clustering. An analysis of representative distributed data clustering algorithms shows that collusion is an important source of privacy issues and that the analyzed algorithms exhibit different vulnerabilities to collusion groups.
ISBN: 9783037854235 (print)
This research applied a constrained k-prototypes algorithm to a database of 294 fatal falls that occurred between 2003 and 2010 in the Taiwanese construction industry. The objective of the study is to explore the circumstances of the fatal falls. The results indicate that the primary circumstances under which the accidents occurred include the professional category of other specialized construction businesses; hazardous mediums such as scaffolding, support frames, stairs, stairways, and openings; the time period between 9:00 am and 11:00 am; and a falling height between 0 and 10 meters. Meanwhile, not adopting safe construction methods is a major indirect factor contributing to the accidents. The fundamental contributing factors include not clearly informing subcontractors, before the project commences, of the conditions of the work environment, the risk factors, the labor health and safety regulations, and the measures necessary to ensure health and safety.
ISBN: 9781728106854 (print)
Most clustering algorithms are designed as sequential algorithms that require all data to be present, which limits an implementation to a single machine and rules out horizontal scalability. This is problematic today, when the volume of data grows each day and quick processing is essential. Hence, in this paper we propose a platform that allows clustering algorithms to run in a distributed manner. This is achieved by splitting the data into smaller, equal partitions and by redesigning the original clustering algorithms so that they work on a subset of the input data without interacting with the processing of the rest. At the end, a so-called reduce phase aggregates the partial results obtained from each partition and produces the global result.
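The split, independent processing, and reduce phases described above can be sketched with one k-means update step. This is a minimal illustration and not the platform from the paper: the choice of k-means, the function names, and the toy points are assumptions, and the partitions are processed in a plain loop where a real system would use separate workers.

```python
def split_equal(points, n_parts):
    """Split points into contiguous chunks whose sizes differ by at most one."""
    size, extra = divmod(len(points), n_parts)
    parts, start = [], 0
    for p in range(n_parts):
        end = start + size + (1 if p < extra else 0)
        parts.append(points[start:end])
        start = end
    return parts


def map_partition(partition, centroids):
    """Work on one partition only: per-centroid coordinate sums and counts."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in partition:
        nearest = min(
            range(len(centroids)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(point, centroids[k])),
        )
        counts[nearest] += 1
        for d in range(dim):
            sums[nearest][d] += point[d]
    return sums, counts


def reduce_partials(partials, centroids):
    """Reduce phase: combine partial sums and counts into global centroids."""
    dim = len(centroids[0])
    total_sums = [[0.0] * dim for _ in centroids]
    total_counts = [0] * len(centroids)
    for sums, counts in partials:
        for k in range(len(centroids)):
            total_counts[k] += counts[k]
            for d in range(dim):
                total_sums[k][d] += sums[k][d]
    return [
        [s / total_counts[k] for s in total_sums[k]]
        if total_counts[k]
        else list(centroids[k])  # keep an empty cluster's centroid unchanged
        for k in range(len(centroids))
    ]


# Hypothetical toy data: five 2-D points, two starting centroids, two partitions.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0), (2.0, 0.0)]
centroids = [(1.0, 1.0), (9.0, 9.0)]
partitions = split_equal(points, 2)
partials = [map_partition(part, centroids) for part in partitions]
new_centroids = reduce_partials(partials, centroids)
```

Because each partition returns only sums and counts, the reduce step yields the same centroids as a single-machine update on the full data, which is what makes this particular step easy to distribute.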
ISBN: 9781450326421 (print)
Extreme-scale computing poses a number of challenges to application performance. Developers need to study application behavior by collecting detailed information with tracing toolsets in order to determine shortcomings. However, applications are not the only thing that is "scalability challenged": current tracing toolsets also fall short of exascale requirements for low background overhead, since trace collection for each execution entity is becoming infeasible. One effective solution is to cluster processes with the same behavior into groups. Instead of collecting performance information from each individual node, this information can be collected from just a set of representative nodes. This work contributes a fast, scalable, signature-based clustering algorithm that clusters processes exhibiting similar execution behavior. Unlike prior work based on statistical clustering, our approach produces precise results nearly without loss of events or accuracy. The proposed algorithm combines low overhead at the clustering level with log(P) time complexity, and it splits the merge process to make tracing suitable for extreme-scale computing. Overall, this multi-level precise clustering based on signatures further generalizes to a novel multi-metric clustering technique with unprecedented low overhead.
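The general idea of grouping processes by a behavior signature and merging the groups in a logarithmic number of rounds can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: the signature here is simply a SHA-256 hash of the event sequence, the merge is a pairwise tree reduction simulated in one process, and the event names and ranks are hypothetical.

```python
import hashlib


def signature(events):
    """Assumed signature: a hash of the ordered event names of one process."""
    return hashlib.sha256("\n".join(events).encode("utf-8")).hexdigest()


def merge(left, right):
    """Combine two signature tables (signature -> list of ranks)."""
    merged = {sig: list(ranks) for sig, ranks in left.items()}
    for sig, ranks in right.items():
        merged.setdefault(sig, []).extend(ranks)
    return merged


def cluster_by_signature(traces):
    """traces maps rank -> list of event names; returns (clusters, merge rounds)."""
    tables = [{signature(events): [rank]} for rank, events in sorted(traces.items())]
    if not tables:
        return [], 0
    rounds = 0
    while len(tables) > 1:
        # Pairwise tree merge: the number of tables halves each round,
        # so P processes need about log2(P) rounds.
        tables = [
            merge(tables[i], tables[i + 1]) if i + 1 < len(tables) else tables[i]
            for i in range(0, len(tables), 2)
        ]
        rounds += 1
    clusters = sorted(sorted(ranks) for ranks in tables[0].values())
    return clusters, rounds


# Hypothetical traces for eight processes.
traces = {0: ["MPI_Bcast", "MPI_Send"]}
for rank in range(1, 8):
    traces[rank] = ["MPI_Recv", "MPI_Send"] if rank % 2 else ["MPI_Send", "MPI_Recv"]

clusters, rounds = cluster_by_signature(traces)
representatives = [members[0] for members in clusters]
```

Only the representatives (here the lowest rank of each cluster) would need to be traced in full; exact hash matching groups identical behavior only, so a practical signature would have to tolerate small differences between processes.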
ISBN: 9789380544342 (print)
In the present era, data analysis plays a vital role in various domains. Data clustering is a data analysis technique that groups data objects through unsupervised learning. Many clustering algorithms have been proposed in the literature, and each has its strengths and weaknesses. Therefore, one set of clustering algorithms is appropriate for one set of application areas, while another set is suitable for other application areas. This paper discusses popular traditional algorithms and presents a comprehensive comparative study of different clustering algorithms, comparing them in detail on the basis of the parameters used in these methods.
ISBN: 9781479985470 (print)
Energy consumption affects the lifetime of Wireless Sensor Networks (WSNs) and may cause network degradation. Much work has focused on techniques for reducing consumed energy. The energy consumed during communication is affected exponentially by the distance between the communicating nodes; the greater the distance between two nodes, the more energy is consumed. Clustering is used to help reduce the energy consumed in wireless data transmission. Clustering gathers the nodes into groups called clusters, and one node from each cluster is elected to be the cluster head (CH). Deciding the optimal number of clusters and which sensors should be CHs is a challenging problem. We presented two hybrid clustering algorithms, K-Means Particle Swarm Optimization (KPSO) and K-Means Genetic Algorithms (KGAs), in [1], [2], with significant improvement over the traditional Low-Energy Adaptive Clustering Hierarchy (LEACH) protocol. By considering the various antenna patterns for WSNs, we were able to improve the energy-saving performance of the clustering algorithm. In this article, we review our presented algorithms and present in detail the new antenna pattern design based on PSO and GAs.
Radio channel propagation models for the millimeter wave (mmWave) spectrum are extremely important for planning future 5G wireless communication systems. Transmitted radio signals are received as clusters of multipath rays. Identifying these clusters provides better spatial and temporal characteristics of the mmWave channel. This paper deals with the clustering process and its validation across a wide range of frequencies in the mmWave spectrum below 100 GHz. By way of simulations, we show that in outdoor communication scenarios clustering of received rays is influenced by the frequency of the transmitted signal. This demonstrates the sparse characteristic of the mmWave spectrum (i.e., we obtain a lower number of rays at the receiver for the same urban scenario). We use the well-known k-means clustering algorithm to group arriving rays at the receiver. The accuracy of this partitioning is studied with both cluster validity indices (CVIs) and score fusion techniques. Finally, we analyze how the clustering solution changes with narrower-beam antennas, and we provide a comparison of the cluster characteristics for different types of antennas.
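Grouping arriving rays with k-means and then scoring the partition with a cluster validity index can be sketched as below. This is a minimal pure-Python illustration, not the paper's setup: the silhouette coefficient stands in for the unspecified CVIs, the ray features (delay in ns, azimuth in degrees) and their values are hypothetical, and the features are left unscaled for brevity.

```python
import math


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, initial_centroids, iterations=20):
    """Plain Lloyd iterations from the given starting centroids."""
    centroids = [list(c) for c in initial_centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def silhouette(points, labels):
    """Mean silhouette coefficient; requires at least two clusters."""
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points) if j != i and labels[j] == labels[i]]
        if not own:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        a = sum(own) / len(own)
        b = min(
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == other)
            / labels.count(other)
            for other in set(labels)
            if other != labels[i]
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical rays: (delay in ns, azimuth of arrival in degrees).
rays = [(10.0, 30.0), (12.0, 32.0), (11.0, 29.0),
        (50.0, 120.0), (52.0, 118.0), (51.0, 121.0)]

labels_2, _ = kmeans(rays, [rays[0], rays[3]])
labels_3, _ = kmeans(rays, [rays[0], rays[3], rays[1]])
score_2 = silhouette(rays, labels_2)
score_3 = silhouette(rays, labels_3)
best_k = 2 if score_2 >= score_3 else 3
```

Comparing the index across candidate values of k is one simple way to choose the number of clusters; combining several indices, as in the score fusion mentioned above, would replace the single comparison at the end.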
Despite an increasing consensus regarding the significance of properly identifying the most suitable clustering method for a given problem, a surprising amount of educational research, including both educational data mining (EDM) and learning analytics (LA), neglects this critical task. This shortcoming could in many cases have a negative impact on the predictive power of both EDM- and LA-based approaches. To address such issues, this work proposes an evaluation approach that automatically compares several clustering methods using multiple internal and external performance measures on 9 real-world educational datasets of different sizes, created from the University of Tartu's Moodle system, to produce two-way clustering. Moreover, to investigate the possible effect of normalization on the performance of the clustering algorithms, this work performs the same experiment on a normalized version of the datasets. Since such an exhaustive evaluation includes multiple criteria, the proposed approach employs a multiple criteria decision-making method (i.e., TOPSIS) to rank the most suitable methods for each dataset. Our results reveal that the proposed approach can automatically compare the performance of the clustering methods and accordingly recommend the most suitable method for each dataset. Furthermore, our results show that in both normalized and nonnormalized datasets of different sizes with 10 features, DBSCAN and k-medoids are the best clustering methods, whereas agglomerative and spectral methods appear to be among the most stable and highly performing clustering methods for such datasets with 15 features. Regarding datasets with more than 15 features, OPTICS is among the top-ranked algorithms for the nonnormalized datasets, and k-medoids is the best for the normalized datasets. Interestingly, our findings reveal that normalization may have a negative effect on the performance of certain methods, e.g., spectral clustering and OPTICS; however, it appears to m
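The TOPSIS ranking step mentioned in the last abstract can be sketched in a few lines. This follows the standard textbook procedure (vector normalization, weighting, distances to the ideal best and worst points) and is not taken from the paper; the method names, scores, weights, and the choice of two criteria are invented purely for illustration.

```python
import math


def topsis(matrix, weights, benefit):
    """Closeness score in [0, 1] for each alternative (row); higher is better.

    matrix[i][c] is the value of alternative i on criterion c,
    benefit[c] is True when larger values of criterion c are better.
    """
    n_criteria = len(weights)
    # Vector-normalize each column; a zero column is left unscaled.
    norms = [
        math.sqrt(sum(row[c] ** 2 for row in matrix)) or 1.0
        for c in range(n_criteria)
    ]
    weighted = [
        [weights[c] * row[c] / norms[c] for c in range(n_criteria)]
        for row in matrix
    ]
    columns = list(zip(*weighted))
    best = [max(col) if benefit[c] else min(col) for c, col in enumerate(columns)]
    worst = [min(col) if benefit[c] else max(col) for c, col in enumerate(columns)]

    scores = []
    for row in weighted:
        d_best = math.sqrt(sum((v - b) ** 2 for v, b in zip(row, best)))
        d_worst = math.sqrt(sum((v - w) ** 2 for v, w in zip(row, worst)))
        total = d_best + d_worst
        scores.append(d_worst / total if total > 0 else 0.0)
    return scores


# Made-up example: three clustering methods scored on an index where
# higher is better (column 0) and one where lower is better (column 1).
names = ["method_A", "method_B", "method_C"]
matrix = [
    [0.8, 0.5],
    [0.6, 0.9],
    [0.4, 1.2],
]
scores = topsis(matrix, weights=[0.5, 0.5], benefit=[True, False])
ranking = [name for _, name in sorted(zip(scores, names), reverse=True)]
```

Here method_A is best on both criteria and method_C worst on both, so they receive the extreme closeness scores and method_B falls in between; with conflicting criteria the weights decide the order.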