Big data is actively accumulating in social, corporate, scientific, and other information environments. Its intensive use across fields has stimulated researchers' interest in methods and tools for processing and analyzing massive, highly varied data volumes. One promising area of data-intensive analytics is cluster analysis, which addresses problems such as reducing the dimensionality of the original dataset and identifying patterns. In this article, the authors propose an ensemble of clustering algorithms built on the basic K-means algorithm, in which each member is characterized by a single parameter: the distance metric between objects. The performance of the ensemble was evaluated on data from the open UCI data archive.
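As a rough illustration of the ensemble idea described in this abstract, the sketch below runs a K-means-style loop once per distance metric and merges the resulting partitions through a co-association matrix. The abstract does not state how the authors combine ensemble members, so the consensus step, the farthest-first initialization, the mean-based center update for non-Euclidean metrics, and all function names here are assumptions made for illustration only.

```python
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def kmeans(points, k, dist, iters=50):
    """K-means-style loop with a pluggable distance; returns one label per point."""
    # Assumption: deterministic farthest-first initialization, for reproducibility.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous center if a cluster becomes empty
                # Simplification: the mean is used for every metric.
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def co_association(labelings):
    """Fraction of labelings in which each pair of points shares a cluster."""
    n = len(labelings[0])
    return [[sum(lab[i] == lab[j] for lab in labelings) / len(labelings)
             for j in range(n)] for i in range(n)]

def consensus(labelings, threshold=0.5):
    """Link points that co-occur in more than `threshold` of the labelings."""
    n = len(labelings[0])
    co = co_association(labelings)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if co[i][j] > threshold:
                parent[find(i)] = find(j)
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]

def ensemble(points, k, metrics=(euclidean, manhattan, chebyshev)):
    """One K-means run per metric, merged by majority co-association."""
    return consensus([kmeans(points, k, m) for m in metrics])
```

Calling `ensemble(points, 2)` on a list of coordinate tuples returns one consensus label per point. The 0.5 threshold is a majority vote over the ensemble members; a stricter value would require more members to agree before two points are linked.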
Among the power system corrective controls, defensive islanding is considered as the last resort to secure the system from severe cascading contingencies. The primary motive of defensive islanding is to limit the affe...
In this work, we develop a new method of setting the input to reservoir and reservoir to reservoir weights in echo state machines. We use a clustering technique which we have previously developed as a pre-processing s...
ISBN: 9781509024407 (print)
This paper proposes a Gaussian basis function neural network compensator with fuzzy control for a magnetic bearing system (MBS), with the basis functions adjusted by clustering algorithms. The nonlinear MBS reduces the friction losses of traditional bearings, and a nonlinear system with a fuzzy controller and a neural network does not require a precise mathematical model of the MBS. Two clustering algorithms, fuzzy c-means and k-means, are used to adjust the Gaussian basis functions in the neural network. Finally, Lyapunov stability analysis is used to guarantee MBS convergence, and the experimental results show that the proposed algorithm performs satisfactorily on the MBS.
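To make the role of clustering in this abstract concrete, the following minimal sketch evaluates a Gaussian basis function layer whose centers are assumed to come from a clustering step such as k-means or fuzzy c-means. The abstract does not give the network structure, so the shared width `sigma`, the linear output layer, and the function names are assumptions rather than the authors' design.

```python
import math

def gaussian_basis(x, centers, sigma):
    """Activation of each Gaussian unit: exp(-||x - c||^2 / (2 * sigma^2))."""
    return [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, c)) / (2 * sigma ** 2))
        for c in centers
    ]

def network_output(x, centers, sigma, weights):
    """Weighted sum of the basis activations (a single linear output unit)."""
    return sum(w * phi for w, phi in zip(weights, gaussian_basis(x, centers, sigma)))
```

In this reading, the cluster centers found from the data are passed in as `centers`, so each basis function responds most strongly to inputs near one cluster.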
Incomplete data with missing feature values are prevalent in clustering problems. Traditional clustering methods first estimate the missing values by imputation and then apply classical clustering algorithms for complete data, such as K-median and K-means. In practice, however, it is often hard to estimate the missing values accurately, which degrades clustering performance. To enhance the robustness of clustering algorithms, this paper represents the missing values by interval data and introduces the concept of a robust cluster objective function. A minimax robust optimization (RO) formulation is presented to provide clustering results that are insensitive to estimation errors. To solve the proposed RO problem, the authors propose robust K-median and K-means clustering algorithms with low time and space complexity. Comparisons and analysis of experimental results on both artificially generated and real-world incomplete data sets validate the robustness and effectiveness of the proposed algorithms.
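One plausible reading of the minimax idea in this abstract is sketched below: each point is stored as per-feature intervals (an observed feature has equal lower and upper bounds), and a point is assigned to the center that minimizes its worst-case squared distance over those intervals. The abstract does not give the authors' objective or update rules, so this squared-Euclidean form, the function names, and the omission of the center-update step are all assumptions.

```python
def worst_case_sq_dist(lo, hi, center):
    """Largest squared distance from `center` to any point in the box [lo, hi].

    The squared distance is convex and separable per feature, so the maximum
    in each feature is attained at one of the two interval endpoints.
    """
    return sum(max((c - l) ** 2, (c - h) ** 2) for l, h, c in zip(lo, hi, center))

def robust_assign(intervals, centers):
    """Assign each interval-valued point to the center with the smallest worst case."""
    return [
        min(range(len(centers)), key=lambda j: worst_case_sq_dist(lo, hi, centers[j]))
        for lo, hi in intervals
    ]

def robust_objective(intervals, centers, labels):
    """Worst-case clustering cost for a fixed assignment and fixed centers."""
    return sum(
        worst_case_sq_dist(lo, hi, centers[lab])
        for (lo, hi), lab in zip(intervals, labels)
    )
```

For example, a point with an observed first feature of 0 and a missing second feature bounded by [0, 2] would be passed as `((0.0, 0.0), (0.0, 2.0))`.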
ISBN: 9789380544168 (print)
Wireless sensor networks (WSNs) have many applications in military services, health centers, industry, and home surveillance. In such networks, the energy efficiency of nodes and the lifetime of the network are the main concerns. Various clustering approaches are used to optimize the energy consumption of sensor nodes, and clustering also improves the scalability of sensor networks. We review the centralized, distributed, and hybrid clustering approaches used in sensor networks. Recently, much research has been devoted to developing algorithms based on equal and unequal clustering techniques. These techniques use the residual energy of nodes and their distance to the base station as parameters for selecting cluster heads. This paper examines the distributed and hybrid clustering algorithms reported to date by authors actively working in this area. We also briefly discuss the operation of these algorithms and compare them on the basis of various clustering attributes.
Data mining is the process of discovering knowledge from the vast data sources. In Data mining, classification and clustering are the two broad branches of study. In clustering, K-means algorithm is one of the bench m...
ISBN: 9781509035670 (print)
The main goal of clustering algorithms is to organize a given set of data patterns into groups (clusters), and their main strategy is to group patterns based on their similarity. However, some clustering algorithms also require, as an input parameter, either the number of clusters the induced clustering should have or a threshold value that limits the number of induced clusters. Both the number of clusters and the threshold value are often unknown, yet it is well known that the results of clustering tasks can be very sensitive to them. This work presents a method for empirically estimating both values. The method is based on multiple runs of sequential clustering algorithms with increasing threshold values. Results from experiments conducted on several data domains from two repositories, UCI and Keel, as well as on a few artificially created data sets, are presented, and a comparative analysis is carried out as evidence that the method gives good estimates of both values.
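The threshold sweep described in this abstract can be sketched as follows. The sketch assumes the Basic Sequential Algorithmic Scheme (BSAS) as the sequential algorithm and picks the cluster count that occurs for the largest number of thresholds; neither choice is confirmed by the abstract, and the function names are hypothetical.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+

def bsas(points, theta):
    """One sequential pass: a point farther than `theta` from every cluster mean
    starts a new cluster; otherwise it joins the nearest cluster."""
    means, sizes = [], []
    for p in points:
        if means:
            j = min(range(len(means)), key=lambda i: dist(p, means[i]))
            if dist(p, means[j]) <= theta:
                sizes[j] += 1
                # Incremental update of the running mean of cluster j.
                means[j] = tuple(m + (x - m) / sizes[j] for m, x in zip(means[j], p))
                continue
        means.append(tuple(p))
        sizes.append(1)
    return means

def estimate_k(points, thetas):
    """Run BSAS for each threshold and return the most frequent cluster count,
    together with the thresholds that produced it."""
    counts = [len(bsas(points, t)) for t in thetas]
    k, _ = Counter(counts).most_common(1)[0]
    stable = [t for t, c in zip(thetas, counts) if c == k]
    return k, stable
```

The returned list of thresholds gives a range from which a working threshold could be chosen. Because BSAS depends on the order in which points are presented, a more careful version would repeat each run over several shuffled orderings.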
ISBN: 9781509011124 (print)
Data mining is a method for extracting useful information from data, but classical data mining approaches cannot be applied directly to big data because of their sheer complexity. The data generated by numerous scientific applications and integrated environments has recently grown rapidly, not only in size but also in variety. The volume of data collected is very large, which makes big data difficult to collect and assess. Clustering algorithms have developed into a powerful meta-learning tool that can accurately analyze the volume of data produced by modern applications. The main goal of clustering is to categorize data into clusters such that objects are grouped in the same cluster when they are "similar" in their traits and behavior. The most commonly used families of clustering algorithms are partitioning, hierarchical, grid-based, density-based, and model-based algorithms. This paper reviews clustering and its different techniques in data mining with respect to the criteria for big data. Commonly used and effective algorithms such as K-Means, FCM, BIRCH, and CLIQUE are studied and compared from a big data perspective.
The amount and diversity of data produced and processed have increased dramatically in parallel with improvements in technology. Unfortunately, the data produced usually do not have labels, which would make classification and the process of building information easier. This has made data clustering more important for building information. In this work, the K-Means, Spectral Clustering, and Girvan-Newman algorithms are studied and compared on the Breast Cancer Wisconsin Data Set (BCWDS).