This paper presents a new method for preprocessing datasets and an efficient way of assigning data points to clusters. The improved method avoids the k-means algorithm's sensitivity to the initial centroids and the resulting instability, and it avoids a large number of iterations, which saves running time. Experimental results show that the improved method effectively improves clustering speed and accuracy and reduces the computational complexity of k-means.
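For context, the baseline that such improvements start from can be sketched as follows. This is standard k-means (Lloyd's iteration), not the paper's improved method, whose preprocessing and assignment steps the abstract does not describe. The function names `assign`, `update` and `kmeans` are illustrative assumptions.

```python
import math


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of its members (unchanged if it has none)."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if members:
            new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
        else:
            new_centroids.append(tuple(old))
    return new_centroids


def kmeans(points, initial_centroids, max_iter=100):
    """Alternate assignment and update until the centroids stop moving."""
    centroids = [tuple(c) for c in initial_centroids]
    for iteration in range(1, max_iter + 1):
        labels = assign(points, centroids)
        new_centroids = update(points, labels, centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels, iteration


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centroids, labels, iterations = kmeans(points, [(0, 0), (10, 10)])
print(centroids, labels, iterations)
```

The result and the number of iterations depend on `initial_centroids`, which is the sensitivity the abstract refers to.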
ISBN: 9781509050352 (print)
With the gradual application of electronic transformers in intelligent substations, the influence of typical disturbance sources on electronic transformers is gradually attracting the industry's attention. To ensure the accuracy and reliability of electronic transformers, their error must be determined accurately by studying the factors that influence their measurement error. To this end, this paper proposes a transformer error prediction method that combines RBF neural network prediction with the k-means clustering algorithm. First, the historical data are classified by k-means clustering. Then, the prediction model is trained on the clustered historical data, with influencing factors such as temperature, load capacity and magnetic field as the independent variables and the error as the prediction target. The prediction model is suitable for analyzing large amounts of data and can accurately predict the error. When the model is used to analyze the long-term experimental data of Dongshan substation in 2016, the relative error of the predicted error is around 5%, which is relatively small and verifies the validity of the model.
ISBN: 9781538636749 (print)
Intrusion detection is a technology that effectively mitigates many of the risks of network intrusion, and the k-means algorithm is widely used for it. However, the algorithm has some shortcomings, such as random selection of the k value, sensitivity to the choice of initial cluster centers, and low accuracy when clustering high-dimensional data. To make up for these shortcomings, this paper proposes an AE-kmeans architecture that combines an autoencoder with an improved k-means algorithm. The architecture uses the autoencoder for dimension reduction and feature extraction on the original data, and uses the improved k-means algorithm to cluster the processed data. The improvements to k-means cover two aspects: first, a new method is introduced to select the initial cluster centers; second, the algorithm calculates the weight of each attribute from its coefficient of variation and then uses these weights in the Euclidean distance formula, resulting in a weighted Euclidean distance. Finally, the AE-kmeans architecture is evaluated in intrusion detection simulation experiments on the KDD CUP99 data set. The experimental results show that, compared with the k-means algorithm, the AE-kmeans architecture not only handles high-dimensional data better but also improves the detection rate and reduces the error-detection rate.
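The attribute weighting described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract does not say how the coefficients of variation are turned into weights, so normalising them to sum to 1 is an assumption, as are the function names `cv_weights` and `weighted_euclidean`.

```python
import math
import statistics


def cv_weights(rows):
    """Weight each attribute by its coefficient of variation (std / |mean|).

    Assumption: weights are the CVs normalised to sum to 1. If every CV is
    zero, the function falls back to uniform weights.
    """
    cvs = []
    for column in zip(*rows):
        mean = statistics.fmean(column)
        std = statistics.pstdev(column)
        cvs.append(std / abs(mean) if mean != 0 else 0.0)
    total = sum(cvs)
    if total == 0:
        return [1.0 / len(cvs)] * len(cvs)
    return [cv / total for cv in cvs]


def weighted_euclidean(x, y, weights):
    """Euclidean distance with a per-attribute weight on each squared difference."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


# Toy data: the first attribute varies, the second is constant.
rows = [(1.0, 10.0), (3.0, 10.0)]
weights = cv_weights(rows)
print(weights)
print(weighted_euclidean((1.0, 10.0), (3.0, 20.0), weights))
```

With this weighting, an attribute that barely varies relative to its mean contributes little to the distance, so it has less influence on which cluster a record is assigned to.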
ISBN: 9781509032709 (print)
This paper offers a hybrid short-term load forecasting (STLF) model using a Bayesian neural network (BNN) with a pre-processing stage consisting of a k-means clustering algorithm and time series analysis. The data clusters are analyzed as time series to provide the most accurate data sets for each hour of the day. The final forecast is taken from the BNN output. California load data is used to determine the accuracy and processing speed of the proposed method. Additionally, the BNN is compared with other intelligent algorithms using the same pre-processing stage to further benchmark performance.
According to the New York Times, 5.6 million people in the United States are paralyzed to some degree. Motivated by the needs of these patients in controlling assistive devices that support their mobility, we present a novel EEG-based BCI system composed of an Emotiv EPOC neuroheadset, a laptop and a Lego Mindstorms NXT robot. We provide online learning algorithms, consisting of k-means clustering and principal component analysis, to classify the signals from the headset into corresponding action commands. We also discuss how to integrate the Emotiv EPOC headset and the Lego robot into the system. Finally, we evaluate the proposed online learning algorithms in terms of precision, recall, and the F-measure, and our results show that the algorithms can accurately classify the subjects' thoughts into corresponding action commands.
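The evaluation measures named above can be computed per command class as in the following sketch. The command labels (`"left"`, `"right"`) and the function name `per_class_scores` are hypothetical, and the F-measure is taken to be the balanced F1 score, which the abstract does not specify.

```python
def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that occurs."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical intended commands versus commands produced by the classifier.
y_true = ["left", "left", "right", "right"]
y_pred = ["left", "right", "right", "right"]
for label, (precision, recall, f1) in per_class_scores(y_true, y_pred).items():
    print(label, round(precision, 3), round(recall, 3), round(f1, 3))
```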
Organizations implement management systems to achieve better performance in different functions. One important area of management systems is energy management. ISO 50001:2011 proposed a systematic framework for operating an effective energy management system (EnMS). Resources should be used efficiently to obtain suitable performance, so it is useful to evaluate resource utilization by considering the importance and efficiency of resources in processes. In this study, the major functions of an EnMS are identified in accordance with ISO 50001. Then, for each function, processes and resources are specified and classified. Next, data are gathered and appropriate measures are calculated. Finally, data analysis techniques are applied to propose strategic decisions for improvement. An oil refinery is considered as a case study.
In this article, a new approach combining a decision tree with clustering is presented to predict the transmission of genetic diseases. The performance of the algorithms is compared under the same conditions, with the aim of more accurate prediction of disease transmission, using a series of measures: positive predictive value, negative predictive value, accuracy, sensitivity and specificity. The results show that the support vector machine algorithm outperformed the other two simple algorithms, and that the neural network and genetic algorithms ultimately offered better prediction, while the proposed combined approach, developed using different parameters, outperformed the simple methods.
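The five measures listed above all follow from the four cells of a binary confusion matrix. The sketch below uses the standard definitions. The function name `diagnostic_measures` and the example counts are illustrative assumptions, not values from the article.

```python
def diagnostic_measures(tp, fp, tn, fn):
    """Compute standard binary-classification measures from confusion-matrix counts."""

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }


# Made-up counts for illustration only.
for name, value in diagnostic_measures(tp=40, fp=10, tn=45, fn=5).items():
    print(f"{name}: {value:.3f}")
```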
The maximal information coefficient (MIC), a measure of dependence for two-variable relationships, can be used to discover relationships between two variables in big data. This paper proposes a new mathematical programming model for calculating the value of MIC, and designs a corresponding efficient algorithm to solve the model in a big data environment. To illustrate its validity, the proposed algorithm is applied to the analysis of railway accident data. Experimental results show that the algorithm can find important relationships between two variables in big data, and that it identifies some of the factors influencing accidents from among many candidates. In addition, compared with the algorithm proposed by Reshef et al. in 2011, the proposed algorithm has lower time complexity and needs less computation time. Hence it is more suitable for a big data environment.
In this paper, we propose a new fuzzy time series forecasting method for forecasting the Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX) based on fuzzy time series, fuzzy logical relationships, particle swarm optimization techniques, the k-means clustering algorithm, and similarity measures between the subscript of the fuzzy set of the fuzzified historical testing datum on the previous trading day and the subscripts of the fuzzy sets appearing in the current states of the fuzzy logical relationships in the chosen fuzzy logical relationship group. The particle swarm optimization techniques are used to get the optimal partition of the intervals in the universe of discourse. The k-means clustering algorithm is used to cluster the subscripts of the fuzzy sets of the current states of the fuzzy logical relationships, which gives the cluster center of each cluster and divides the constructed fuzzy logical relationships into fuzzy logical relationship groups. The experimental results show that the proposed fuzzy forecasting method achieves higher forecasting accuracy rates than the existing methods. The advantage of the proposed method is that these two steps, the optimized interval partition and the clustering of the relationships into groups, together increase the forecasting accuracy rates. (C) 2015 Elsevier Inc. All rights reserved.
Recently, a coding scheme called vector of locally aggregated descriptors (VLAD) has had great success in large-scale image retrieval thanks to its efficient, compact representation. VLAD uses only the nearest-neighbor visual word in the dictionary to aggregate each descriptor feature. It offers fast retrieval and high retrieval accuracy with a small dictionary. In this paper, we give three improved VLAD variants for image classification: first, similar to the bag of words (BoW) model, we count the number of descriptors belonging to each cluster center and add it to VLAD; second, to expand the impact of the residuals, squared residuals are taken into account; third, instead of using the single nearest-neighbor visual word, we look for the two nearest-neighbor visual words when aggregating each descriptor. Experimental results on the UIUC Sports Event, Corel 10 and 15 Scenes datasets show that the proposed methods outperform some state-of-the-art coding schemes in terms of classification accuracy and computation speed.
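A minimal sketch of VLAD aggregation with the first two variants may make the description concrete. The abstract does not say how the counts and squared residuals are combined with the base vector, so appending them is an assumption, as are the function name `vlad` and its flags. Normalisation and the two-nearest-word variant are omitted.

```python
import math


def vlad(descriptors, codebook, with_counts=False, with_squares=False):
    """Aggregate descriptor residuals per nearest visual word into one flat vector."""
    k, dim = len(codebook), len(codebook[0])
    residual = [[0.0] * dim for _ in range(k)]
    squared = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for x in descriptors:
        # Hard assignment: index of the nearest visual word.
        j = min(range(k), key=lambda i: math.dist(x, codebook[i]))
        counts[j] += 1
        for t in range(dim):
            r = x[t] - codebook[j][t]
            residual[j][t] += r
            squared[j][t] += r * r
    vector = [v for row in residual for v in row]
    if with_squares:
        # Assumed combination: append the per-word sums of squared residuals.
        vector += [v for row in squared for v in row]
    if with_counts:
        # Assumed combination: append BoW-style descriptor counts per word.
        vector += [float(c) for c in counts]
    return vector


codebook = [(0, 0), (10, 10)]
descriptors = [(1, 0), (0, 2), (9, 10)]
print(vlad(descriptors, codebook))
print(vlad(descriptors, codebook, with_counts=True, with_squares=True))
```

The base vector has `k * dim` entries. The squared-residual block adds another `k * dim`, and the counts add `k`.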