We propose a method for estimating nonlinear prediction risk using a bagging algorithm that involves ensemble learning. First we estimate the probability distribution of a future state as the ensemble set obtained usi...
详细信息
We propose a method for estimating nonlinear prediction risk using a bagging algorithm that involves ensemble learning. First we estimate the probability distribution of a future state as the ensemble set obtained using bagging predictors, and consider its standard deviation as the prediction risk. We can then improve the prediction reliability by avoiding dangerous predictions if the estimated prediction risk is high. As an application of this risk reduction method, we improve the power of surrogate data tests for system identification. Low prediction accuracy and poor system identification are caused by short and noisy data, so we perform simulations using short data derived from noisy chaotic models and real systems to confirm the validity of our method. (C) 2013 Elsevier B.V. All rights reserved.
To predict the availability state of a node in a distribution network, its history trace is usually used. Sometimes, some usage behavior patterns cannot be captured precisely from the insufficient trace, which may lea...
详细信息
ISBN:
(纸本)9781479903566
To predict the availability state of a node in a distribution network, its history trace is usually used. Sometimes, some usage behavior patterns cannot be captured precisely from the insufficient trace, which may lead to unreliable predictors. In this paper, to alleviate the data sparseness problem, the nodes with the similar behaviors are clustered, and all history information in a same cluster is seen as another information source for any node in it. For each node, an N-gram model is used to train the predictor by the combination of the new source and the node's own trace. In addition, because it is hard to capture the trace of all nodes in large scale networks, such as P2P networks, a bagging based prediction algorithm is proposed, which can be applied in the distribution environment and relieve the effect of the noisy data. In our experiments, three datasets are evaluated. Results show that the prediction performance of our cluster based N-gram predictor is better than the results of several other predictors. And the bagging based prediction algorithm presents its validity in the distribution environment.
The changes occurring in the dynamics of sugar concentration in grape berries are fairly significant during maturation, whereby they are commonly used as a marker of their development. In view of the importance this p...
详细信息
The changes occurring in the dynamics of sugar concentration in grape berries are fairly significant during maturation, whereby they are commonly used as a marker of their development. In view of the importance this parameter has for wine producers, this paper designs several models for predicting the must's probable alcohol level using both meteorological variables and those specific to the vineyard. Presentation is made of a comparative analysis of learning and meta-learning algorithms for the selection of variables and the design of useful predictive models for estimating this level. The models are designed according to data gathered at different locations within the Rioja Qualified Designation of Origin (DOC Rioja, Spain) under different climate conditions, as well as involving different grape varieties. The models designed in this study provide very good results, and following their validation by experts, they have been proven to make a major contribution to decision-making in vine growing. Finally, considering the indices of analysis studied, it has been observed that the ensemble-type model based on the bagging algorithm with REPTree decision trees records the best results, with a root mean squared error (RMSE) of 8.1% and a correlation of 84.9%. (C) 2011 Elsevier B.V. All rights reserved.
bagging algorithm has been proven to be effective when dealing with on different classification problems. However, the success of bagging depends strongly on the diversity level reached by the individual classifiers o...
详细信息
ISBN:
(纸本)9781424496365
bagging algorithm has been proven to be effective when dealing with on different classification problems. However, the success of bagging depends strongly on the diversity level reached by the individual classifiers of the ensemble models. Diversity in ensemble can be obtained when the individual classifiers are built using different circumstances, such as parameter settings, training datasets and learning algorithms. This paper presents a new approach which combines these three different ways to obtain high diversity in bagging models, aiming, as a consequence, to obtain high levels of accuracy for the ensembles. In the proposed approach, in order to obtain the optimal configurations of features and classifiers in bagging models, we have applied an evolutionary approach composed of two genetic algorithm instances. In order to validate the proposed approach, experiments involving 10 classification algorithms have been conducted, applying the resulting bagging structures in 5 pattern classification datasets taken from the UCI repository. In addition, we analyze the performance of the resulting bagging structures in terms of two recently proposed diversity measures, referred to as good and bad.
Based on imbalanced data, the predictive models for 5-year survivability of breast cancer using decision tree are proposed. After data preprocessing from SEER breast cancer datasets, it is obviously that the category ...
详细信息
ISBN:
(纸本)9781424429011
Based on imbalanced data, the predictive models for 5-year survivability of breast cancer using decision tree are proposed. After data preprocessing from SEER breast cancer datasets, it is obviously that the category of data distribution is imbalanced. Under-sampling is taken to make up the disadvantage of the performance of models caused by the imbalanced data. The performance of the models is evaluated by AUC under ROC curve, accuracy, specificity and sensitivity with 10-fold stratified cross-validation. The performance of models is best while the distribution of data is approximately equal. bagging algorithm is used to build an integration decision tree model for predicting breast cancer survivability.
This paper presents a new method of eye state recognition. Firstly, it uses NTU as the input eigenvalue, which is picked up from texture character of eye images. RBF neural network is used as classifier. In order to i...
详细信息
This paper presents a new method of eye state recognition. Firstly, it uses NTU as the input eigenvalue, which is picked up from texture character of eye images. RBF neural network is used as classifier. In order to improve the precision of the RBF neural network models, bagging algorithm is used to build an integration neural network model for eye state recognition. Some experiments to make sure that the method works effective are performed.
A neural network (NN) ensemble is a very successful technique where the outputs of a set of separately trained NNs are combined to form one unified prediction. An effective ensemble should consist of a set of networks...
详细信息
A neural network (NN) ensemble is a very successful technique where the outputs of a set of separately trained NNs are combined to form one unified prediction. An effective ensemble should consist of a set of networks that are not only highly correct, but ones that make their errors on different parts of the input space as well;however, most existing techniques only indirectly address the problem of creating such a set. We present an algorithm called ADDEMUP that uses genetic algorithms to search explicitly for a highly diverse set of accurate trained networks. ADDEMUP works by first creating an initial population, then uses genetic operators to create new networks continually, keeping the set of networks that are highly accurate while disagreeing with each other as much as possible. Experiments on four real-world domains show that ADDEMUP is able to generate a set of trained networks that is more accurate than several existing ensemble approaches. Experiments also show ADDEMUP is able to incorporate prior knowledge effectively, if available, to improve the quality of its ensemble.
暂无评论