This paper aims to build an employee attrition classification model based on the stacking *** algorithm is applied to address the issue of data imbalance and the random forest feature importance ranking method is used to resolve the overfitting problem after data cleaning and ***, different algorithms are used to establish classification models as control experiments, and R-squared indicators are used to ***, the stacking algorithm is used to establish the final classification *** model has practical and significant implications for both human resource management and employee attrition analysis.
Mapping crop distribution using satellite technology is an effective approach for gaining information about food production over broad, regional scales. However, crop classification in high altitude regions from satellite platforms remains challenging, due to the spatial heterogeneity caused by the complex planting patterns. Moreover, the frequent cloud cover makes it difficult to collect time-series imagery for these regions. Thus, this study used a mosaic of single images of Gaofen-6 data to map the crop distribution in high altitude regions of Xining City and Haidong City prefectures of Qinghai Province, China. To improve the accuracy of the crop classification, random forest-recursive feature elimination (RF-RFE) was used to determine an optimal feature subset from existing spectral, texture and topographic features. Then, a two-layer stacking generalization ensemble model, incorporating Random Forest, XGBoost and AdaBoost, was trained. The results reveal that the stacking algorithm outperformed the other single classifiers, with overall accuracy higher than 85% (87.89% for the optimal feature subset and 85.38% for the original spectral band subset). In addition, the user's and producer's accuracies for wheat, rape and maize fields all exceeded 90%. Elevation was the variable with the highest importance score, illustrating its importance in crop classification of high altitude regions. Overall, the framework, combining RF-RFE and a stacking algorithm, can improve the accuracy of the crop classification in high altitude regions.
This study aims to employ advanced machine learning techniques, particularly the stacking ensemble algorithm, to accurately classify thirteen Traditional Chinese Medicine (TCM) syndromes in primary lung cancer patient...
Vegetation Water Content (VWC) and Soil Moisture (SM) are closely related. To overcome the limitations of Global Navigation Satellite System Interferometric Reflectometry (GNSS-IR) for point-based monitoring, this research aimed to develop a synchronous retrieval method for VWC and SM by integrating GNSS-IR and multi-source data. The method incorporates band synthesis, optimized modelling, and a stacking algorithm. The model produced Normalized Microwave Reflectance Index (NMRI) and SM products consistent with MODIS NDVI, LAI, and NASA-USDA SM data in both time and space. The study provides a novel approach for the synchronous retrieval of VWC and SM and facilitates in-depth exploration of their interaction mechanisms.
ISBN: (digital) 9781510652118; (print) 9781510652118; 9781510652101
With the gradual advancement of modernization, electric energy plays an increasingly important role in social construction and residents' lives. To protect the economic interests of both the electric power department and consumers, regular and accurate measurement and detection of electric energy in the substation is particularly important. This paper presents an electric energy metering and detection algorithm based on a stacking algorithm.
Dissolved gas analysis is an important way to diagnose transformer faults. Compared with building a single artificial-intelligence classifier for diagnosis, ensemble learning (EL) can combine multiple classifiers to achieve stronger generalization ability and better diagnostic performance. However, traditional EL is a homogeneous ensemble in which the base learners all use the same algorithm, so it lacks diversity among the base learners as well as a systematic combination strategy. To address this problem, the paper applies the stacking ensemble strategy to fault diagnosis. Multilayer perceptron, k-nearest neighbor, decision tree and support vector machine are used as component learners, and the random forest algorithm is used as the combination strategy to establish a stacking diagnosis model. In addition, homogeneous ensemble methods are applied to the above four algorithms. In the method, the contents of five characteristic gases are taken as the input feature parameters. Primary diagnostic results are obtained from each base classifier. The random forest meta-learner then takes the primary diagnostic outputs of the base classifiers as its input and performs a secondary diagnosis to produce the final diagnosis. The experimental results show that the ensemble of multiple heterogeneous component learners can enhance the generalization ability of the model, and that its diagnostic accuracy is better than that of a single classifier and of the homogeneous ensemble classifiers. (c) 2020 Institute of Electrical Engineers of Japan. Published by Wiley Periodicals LLC.
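The two-stage diagnosis described in this abstract can be sketched with scikit-learn's `StackingClassifier`. This is only an illustration under stated assumptions: the data are synthetic stand-ins for the five gas contents, the three fault classes and every hyperparameter are invented for the example, and nothing here reproduces the authors' actual model or results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic data with five features standing in for the five
# characteristic gas contents, and three hypothetical fault classes.
X, y = make_classification(
    n_samples=400, n_features=5, n_informative=4, n_redundant=1,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Heterogeneous component learners named in the abstract. Scaling is added
# for the distance- and gradient-based models (an assumption of this sketch).
base_learners = [
    ("mlp", make_pipeline(StandardScaler(),
                          MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                                        random_state=0))),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
    ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC())),
]

# The random forest meta-learner receives the base learners' cross-validated
# outputs (the "primary diagnosis") and produces the final diagnosis.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=RandomForestClassifier(n_estimators=100, random_state=0),
    cv=5,
)
stack.fit(X_train, y_train)
accuracy = stack.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

With `cv=5`, the meta-learner is trained on predictions the base learners made for samples they did not see during fitting, which is what keeps the second stage from simply memorizing first-stage training error.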
A lithology identification method based on stacking multi-model fusion was studied, which solved the problem of poor recognition performance of traditional single machine learning models. In the experiment, logging data underwent preprocessing using outlier and linear analysis. Nine data features were filtered to identify valid features. Classification and regression tree, K-nearest neighbour algorithm, random forest, and extreme gradient boosting were used as base models. Principal component analysis calculated the weights of each model and applied them to the light gradient boosting machine metamodel in the second layer, constructing a multi-layer ensemble learning model. The fusion model improved the F1-score by 1.63 percentage points compared to random forest. In the siltstone with the best average recognition performance, the improvement was 9.24 percentage points over the K-nearest neighbour algorithm. These results verify the higher accuracy and F1-score of the fusion model as compared to traditional single algorithms, demonstrating the effectiveness of the fusion model method.
Labyrinth weirs are utilized to transport a greater discharge during floods in contrast to conventional weirs due to their increased weir crest length. Nevertheless, due to the increased geometric complexity of labyrinth weirs, determination of accurate discharge coefficients and accordingly, head-discharge ratings are quite essential issues in practical application. Hence, as a first step the present study proposes the following eight standalone algorithms: decision table (DT), Kstar, least median square (LMS), M5 prime (M5P), M5 rule (M5R), pace regression (PR), random forest (RF) and sequential minimal optimization (SMO). Then, applying the stacking (ST) algorithm, these standalone models were hybridized to predict the discharge coefficient (C-d) for sharp-crested labyrinth weirs. Potential/effective variables were constructed in the form of several independent dimensionless parameters (i.e., theta, h/W, L/B, L/h, Froude number (Fr), B/W and L/W) to predict C-d as an output. The accuracy of the developed models was examined in terms of different statistical visually based and quantitative-based error measurement criteria. The results illustrate that h/W and B/W parameters have the highest and lowest effect on the C-d prediction, respectively. According to NSE, all developed algorithms provided accurate performances, while ST-Kstar had the highest prediction power.
Determining appropriate process parameters in large-scale laser powder bed fusion (LPBF) additive manufacturing poses formidable challenges that necessitate advanced approaches to minimize trial-and-error during *** work proposed a data-driven approach based on stacking ensemble learning to predict the mechanical properties of Ti6Al4V alloy fabricated by large-scale LPBF for the first *** method can adapt to the complexity of large-scale LPBF data distribution and exhibits a more generalized predictive capability compared to base ***, the stacking model utilized artificial neural network (ANN), gradient boosting regressor, kernel ridge regression, and elastic net as base models, with the Lasso model serving as the *** optimization and cross-validation were utilized for model optimization and training based on a limited data set, resulting in higher predictive accuracy compared to traditional artificial neural network *** statistical analysis of the ANN and stacking models indicates that the stacking model exhibits superior performance on the test set, with a coefficient of determination value of 0.944, mean absolute percentage error of 2.51%, and root mean squared error of 27.64, surpassing that of the ANN *** statistical metrics demonstrate superiority over those obtained from the ANN *** results confirm that by integrating the base models, the stacking model exhibits superior predictive stability compared to individual base models alone, thereby providing a reliable assessment approach for predicting the mechanical properties of metal parts fabricated by the LPBF process.
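A regression stack with the four base models named above might look like the following sketch. It assumes the Lasso model plays the meta-learner role, which the truncated abstract suggests but does not state in full. The data are synthetic, all hyperparameters are fixed by hand rather than tuned, and the printed metrics have no relation to the values reported in the abstract.

```python
import math

from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, StackingRegressor
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import ElasticNet, Lasso
from sklearn.metrics import (mean_absolute_percentage_error,
                             mean_squared_error, r2_score)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: four synthetic inputs stand in for process parameters, and the
# target stands in for one mechanical property. No real LPBF data is used.
X, y = make_regression(n_samples=200, n_features=4, noise=10.0, bias=900.0,
                       random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

base_models = [
    ("ann", make_pipeline(StandardScaler(),
                          MLPRegressor(hidden_layer_sizes=(32,), max_iter=5000,
                                       random_state=0))),
    ("gbr", GradientBoostingRegressor(random_state=0)),
    ("krr", make_pipeline(StandardScaler(), KernelRidge(kernel="rbf", alpha=1.0))),
    ("enet", ElasticNet(alpha=0.1, l1_ratio=0.5)),
]

# Lasso combines the cross-validated base predictions; its L1 penalty can
# shrink the weight of an unhelpful base model toward zero.
stack = StackingRegressor(
    estimators=base_models,
    final_estimator=Lasso(alpha=0.1, max_iter=10000),
    cv=5,
)
stack.fit(X_train, y_train)
pred = stack.predict(X_test)

# The same three metric types the abstract reports.
r2 = r2_score(y_test, pred)
mape = mean_absolute_percentage_error(y_test, pred)
rmse = math.sqrt(mean_squared_error(y_test, pred))
print(f"R2={r2:.3f}  MAPE={mape:.2%}  RMSE={rmse:.2f}")
```

After fitting, `stack.final_estimator_.coef_` holds one Lasso coefficient per base model, which gives a rough view of how much each one contributes to the combined prediction.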
The stock price index is an essential component of financial systems and indicates economic performance at the national level. Even a small improvement in its forecasting performance would be highly profitable and meaningful. This manuscript feeds technical features together with macroeconomic indicators into an improved stacking framework for predicting the direction of the stock price index relative to the price prevailing some time earlier, for example a month. Random forest (RF), extremely randomized trees (ERT), extreme gradient boosting (XGBoost) and light gradient boosting machine (LightGBM), which are tree-based algorithms, and recurrent neural networks (RNN), bidirectional RNN, and RNN with long short-term memory (LSTM) and gated recurrent unit (GRU) layers, which are deep learning algorithms, are stacked as base classifiers in the first layer. A cross-validation method is then used to iteratively generate the input for the second-level classifier in order to prevent overfitting. In the second layer, logistic regression and its regularized version are employed as meta-classifiers to identify the unique learning patterns of the base classifiers. Empirical results over three major U.S. stock indices indicate that our improved stacking method outperforms state-of-the-art ensemble learning algorithms and deep learning models, achieving higher accuracy, F-score and AUC values. Another contribution of our research is the design of a Lasso (least absolute shrinkage and selection operator) based meta-classifier that is capable of automatically weighting and selecting the optimal base learners for the forecasting task. Our findings provide an integrated stacking framework for the financial area. (C) 2019 Elsevier B.V. All rights reserved.
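The cross-validation step that generates the second-level input can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the two toy base classifiers (nearest centroid and 1-nearest-neighbour) replace the tree-based and recurrent models of the abstract, the data are random Gaussian points rather than index features, and the meta-classifier is a plain unregularized logistic regression. Contiguous folds are used here for simplicity; a real time-series setup would also need to keep future observations out of the training folds.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_fold_indices(n_samples, n_folds):
    """Split range(n_samples) into n_folds contiguous, near-equal folds."""
    folds, start = [], 0
    for i in range(n_folds):
        size = n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_centroid(X_train, y_train):
    """Toy base classifier: predict the class whose mean vector is closest."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return lambda x: min((0, 1), key=lambda label: sq_dist(x, centroids[label]))


def fit_nearest_neighbour(X_train, y_train):
    """Toy base classifier: copy the label of the closest training point."""
    pairs = list(zip(X_train, y_train))
    return lambda x: min(pairs, key=lambda pair: sq_dist(x, pair[0]))[1]


def out_of_fold_predictions(fit, X, y, n_folds=5):
    """One prediction per sample, each from a model that never saw that sample."""
    oof = [None] * len(X)
    for test_idx in k_fold_indices(len(X), n_folds):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in test_idx:
            oof[i] = predict(X[i])
    return oof


def fit_logistic(features, labels, lr=0.1, epochs=500):
    """Meta-classifier: logistic regression trained by stochastic gradient descent."""
    w, b = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(features, labels):
            p = 1.0 / (1.0 + math.exp(-(b + sum(wi * xi for wi, xi in zip(w, x)))))
            w = [wi - lr * (p - t) * xi for wi, xi in zip(w, x)]
            b -= lr * (p - t)
    return w, b


# Synthetic two-class data: 30 points per class, class 1 shifted by 2 on each axis.
random.seed(0)
data = [([random.gauss(2.0 * label, 1.0), random.gauss(2.0 * label, 1.0)], label)
        for label in (0, 1) for _ in range(30)]
random.shuffle(data)
X = [x for x, _ in data]
y = [t for _, t in data]

# First layer: each base classifier contributes one out-of-fold column.
oof_centroid = out_of_fold_predictions(fit_centroid, X, y)
oof_knn = out_of_fold_predictions(fit_nearest_neighbour, X, y)
meta_features = [list(row) for row in zip(oof_centroid, oof_knn)]

# Second layer: the meta-classifier learns how much to trust each base classifier.
weights, bias = fit_logistic(meta_features, y)
print("meta-classifier weights:", weights, "bias:", bias)
```

Because every entry of `meta_features` comes from a model fitted without that sample, the meta-classifier sees base-learner behaviour on unseen data, which is the overfitting safeguard the abstract refers to.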