The classification of very high-resolution satellite imagery remains a focal point in remote sensing, attracting increased attention across diverse scientific disciplines. Various classification methods, including pixel- and object-based techniques, have been proposed, and their performances and limitations have been discussed in the literature. This paper presents a hybrid method that combines the strengths of pixel- and object-based methods in image classification to minimize errors associated with the segmentation process, particularly under-segmentation errors in object-based image analysis. The core concept behind the method lies in categorizing segmented image objects as either homogeneous or heterogeneous based on their class probability. In this process, the probabilities estimated by the object-based classification model are considered, and segments are designated as homogeneous or heterogeneous using a user-defined threshold. The object-based classification model determines the class labels for homogeneous image objects, while the heterogeneous ones, containing pixels representing different land cover classes, are classified using the pixel-based model. The performance of hybrid classification models, created by varying the threshold, is analysed using high-resolution WorldView-3 and WorldView-2 imagery and compared with pixel- and object-based classification results. For the implementation of the image classification methods, Canonical Correlation Forest (CCF), Extreme Gradient Boosting (XGBoost), and Categorical Boosting (CatBoost) were employed. The findings indicated that employing the suggested hybrid strategy with a threshold value selected within a specific range (e.g. between 60% and 80%) and a robust classification algorithm that provides class probabilities (e.g. CCF) results in a statistically significant improvement in overall accuracy compared to the pixel- and object-based methods, with gains of 5% and 4%, respectively. Visual analysis of the
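A minimal sketch of the routing rule this abstract describes, assuming scikit-learn-style fitted classifiers for the object- and pixel-based models (CCF has no standard Python implementation, so any model exposing predict_proba can stand in); the array names are hypothetical:

```python
import numpy as np

def hybrid_classify(object_model, pixel_model, segment_features,
                    pixel_features, pixel_segment_ids, threshold=0.7):
    """Label every pixel; fall back to the pixel-based model inside
    segments whose top class probability is below the threshold."""
    probs = object_model.predict_proba(segment_features)  # (n_segments, n_classes)
    seg_labels = probs.argmax(axis=1)
    homogeneous = probs.max(axis=1) >= threshold          # user-defined threshold

    # Broadcast each segment's object-based label to its pixels.
    pixel_labels = seg_labels[pixel_segment_ids]

    # Pixels of heterogeneous segments are re-classified individually.
    hetero = ~homogeneous[pixel_segment_ids]
    if hetero.any():
        pixel_labels[hetero] = pixel_model.predict(pixel_features[hetero])
    return pixel_labels
```

Following the reported sweet spot, `threshold` would be set between 0.6 and 0.8.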
This study addresses a gap in research on predictive models for postpartum dyslipidemia in women with gestational diabetes mellitus (GDM). The goal was to develop a machine learning-based model to predict postpartum dyslipidemia using early pregnancy clinical data, and the model's robustness was evaluated through both internal and temporal validation. Clinical data from 15,946 pregnant women were utilized. After cleaning, the data were divided into two sets: Dataset A (n = 1,116), used for training and evaluating the model, and Dataset B (n = 707), used for temporal validation. Several machine learning algorithms were applied; model performance was assessed on Dataset A, while Dataset B was used to validate the model across a different time period. Feature significance was evaluated through Information Value (IV), model importance analysis, and SHAP (SHapley Additive exPlanations) analysis. The results showed that among the five machine learning algorithms tested, tree-based ensemble models such as XGBoost, LightGBM, and Random Forest outperformed the others in predicting postpartum dyslipidemia. In Dataset A, these models achieved accuracies of 70.54%, 70.54%, and 69.64%, respectively, with AUC-ROC values of 73.10%, 71.94%, and 76.14%. Temporal validation with Dataset B indicated that XGBoost performed best, achieving an accuracy of 81.05% and an AUC-ROC of 87.92%. The predictive power of the model was strengthened by key variables such as total cholesterol, fasting glucose, triglycerides, and BMI, with total cholesterol identified as the most important feature. Further IV and SHAP analyses confirmed the pivotal role of these variables in predicting dyslipidemia. The study concluded that the XGBoost-based predictive model for postpartum dyslipidemia in GDM showed strong and consistent performance in both internal and temporal validations. By introducing new variables, the model can identify high-risk groups during early pregnancy, supporting early intervention.
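As an illustration of the kind of pipeline described, a hedged sketch that trains XGBoost, scores AUC-ROC, and ranks features with SHAP; the synthetic arrays and the four feature names are placeholders, not the study's data:

```python
import numpy as np
import shap
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1116, 4))            # stand-in for Dataset A
y = rng.integers(0, 2, size=1116)         # 1 = postpartum dyslipidemia
features = ["total_cholesterol", "fasting_glucose", "triglycerides", "BMI"]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = XGBClassifier(n_estimators=300, max_depth=4, eval_metric="logloss")
model.fit(X_tr, y_tr)
print("AUC-ROC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))

# SHAP: mean |SHAP value| per feature approximates global importance.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_te)
for name, imp in zip(features, np.abs(shap_values).mean(axis=0)):
    print(name, round(float(imp), 4))
```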
The existing strategy for evaluating the damage condition of structures mostly relies on feedback supplied by traditional visual methods, which may result in an unreliable damage characterization due to inspector subjectivity or an insufficient level of expertise. As a result, a robust, reliable, and repeatable method of damage identification is required. Ensemble learning algorithms for identifying structural damage are evaluated in this article. They use deep convolutional neural networks and include simple averaging, integrated stacking, separate stacking, and a hybrid weighted averaging ensemble with differential evolution (WAE-DE). Damage identification is carried out on three types of damage, and the proposed algorithms are used to analyze the damage of 4585 structural images. The effectiveness of the ensemble learning techniques is evaluated using the confusion matrix. On the testing dataset, the best model (WAE-DE) achieved an accuracy of 94 percent and a minimum recall of 92 percent in distinguishing damage types as flexural, shear, combined, or undamaged.
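The weighted averaging ensemble with differential evolution (WAE-DE) idea can be sketched as a search for blending weights over the member networks' softmax outputs that maximize accuracy. The random "predictions" below stand in for real CNN outputs, and the four classes mirror the flexural/shear/combined/undamaged labels:

```python
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(1)
n_models, n_samples, n_classes = 3, 200, 4
# Placeholder softmax outputs of three trained CNNs on a validation set.
preds = rng.dirichlet(np.ones(n_classes), size=(n_models, n_samples))
y_true = rng.integers(0, n_classes, size=n_samples)

def neg_accuracy(w):
    w = np.abs(w) + 1e-12
    w /= w.sum()                              # normalise to a convex blend
    blended = np.tensordot(w, preds, axes=1)  # weighted average of softmaxes
    return -(blended.argmax(axis=1) == y_true).mean()

result = differential_evolution(neg_accuracy, bounds=[(0, 1)] * n_models, seed=1)
best_w = np.abs(result.x) + 1e-12
print("blend weights:", best_w / best_w.sum())
```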
Because the relationship between the compressive strength of high-performance concrete (HPC) and its composition is highly nonlinear, more advanced regression methods are needed to obtain better results. Super learner models, built on several ensemble methods, including random forest regression (RFR), adaptive boosting (AdaBoost), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and categorical gradient boosting (CatBoost), are used to solve this complicated problem. A grid search is employed to determine the best set of hyper-parameters for each ensemble algorithm. Two super learner models, which combine all six models or select the top three effective ones as the base learners, are then proposed to provide an accurate approach for estimating the compressive strength of HPC. The results on four popular datasets show significant improvement by the proposed super learner models in terms of prediction accuracy. They also reveal that the trained models consistently outperform other methods: their errors (MAE, MSE, RMSE) are much lower and their R² values higher than those of previous studies. The proposed super learner models can serve as a reliable tool for mixture design optimization of HPC.
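A minimal super-learner sketch using scikit-learn's StackingRegressor, with a per-base-learner grid search as the abstract describes; only two of the six boosters are shown, and the regression data is a synthetic stand-in for the HPC mixture datasets:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              StackingRegressor)
from sklearn.linear_model import RidgeCV
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Grid search per base learner, as in the paper's hyper-parameter step.
rf = GridSearchCV(RandomForestRegressor(random_state=0),
                  {"n_estimators": [200, 500], "max_depth": [None, 10]}, cv=3)
gbm = GridSearchCV(GradientBoostingRegressor(random_state=0),
                   {"learning_rate": [0.05, 0.1], "n_estimators": [200, 500]}, cv=3)

# The meta-learner blends out-of-fold predictions from the tuned bases.
super_learner = StackingRegressor(estimators=[("rf", rf), ("gbm", gbm)],
                                  final_estimator=RidgeCV())
super_learner.fit(X_tr, y_tr)
print("MAE:", mean_absolute_error(y_te, super_learner.predict(X_te)))
```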
This research focuses on the development and evaluation of heavy density concrete (HDC) for radiation shielding, utilizing both experimental and machine learning techniques. Various HDC specimens with different proportions (25%, 50%, 75%, and 100%) of grit iron aggregate replacing normal-weight aggregate, in addition to control specimens with no grit iron scale aggregate, were cast. These samples were tested at temperatures ranging from room temperature to 1200 °C to assess properties such as compressive strength, rebound number, ultrasonic pulse velocity, density loss, mass loss, linear attenuation coefficient (LAC), mass attenuation coefficient (MAC), half-value layer (HVL), tenth-value layer (TVL), and mean free path (MFP). Ensemble learning algorithms were employed on the experimental data to predict compressive strength, and new empirical expressions were formulated for the mechanical and radiation shielding properties, including the LAC, HVL, and TVL. The addition of grit iron aggregate, combined with MgO, demonstrated a significant improvement in the mechanical and radiation shielding properties of HDC. This research holds promise for applications in nuclear reactors operating at high temperatures.
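The shielding descriptors named above are deterministic functions of the linear attenuation coefficient mu (LAC): HVL = ln 2 / mu, TVL = ln 10 / mu, and MFP = 1 / mu. A small helper makes the relationship concrete (the mu value is illustrative, not taken from the paper):

```python
import math

def shielding_metrics(mu_cm):                 # LAC in 1/cm
    return {"HVL_cm": math.log(2) / mu_cm,    # half-value layer
            "TVL_cm": math.log(10) / mu_cm,   # tenth-value layer
            "MFP_cm": 1.0 / mu_cm}            # mean free path

print(shielding_metrics(0.2))                 # e.g. mu = 0.2 cm^-1 (illustrative)
```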
Natural disasters, notably landslides, pose significant threats to communities and infrastructure. Landslide susceptibility mapping (LSM) has been globally deemed an effective tool to mitigate such threats. In this regard, this study considers the northern region of Pakistan, which is primarily susceptible to landslides amid rugged topography, frequent seismic events, and seasonal rainfall, to carry out LSM. To achieve this goal, this study pioneered the fusion of baseline models (logistic regression (LR), K-nearest neighbors (KNN), and support vector machine (SVM)) with ensemble algorithms (Cascade Generalization (CG), random forest (RF), Light Gradient-Boosting Machine (LightGBM), AdaBoost, Dagging, and XGBoost). With a dataset comprising 228 landslide inventory maps, this study employed a random forest classifier and a correlation-based feature selection (CFS) approach to identify the twelve most significant landslide-triggering parameters. The evaluated parameters included slope angle, elevation, aspect, geological features, and proximity to faults, roads, and streams. Slope was revealed as the primary factor influencing landslide distribution, followed closely by aspect and rainfall. The models, validated with an AUC of 0.784, ACC of 0.912, and kappa (K) of 0.394 for logistic regression (LR), as well as an AUC of 0.907, ACC of 0.927, and K of 0.620 for XGBoost, highlight the practical effectiveness and potency of LSM. The results revealed the superior performance of LR among the baseline models and XGBoost among the ensembles, which contributed to the development of a precise LSM for the study area. LSM may serve as a valuable tool for guiding precise risk-mitigation strategies and policies in geohazard-prone regions at national and global scales.
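A hedged sketch of the baseline-versus-ensemble comparison, pitting logistic regression against XGBoost on AUC as in the study; the twelve features are random placeholders for the conditioning factors (slope, aspect, rainfall, and so on):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(456, 12))          # 228 landslide + 228 non-landslide cells
y = np.r_[np.ones(228), np.zeros(228)].astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=2)

for name, model in [("LR", LogisticRegression(max_iter=1000)),
                    ("XGBoost", XGBClassifier(eval_metric="logloss"))]:
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(name, "AUC:", round(auc, 3))
```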
This study presents a comparative analysis of individual and ensemble learning algorithms (ELAs) for predicting the compressive strength (CS) and flexural strength (FS) of plastic concrete. Multilayer perceptron neural network (MLPNN), support vector machine (SVM), random forest (RF), and decision tree (DT) models were used as base learners and were then combined with bagging and AdaBoost methods to improve predictive performance. In addition, gene expression programming (GEP) was used to develop computational equations for predicting the CS and FS of plastic concrete. An extensive database containing 357 and 125 data points (for CS and FS, respectively) was compiled from the literature, and the eight most impactful ingredients were used in model development. The accuracy of all models was assessed using several statistical measures, including an error matrix, the Akaike information criterion (AIC), K-fold cross-validation, and other external validation equations. Furthermore, sensitivity and SHapley Additive exPlanations (SHAP) analyses were performed to evaluate the input variables' relative significance and impact on the predicted CS and FS. Based on the statistical measures and other validation criteria, GEP outperforms all other individual models, whereas among the ELAs, the SVR ensemble with AdaBoost and the RF modified with the bagging technique demonstrated superior performance. SHAP and sensitivity analyses reveal that plastic, cement, water, and the age of the specimens have the highest influence, while superplasticizer has the lowest impact, which is consistent with experimental studies. Moreover, the GUI and the simple GEP-based mathematical correlations can enhance the practical scope of this study and serve as an effective tool for the pre-mix design of plastic concrete.
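The two best ensembles named above can be reproduced in outline with scikit-learn: SVR boosted with AdaBoost and a random forest wrapped in bagging. This sketch assumes scikit-learn >= 1.2 (the `estimator=` keyword) and uses synthetic data in place of the 357-point database:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (AdaBoostRegressor, BaggingRegressor,
                              RandomForestRegressor)
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

X, y = make_regression(n_samples=357, n_features=8, noise=5.0, random_state=0)

svr_ada = AdaBoostRegressor(estimator=SVR(C=10.0), n_estimators=50,
                            random_state=0)
rf_bag = BaggingRegressor(estimator=RandomForestRegressor(n_estimators=100,
                                                          random_state=0),
                          n_estimators=10, random_state=0)

for name, model in [("AdaBoost-SVR", svr_ada), ("Bagging-RF", rf_bag)]:
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()  # K-fold check
    print(name, "mean R^2:", round(r2, 3))
```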
This paper presents a novel ensemble learning-based framework for accurately predicting the ultimate axial compressive load-carrying capacity of S/RCFST columns, contributing to the development of resilient and sustainable structural solutions. A comprehensive dataset of 932 experimental samples was compiled, encompassing key design parameters such as column dimensions, material properties, and load capacities. Five ensemble learning algorithms, Decision Tree, Adaptive Boosting, Extreme Gradient Boosting, Categorical Gradient Boosting (CatBoost), and Boosted Regression Tree, were systematically evaluated with extensive hyperparameter tuning to enhance predictive accuracy. The CatBoost model exhibited superior performance, achieving exceptional precision and robustness during the training and testing phases. Trend consistency analysis and Monte Carlo simulations were integrated to validate model reliability, ensuring practical applicability. A graphical user interface was developed to facilitate seamless adoption by engineers and researchers, enabling real-time applications in Building Information Modeling (BIM) environments. The findings underscore the potential of AI-driven predictive models in smart infrastructure development, promoting efficiency, accuracy, and sustainability in structural engineering practice.
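A hedged sketch of the CatBoost capacity model together with a toy Monte Carlo check on one design point; the five features, their ranges, and the fabricated target are placeholders for the 932-sample experimental database:

```python
import numpy as np
from catboost import CatBoostRegressor

rng = np.random.default_rng(3)
X = rng.uniform(size=(932, 5))    # e.g. width, depth, steel ratio, fy, fc
y = 500 * X.sum(axis=1) + rng.normal(scale=20, size=932)   # fake capacity (kN)

model = CatBoostRegressor(depth=6, iterations=500, verbose=0, random_seed=3)
model.fit(X, y)

# Monte Carlo: perturb one design around nominal inputs and inspect the
# spread of predicted capacities as a crude reliability indicator.
nominal = X.mean(axis=0)
samples = nominal + rng.normal(scale=0.02, size=(1000, 5))
preds = model.predict(samples)
print("mean capacity:", preds.mean().round(1), "std:", preds.std().round(1))
```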
As the requirements for the optimal control of building systems increase, the accuracy and speed of load predictions must also increase. However, prediction accuracy depends not only on the prediction algorithm but also on the construction of the feature set. Therefore, this study develops a short-term building cooling load prediction model based on feature set construction. The impacts of four different feature set construction methods (feature extraction, correlation analysis, K-means clustering, and discrete wavelet transform (DWT)) on the prediction accuracy are compared. To ensure that the effect of the feature set construction method is universal, three different prediction algorithms are used. The influences of the sample dimension and prediction time horizon on the prediction accuracy are also analysed. The prediction model is developed based on an ensemble learning approach utilising the Cubist algorithm, and its performance improves when DWT is used to construct the feature set. Compared with other commonly used prediction models, the proposed model exhibits the best performance, with R-squared and CV-RMSE values of 99.8% and 1.5%, respectively.
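A sketch of the DWT-based feature-set construction: each sample's recent load history is decomposed with a discrete wavelet transform, and the coefficients become the feature vector. Gradient boosting stands in for Cubist, which lacks a canonical scikit-learn implementation, and the load series is synthetic:

```python
import numpy as np
import pywt
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(4)
hours = np.arange(24 * 60)        # 60 days of hourly data
load = 100 + 30 * np.sin(2 * np.pi * hours / 24) \
       + rng.normal(scale=3, size=hours.size)

window = 48                       # two days of history per sample
X, y = [], []
for t in range(window, load.size - 1):
    coeffs = pywt.wavedec(load[t - window:t], "db4", level=2)
    X.append(np.concatenate(coeffs))   # approximation + detail coefficients
    y.append(load[t + 1])              # next-hour cooling load
X, y = np.array(X), np.array(y)

model = GradientBoostingRegressor(random_state=4).fit(X[:-100], y[:-100])
print("R^2 on held-out hours:", round(model.score(X[-100:], y[-100:]), 3))
```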
ISBN (print): 9781728110516
AdaBoost is perhaps one of the best-known ensemble learning algorithms. In simple terms, the idea in AdaBoost is to train a number of weak learners in an incremental fashion, where each new learner tries to focus more on the samples that were misclassified by the preceding classifiers. Consequently, in the presence of noisy data samples, the new learners will to some extent memorize the data, which in turn leads to an overfitted model. The main objective of this paper is to provide a generalized version of the AdaBoost algorithm that avoids overfitting and performs better when the data samples are corrupted with noise. To this end, we make use of another ensemble learning algorithm called ValidBoost [15] and introduce a mechanism to dynamically determine the thresholds for both the error rate of each classifier and the error rate in each iteration. These thresholds enable us to control the error rate of the algorithm. Experimental simulations were carried out on several benchmark datasets, including Web datasets such as the "Website Phishing Data Set" and the "Page Blocks Classification Data Set", to evaluate the performance of the proposed algorithm.
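A generic sketch of the underlying mechanism: an AdaBoost-style loop that rejects rounds whose weighted error exceeds a threshold, one simple way to keep noisy samples from dominating the reweighting. The threshold here is fixed for brevity; the paper's ValidBoost-derived rule, which updates it dynamically, is not reproduced:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def thresholded_adaboost(X, y, n_rounds=20, err_threshold=0.45):
    """y in {-1, +1}. Returns the retained stumps and their weights."""
    w = np.full(len(y), 1.0 / len(y))
    learners, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = w[pred != y].sum()
        if err >= err_threshold or err == 0:   # reject overly weak/noisy rounds
            break
        alpha = 0.5 * np.log((1 - err) / err)
        w *= np.exp(-alpha * y * pred)         # upweight misclassified samples
        w /= w.sum()
        learners.append(stump)
        alphas.append(alpha)
    return learners, alphas

def predict(learners, alphas, X):
    score = sum(a * m.predict(X) for a, m in zip(alphas, learners))
    return np.sign(score)
```

Calling `thresholded_adaboost(X, y)` on any {-1, +1}-labelled dataset returns the retained stumps and their weights for use in `predict`.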