Accurate prediction of soil unit weight is crucial for geotechnical engineering and soil characterization. This study leverages five advanced machine learning algorithms, namely Multi-Layer Perceptron (MLP), Random Forest (RF), Support Vector Regression (SVR), XGBoost, and AdaBoost with RF as a weak learner, to predict soil unit weight. Hyperparameters are optimized using randomized search cross-validation (RSCV), and model performance is evaluated using mean absolute error (MAE), root mean square error (RMSE), and R² metrics. The input features include soil depth (D), moisture content (MC), fine content (FC), cone tip resistance (QC), and cone local resistance (FS). An autoencoder-based feature augmentation technique is applied to enhance model performance. Before augmentation, AdaBoost with RF achieves the best performance (R² = 0.896), while SVR performs the worst (R² = 0.740). Post-augmentation, all models improve, with AdaBoost showing the highest R² and SVR achieving significant gains (R² = 0.782). SHAP analysis identifies D as the most critical feature, while QC and FS negatively impact accuracy. The results highlight AdaBoost with RF as the most effective algorithm for predicting soil unit weight, underscoring the value of feature augmentation in capturing complex patterns.
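The three evaluation metrics named in this abstract (MAE, RMSE, and R²) can be illustrated with a minimal sketch. The function names and the sample unit-weight values below are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error: penalizes large residuals more than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical measured and predicted soil unit weights (assumed units: kN/m^3).
y_true = [16.0, 17.5, 18.0, 19.5]
y_pred = [16.5, 17.0, 18.5, 19.0]

print(f"MAE  = {mae(y_true, y_pred):.3f}")       # MAE  = 0.500
print(f"RMSE = {rmse(y_true, y_pred):.3f}")      # RMSE = 0.500
print(f"R^2  = {r2_score(y_true, y_pred):.3f}")  # R^2  = 0.840
```

Because every residual in this toy example has the same magnitude, MAE and RMSE coincide; on real data RMSE is at least as large as MAE.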
Purpose: This study aimed to optimize machine learning (ML) models for predicting in-hospital mortality in patients with ST-segment elevation acute myocardial infarction (STEMI). Patients and Methods: A total of 5708 STEMI patients were enrolled and divided into two groups according to their hospital outcomes. Both groups were randomly split into a training set (75%) and a testing set (25%). Four ML models were trained on data to which random under-sampling (RUS) had been applied. The performance of the optimized ML models was evaluated with respect to accuracy, sensitivity, specificity, G-mean and AUC. Two sets of features in chronological order were considered: a full set that included all variables during hospitalization and a simplified set that only included variables prior to reperfusion therapy. The performance of the prediction models trained with these two sets of features was compared. Results: For the comprehensive metric, G-mean, the models trained with RUS outperformed those trained without it (80.54% vs 23.31% on average with the full set and 75.72% vs 35.76% on average with the simplified set). Among models trained with the full set, the SVM achieved the best performance, with 85.62% accuracy, 84.21% sensitivity, 85.66% specificity, 84.93% G-mean and 0.919 AUC. Among models trained with the simplified set, the SVM achieved 83.48% G-mean, which was comparable to the models trained using the full set. For the most critical metric, sensitivity, the SVM trained using the simplified set achieved 89.47%, which even exceeded the values for the SVM (84.21%), DT (81.58%) and RF (81.58%) trained using the full set. Conclusion: Applying RUS can improve the performance of prediction models, and the models trained with the simplified set, which only included variables prior to reperfusion therapy, can accurately identify high-risk patients.
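The G-mean reported above is the geometric mean of sensitivity and specificity, and random under-sampling discards majority-class samples until the classes are balanced. The sketch below illustrates both ideas under stated assumptions: the function names are hypothetical, the 1:1 target ratio is an assumption (the abstract does not state the ratio used), and the toy labels are invented.

```python
import math
import random

def g_mean(sensitivity, specificity):
    """Geometric mean of sensitivity and specificity (both as fractions)."""
    return math.sqrt(sensitivity * specificity)

def random_under_sample(samples, labels, seed=0):
    """Randomly drop samples so every class keeps as many as the rarest class.

    Assumption: a 1:1 class balance is the target; the study may differ.
    """
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    n_min = min(len(group) for group in by_class.values())
    kept_samples, kept_labels = [], []
    for label, group in by_class.items():
        kept_samples.extend(rng.sample(group, n_min))
        kept_labels.extend([label] * n_min)
    return kept_samples, kept_labels

# Check against the full-set SVM figures quoted in the abstract.
print(f"G-mean = {g_mean(0.8421, 0.8566) * 100:.2f}%")  # G-mean = 84.93%

# Toy imbalanced data: 95 survivors (0) and 5 deaths (1), invented for illustration.
samples = list(range(100))
labels = [0] * 95 + [1] * 5
balanced_samples, balanced_labels = random_under_sample(samples, labels)
print(balanced_labels.count(0), balanced_labels.count(1))  # 5 5
```

G-mean falls to zero whenever a model ignores one class entirely, which is why it is more informative than accuracy on heavily imbalanced outcomes such as in-hospital mortality.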
The trophic state is an important indicator of the health of lake ecosystems. To accurately assess the trophic state of large lakes, an integrated framework was developed by combining remote sensing data, field monitoring data, machine learning algorithms, and optimization algorithms. First, key meteorological and environmental factors from in situ monitoring were combined with remotely sensed reflectance data, and statistical analysis was used to determine the main factors influencing the trophic state. Second, a trophic state index (TSI) inversion model was constructed using a machine learning algorithm, a backpropagation neural network (BP-NN), which was then optimized using the sparrow search algorithm (SSA) to establish an SSA-BP-NN model. Third, a typical lake in China (Hongze Lake) was chosen as the case study. The application results show that, when the key environmental factors (pH, temperature, average wind speed, and sediment content) and the band combination data from Sentinel-2/MSI were used as input variables, the performance of the model improved (R2 = 0.936, RMSE = 1.133, MAPE = 1.660%, MAD = 0.604). Compared with the performance prior to optimization (R2 = 0.834, RMSE = 1.790, MAPE = 2.679%, MAD = 1.030), the accuracy of the model improved by 12.2%. Notably, this framework could accurately identify water bodies in different trophic states. Finally, based on this framework, we mapped the spatial distribution of TSI in Hongze Lake in different seasons from 2019 to 2020 and analyzed its variation characteristics. The framework can combine region-specific factors shaped by a complex environment with S-2/MSI data to achieve an assessment accuracy of over 90% for TSI in sensitive waters, and it has strong applicability and robustness.
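The abstract does not say how the 12.2% improvement was computed. One reading that matches the reported numbers is the relative gain in R2 after SSA optimization; the short sketch below checks that arithmetic. The function name is hypothetical, and this interpretation is an assumption rather than the authors' stated method.

```python
def relative_change_percent(before, after):
    """Relative change from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100.0

# R2 before and after optimization, as quoted in the abstract.
r2_gain = relative_change_percent(0.834, 0.936)
print(f"Relative gain in R2: {r2_gain:.1f}%")  # Relative gain in R2: 12.2%

# The same helper applied to RMSE gives a negative value, i.e. an error reduction.
rmse_change = relative_change_percent(1.790, 1.133)
print(f"Relative change in RMSE: {rmse_change:.1f}%")
```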