With the development and application of machine learning, significant advances have been made in landslide susceptibility mapping. However, because field landslide investigations are difficult, current landslide susceptibility mapping usually suffers from insufficient landslide samples (positive samples) and unreliable non-landslide samples (negative samples). Taking Lianghe County in Yunnan Province, China, as an example, this paper evaluates the effectiveness of three oversampling models in generating positive landslide samples: conditional tabular generative adversarial networks (CTGAN), generative adversarial networks (GAN), and the traditional Synthetic Minority Oversampling Technique (SMOTE). Additionally, three machine learning classifiers, 1D Convolutional Neural Network-Long Short-Term Memory (1D CNN-LSTM), Random Forest (RF), and Gradient Boosting Decision Tree (GBDT), are used for landslide susceptibility assessment. We also devise a screening method for non-landslide data (negative samples) that uses a self-trained support vector machine within a semi-supervised framework. The results show that training on the dataset after negative sample screening significantly improves the AUC values of the 1D CNN-LSTM, RF, and GBDT models, from (0.778, 0.869, 0.849) to (0.837, 0.936, 0.877). Compared with the original training set, the prediction accuracy of all three machine learning models improves after training on data augmented by CTGAN, GAN, and SMOTE. The RF model, augmented with 200 positive samples generated by CTGAN, achieves the highest prediction accuracy in the study (AUC = 0.962). The 1D CNN-LSTM model achieves its highest prediction accuracy (AUC = 0.953) when augmented with 200 positive samples from GAN. Similarly, the GBDT model reaches its highest prediction accuracy (AUC = 0.928) when augmented with 200 positive samples created by SM
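As a rough illustration of the SMOTE idea mentioned in this abstract, the sketch below creates synthetic positive samples by interpolating between a minority sample and one of its nearest minority neighbours. It is a minimal standard-library sketch, not the paper's implementation: the function name `smote_sketch`, the parameter `k`, and the two-feature landslide values are all hypothetical.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (the core SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        t = rng.random()
        synthetic.append(tuple(b + t * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical, already-scaled features for four landslide (positive) samples.
landslides = [(0.2, 0.8), (0.3, 0.7), (0.25, 0.9), (0.4, 0.75)]
augmented = landslides + smote_sketch(landslides, n_new=6)
print(len(augmented))
```

Because every synthetic point lies on a segment between two real positives, this kind of oversampling cannot leave the region spanned by the observed samples, which is one reason generative models such as GAN and CTGAN are compared against it.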
The bond strength between CFRP and steel usually governs the final strengthening effectiveness. However, the CFRP-steel bond strength is affected by various geometric and material properties and exhibits different failure modes, making accurate predictions challenging. This study uses data-driven machine learning (ML) methods to predict the strength and failure modes of CFRP-steel joints. An experimental dataset consisting of 178 single-lap shear test results was first built, after which the conditional tabular generative adversarial networks (CTGAN) method was applied to augment the limited available data. Four widely used ML algorithms were applied: Support Vector Machines (SVM), K-Nearest Neighbours (KNN), Decision Trees (DT), and Artificial Neural Networks (ANN). The ANN regression model achieved the best performance in predicting joint strength (test-set R² = 0.95), while the SVM classification model achieved the best performance in predicting failure modes (accuracy ≥ 92.3%). The SHapley Additive exPlanations analysis further revealed that the Young's modulus of the adhesive had the greatest influence on joint strength, while the tensile strength of the adhesive had the greatest influence on the failure modes. The final ML models and the corresponding analyses can benefit practical structural engineering applications and provide insights into optimal CFRP-steel joint design.
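The test-set R² reported in this abstract is the coefficient of determination. The sketch below shows how that metric is computed from held-out targets and predictions. The function name `r_squared` and the strength values are hypothetical and are not taken from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical held-out joint strengths and model predictions (arbitrary units).
measured = [10.0, 14.0, 18.0, 22.0]
predicted = [11.0, 13.5, 18.5, 21.0]
score = r_squared(measured, predicted)
print(round(score, 3))
```

A value of 1 means the predictions match the measurements exactly, and a value of 0 means the model does no better than predicting the mean of the test targets.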
Production environments bring inherent system challenges that are reflected in the high-dimensional production data. The data is often nonstationary, is not available in sufficient size and quality, and is class imbalanced due to the predominance of good parts. Data-driven manufacturing analytics requires data of sufficient quantity and quality. To predict quality characteristics, production data is collected across processes in the industrial use case at Bosch Rexroth AG so that results of the hydraulic final inspection can be inferred using machine learning methods. Since generating high-quality data is costly, synthetic data generation methodologies offer a promising alternative for improving prediction models and thus producing safer, more accurate predictions for manufacturing companies. Among the synthetic data generation methodologies used, variational autoencoders, compared with generative adversarial networks and synthetic minority oversampling technique methods, are best suited to synthesizing the feature with the highest feature importance from a sample data set that is small compared to the production data, and to improving the prediction of the target variable.
Investment in the stock market has become a trend in today’s era. The primary force moving the market in a specific direction is the large buying and selling of hedge funds, pension funds, banks, etc. This paper prop...
To classify wart treatment methods, this research paper examines the effectiveness of using machine learning (ML) and deep learning algorithms in conjunction with numerical and image data. Human papillomavirus (HPV)-induced warts are a common dermatological concern. Several factors can affect the severity and spread of these lesions. Using both image and numerical data, the study proposes a comprehensive method for classifying treatments, and shows through extensive experimentation that the methodology is effective. The study achieves classification accuracies of 91.67% on the cryotherapy dataset and 85% on the immunotherapy dataset using machine learning and deep learning algorithms. Notably, accuracy rates of 76% for cryotherapy and 74% for immunotherapy are obtained by combining synthetic and raw data, demonstrating the potential benefits of integrating various data sources. The study contributes a framework that makes accurate classification of wart treatments possible. The model provides a comprehensive understanding of wart types and treatment outcomes by combining image analysis with numerical data. This approach uses both quantitative and visual data to enable users to make well-informed decisions. Overall, the study highlights the potential of AI and ML methods to improve the classification of wart treatments, offering dermatologists and other medical professionals a useful tool. This study is a major step towards more individualised and data-driven dermatological care strategies, which could lead to better patient outcomes and more effective treatments.
Network security has become a serious issue since networks are vulnerable and subject to increasing intrusive activities. Therefore, network intrusion detection systems (IDSs) are an essential component in defending against these activities. One of the biggest issues encountered by IDSs is the class imbalance problem, which biases most machine learning models toward normal activities (the majority class). Several techniques have been proposed to overcome the class imbalance problem, such as resampling, cost-sensitive learning, and ensemble learning. Other issues related to intrusion detection data include mixed data types and non-Gaussian, multimodal distributions. In this study, we employed a conditional tabular generative adversarial network (CTGAN) model with common machine learning algorithms to construct more effective detection systems while addressing the imbalance issue. CTGAN can generate samples of the minority class during training to make the dataset more balanced. To assess the effectiveness of the proposed IDS, we combined CTGAN with three machine learning algorithms: support vector machine (SVM), K-nearest neighbor (KNN), and decision tree (DT). The imbalanced NSL-KDD dataset was used and several experiments were conducted. The results showed that CTGAN can improve the performance of imbalanced learning for intrusion detection with SVM and DT. KNN, on the other hand, showed no improvement in performance since it is less sensitive to the class imbalance problem. Moreover, the results showed that CTGAN captures the distribution of discrete features better than that of continuous features.
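The class imbalance problem described in this abstract can be seen in a small worked example: a detector that always predicts the majority class scores high accuracy while catching no attacks. The sketch below is illustrative only. The helper `accuracy_and_recall` and the 95/5 traffic split are hypothetical and are not taken from NSL-KDD.

```python
def accuracy_and_recall(y_true, y_pred, positive=1):
    """Return (accuracy, recall on the positive class) for two label lists."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    n_pos = sum(t == positive for t in y_true)
    recall = true_pos / n_pos if n_pos else 0.0
    return correct / len(y_true), recall


# Hypothetical traffic: 95 normal connections (0) and 5 attacks (1).
y_true = [0] * 95 + [1] * 5
# A degenerate detector that labels everything as normal.
always_normal = [0] * 100

acc, rec = accuracy_and_recall(y_true, always_normal)
print(acc, rec)
```

This is why accuracy alone can hide a model's bias toward normal traffic, and why adding minority-class samples during training, whether by resampling or by a generator such as CTGAN, is evaluated with class-sensitive metrics.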