Sports analytics (SA) incorporates machine learning (ML) techniques and models for performance prediction. Researchers have previously evaluated ML models applied to a variety of basketball statistics. This paper aims to benchmark the forecasting performance of 14 ML models, based on 18 advanced basketball statistics and key performance indicators (KPIs). The models were applied to a filtered pool of 90 high-performance players. This study developed individual forecasting scenarios per player and experimented with all 14 models. The models' performance ranking was developed using a bespoke evaluation metric, called weighted average percentage error (WAPE), formulated from the weighted mean absolute percentage error (MAPE) evaluation results of each forecasted statistic and model. Moreover, we employed a comprehensive forecasting approach to improve the KPI results. Results showed that tree-based models, namely Extra Trees, Random Forest, and Decision Tree, are the best performers for most of the forecasted performance indicators, with the best performance achieved by Extra Trees with a WAPE of 34.14%. In conclusion, we achieved a 3.6% MAPE improvement for the selected KPI with our approach on unseen data.
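The abstract describes WAPE only as a weighted combination of per-statistic MAPE results, so the following is a minimal sketch under stated assumptions rather than the authors' exact formula. The function names, the statistic names, and the weights are all hypothetical, and the handling of zero-valued actuals (skipped here) is an assumption.

```python
def mape(actual, forecast):
    """Mean absolute percentage error in percent.

    Assumption: observations with a zero actual value are skipped,
    because the percentage error is undefined for them.
    """
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actual values to evaluate")
    return 100.0 * sum(abs((a - f) / a) for a, f in pairs) / len(pairs)


def weighted_average_percentage_error(mape_by_stat, weight_by_stat):
    """Weighted mean of per-statistic MAPE values (hypothetical WAPE form)."""
    total_weight = sum(weight_by_stat[stat] for stat in mape_by_stat)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(
        mape_by_stat[stat] * weight_by_stat[stat] for stat in mape_by_stat
    )
    return weighted_sum / total_weight


# Illustrative values only; not taken from the paper.
mape_by_stat = {
    "points": mape([20, 25], [18, 30]),
    "rebounds": mape([10, 8], [9, 10]),
}
weights = {"points": 2.0, "rebounds": 1.0}  # hypothetical weights
wape = weighted_average_percentage_error(mape_by_stat, weights)
```

Ranking models then amounts to computing one such score per model and sorting in ascending order.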
Ground settlement prediction for shield construction is highly important and challenging. This study introduces a machine learning algorithm combined with finite element numerical simulation, i.e., machine learning-finite element mesh optimization. For surface subsidence prediction, 16 combination models of ANN, KNN, RF and SVR were optimized by PSO, GA, BT and BO, involving raw data preprocessing, principal component analysis, hyperparameter selection and prediction accuracy evaluation. A subway shield tunneling project was analyzed, in which the meshes of the finite element numerical models were discretized into different sizes from 1.0 m to 2.0 m. In total, 360 sets of data points were extracted from the simulation results, including stress, strain, shield jacking force, internal friction angle, cohesion force, and settlement, of which 252 data points were used as the input parameters of the machine learning models. Analysis of the average error rate of the finite element-machine learning coupling models showed that the finite element model had the highest settlement prediction accuracy when its mesh size was 1.4 m, and that the GA-SVR model had the highest accuracy and generalization ability in ground settlement prediction. This study highlights the uniqueness of the machine learning-finite element mesh optimization model in application.
In this study, a progressive optimization method that combines machine learning with an optimization method is proposed and applied to seal structure design. The method is divided into two stages that select the optimal design step by step, so as to find the best design scheme that meets the design requirements. Numerical calculation is commonly used for optimal design, but it makes the optimal solution hard to evaluate. In this work, a series of numerical models that account for the super-elastic material behavior of the O-ring is used to study the waterproof performance of a rubber seal. The K-nearest neighbors (KNN) machine learning algorithm is applied to the simulation data to predict the appropriate bolt pretension class. The TOPSIS method is then used to optimize the groove depth for the 30 N bolt pretension class: by taking the stress of the rubber component into account, the optimization analysis finds the optimal design. Results show that the dual optimization method can quickly predict the best design scheme. A prototype test under IPX7 conditions verifies the method. The design scheme selected by this method meets the waterproof grade requirements: there are no water stains on the surface of the O-ring or inside the motor. This paper provides a fast optimization design method for sealing structures.
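The second stage above relies on TOPSIS, which ranks candidate designs by their closeness to an ideal solution. The following is a minimal sketch of the standard TOPSIS procedure, not the authors' implementation. The criteria (rubber stress as a cost, contact pressure as a benefit), the weights, and the candidate values are hypothetical.

```python
import math


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] for each alternative (row).

    matrix  : list of rows, one per alternative, one column per criterion
    weights : one weight per criterion
    benefit : True if larger is better for that criterion, False if smaller is
    Assumes no criterion column is all zeros and alternatives are not identical.
    """
    n_criteria = len(weights)
    # Vector-normalise each column, then apply the criterion weight.
    norms = [
        math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_criteria)
    ]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_criteria)]
        for row in matrix
    ]
    columns = list(zip(*weighted))
    # Ideal and anti-ideal points depend on whether a criterion is a benefit.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)  # distance to the ideal point
        d_neg = math.dist(row, anti)   # distance to the anti-ideal point
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical groove-depth candidates: (rubber stress, contact pressure).
candidates = [[1.0, 3.0], [2.0, 2.0], [3.0, 1.0]]
scores = topsis(candidates, weights=[0.5, 0.5], benefit=[False, True])
best_index = max(range(len(scores)), key=scores.__getitem__)
```

The candidate with the highest score is the one selected; here the first row wins because it has both the lowest stress and the highest contact pressure.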
ISBN: 9798350351194; 9798350351187 (print)
Energy efficiency and the identification of consistent energy consumption patterns are crucial for the reliability of a modern power grid. In this paper, we present a data science solution that integrates weather data with historical energy consumption data for energy consumption analysis. The solution predicts temporal energy consumption patterns via techniques such as frequent pattern mining, traditional machine learning, and deep learning. It integrates meteorological and environmental conditions over time series, analyzes them, forecasts energy consumption, and examines how weather conditions affect variation in energy usage. Evaluation results on a real-world dataset show that our solution identifies several distinct frequent patterns with frequent pattern mining, and that it reveals a significant relationship between irradiance and energy consumption, as well as a positive correlation between temperature and energy usage. Moreover, our solution predicts and compares energy consumption for a specific year using linear regression, decision tree, random forest, and gradient boosting models with daily weather data. Additionally, we applied a long short-term memory (LSTM) model to analyze energy consumption as time-series data, uncovering patterns based on given time steps. These results demonstrate the practicality of our data science solution for energy consumption analysis.
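The abstract reports a positive correlation between temperature and energy usage but does not say how it was measured. As one plausible illustration, the sketch below computes the Pearson correlation coefficient between two daily series. The function name and the sample values are hypothetical and are not data from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily values: mean temperature (deg C) and consumption (kWh).
temperature = [18.0, 21.5, 25.0, 28.5, 31.0]
consumption = [310.0, 330.0, 365.0, 400.0, 420.0]
r = pearson(temperature, consumption)
```

A value of `r` close to 1 would indicate a strong positive linear relationship; it says nothing about causation or about nonlinear effects.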
ISBN: 9798350365634 (print)
Over the past few years, increasingly complex machine learning methods have been applied to various Software Engineering (SE) tasks, particularly to the important task of automated fault prediction and localization. However, it becomes much more difficult for scholars to reproduce the results reported in the literature, especially when the applied deep learning models and the evaluation methodology are not properly documented and when code and data are not shared. Given some recent (and very worrying) findings regarding reproducibility and progress in other areas of applied machine learning, this study aims to analyze to what extent the field of software engineering, in particular the area of software fault prediction, is plagued by similar problems. We have therefore conducted a systematic review of the current literature and examined the level of reproducibility of 56 research articles that were published between 2019 and 2022 in top-tier software engineering conferences. Our analysis revealed that scholars are apparently largely aware of the reproducibility problem, and about two-thirds of the papers provide code for their proposed deep learning models. However, in the vast majority of cases, crucial elements for reproducibility are missing, such as the code of the compared baselines, code for data pre-processing, or code for hyperparameter tuning. In these cases, it remains challenging to reproduce the results in the current research literature exactly. Overall, our meta-analysis calls for improved research practices to ensure the reproducibility of machine-learning-based research.
ISBN: 9798350384048; 9798350384031 (print)
Streaming anomaly detection in multivariate time series is an important problem relevant to the automatic monitoring of various devices. This paper tackles the problem by extending a framework to incorporate model-based approaches and by evaluating previously uncombined methods, for a total of 26 distinct machine-learning-based algorithms. The framework identifies four fundamental components inherent to many streaming anomaly detection algorithms, and one or more methods are presented for each component. It is found that a simple and computationally less expensive strategy for detecting concept drift yields almost identical results to the "KSWIN" strategy when applied to measuring concept drift in a training set used for training a machine learning model. A secondary experiment supports the effectiveness of fine-tuning a machine learning model after the detection of concept drift for the purpose of detecting anomalies.
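The abstract does not name the simpler drift-detection strategy, so the sketch below should not be read as the paper's method. It only illustrates what a computationally cheap check can look like: flagging drift when the mean of a recent window moves more than a chosen number of reference standard deviations away from the reference mean. The function name, the threshold, and the sample values are hypothetical, and the check handles one channel at a time.

```python
from statistics import mean, stdev


def mean_shift_drift(reference, recent, threshold=3.0):
    """Flag drift when the recent mean departs from the reference mean.

    reference : values from the window the model was trained on (>= 2 values)
    recent    : values from the most recent window
    threshold : allowed shift, in units of the reference standard deviation
                (hypothetical default, to be tuned per application)
    """
    ref_mean = mean(reference)
    ref_std = stdev(reference)
    shift = abs(mean(recent) - ref_mean)
    if ref_std == 0:
        # A constant reference window: any change in the mean counts as drift.
        return shift > 0
    return shift / ref_std > threshold


# Hypothetical single-channel sensor readings.
reference_window = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95]
stable_window = [1.0, 1.02, 0.98]
shifted_window = [2.0, 2.1, 1.9]

drift_when_stable = mean_shift_drift(reference_window, stable_window)
drift_when_shifted = mean_shift_drift(reference_window, shifted_window)
```

For a multivariate series, one option is to run the check per channel and treat drift in any channel as a trigger for fine-tuning; whether that matches the evaluated framework is an assumption.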
Crop pests and diseases are treated as one of the main factors affecting food production and *** accurate detection and corresponding precision management to reduce the spread of crop diseases in time and space is an important scientific issue in crop disease control ***. On the one hand, the development of remote sensing technology provides higher-quality data (high spectral/spatial resolution) for crop disease ***. On the other hand, deep learning/machine learning algorithms also provide novel insights for crop disease ***. In this paper, a comprehensive review was conducted to demonstrate various remote sensing platforms (***-based, low-altitude and spaceborne scales) and popular sensors (***, multispectral and hyperspectral sensors). In addition, conventional machine learning and deep learning algorithms applied for crop disease monitoring are also ***. In the end, considering the crop disease early detection problem, which is a challenging problem in this area, self-supervised learning is introduced to motivate future ***. It is envisaged that this paper has concluded the recent crop disease monitoring algorithms and provides a novel thought on crop disease early monitoring.
A data-driven framework with strong generalization capabilities is proposed to extract features effectively and to obtain battery capacity easily. This framework can make highly accurate predictions of the battery capacities of plug-in electric vehicles. The feature extraction process is based entirely on statistics, which are always available and can be generalized to various types of battery data. An improved ampere-hour integral method can obtain battery capacity from short charging segments lasting just 500 s. Several machine learning models are trained to verify the framework's effectiveness, with the best model achieving a test error of 0.84% based on leave-one-out validation. SHAP values are used to provide a reasonable interpretation of the relationships between the constructed features and model outputs. The proposed framework offers advantages such as reduced computational resources, wide generalization, and high prediction accuracy, showing great potential for battery management.
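The abstract does not describe what the "improved" ampere-hour integral method changes, so the sketch below shows only the basic textbook form as background: the charge delivered over a segment, divided by the change in state of charge (SOC). The function name, the trapezoidal integration, and the assumption that SOC readings are available at both ends of the segment are all illustrative choices, not details from the paper.

```python
def estimate_capacity_ah(times_s, currents_a, soc_start, soc_end):
    """Estimate battery capacity in Ah from one charging segment.

    times_s    : sample timestamps in seconds, strictly increasing
    currents_a : charging current in amperes at each timestamp
    soc_start  : state of charge at the first sample, as a fraction in [0, 1]
    soc_end    : state of charge at the last sample, as a fraction in [0, 1]
    """
    if len(times_s) != len(currents_a) or len(times_s) < 2:
        raise ValueError("need at least two matching time/current samples")
    delta_soc = soc_end - soc_start
    if delta_soc <= 0:
        raise ValueError("SOC must increase over a charging segment")
    # Trapezoidal integration of current over time gives charge in A*s.
    charge_as = sum(
        0.5 * (currents_a[i] + currents_a[i + 1]) * (times_s[i + 1] - times_s[i])
        for i in range(len(times_s) - 1)
    )
    charge_ah = charge_as / 3600.0
    return charge_ah / delta_soc


# Hypothetical 500 s segment at a constant 36 A, with SOC rising by 5 points.
capacity = estimate_capacity_ah(
    times_s=[0.0, 250.0, 500.0],
    currents_a=[36.0, 36.0, 36.0],
    soc_start=0.40,
    soc_end=0.45,
)
```

In this illustration, 36 A over 500 s delivers 5 Ah, and a 5-point SOC rise implies a capacity of roughly 100 Ah. Over such short segments the result is sensitive to SOC error, which is presumably part of what an improved method has to address.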
ISBN: 9783031686382; 9783031686399 (print)
In recent decades, cardiovascular diseases have been the main cause of death around the globe. They have become the most lethal illness not just in India but throughout the world. Therefore, a trustworthy, accurate, and workable method is needed to identify these illnesses early enough for effective therapy. Machine learning methods, along with methods for automating the study of huge and complex data, have been applied to a number of medical datasets. Many researchers have lately applied a variety of machine learning algorithms to help the medical community and specialists identify heart-related illnesses. This study thoroughly assesses the chosen papers and highlights gaps in the body of knowledge, making it valuable for researchers interested in using machine learning in the medical field, especially in the area of heart disease prognosis.
This study predicted changes in carbon footprints at the city level in Türkiye with high accuracy, using the Bayesian method and hyper-optimized machine learning techniques, and analyzed their spatial distribution. Data for 80 provinces in Türkiye were sought. However, only the parameter data affecting carbon concentrations in 2019-2022 and the carbon concentration dataset for 2019 could be accessed. The available 2019 dataset of parameters that affect carbon concentrations, obtained from 80 cities in Türkiye, was used to train machine learning algorithms hyper-optimized by the Bayesian technique (Ensemble Regression, Gaussian Process Regression, Gaussian Kernel Regression, Support Vector Machine Regression, Linear Regression, and Binary Decision Regression). Carbon footprint values were predicted for 2019, 2020, 2021, and 2022, and performance metrics were presented. In the application, the ensemble regression technique obtained the lowest mean squared error (2.31) on the training data, while support vector machine regression obtained the lowest mean squared error (13.8) on the test data. The Bayesian optimization algorithm significantly improved the success of all regression techniques applied in the study. After hyperparameter optimization, the regression methods' performance improved by 0.14 on average in the coefficient of determination, 34.5 on average in the mean squared error, and 0.032 on average in the mean absolute relative error. The study will assist local governments in obtaining more accurate carbon emission estimates using less data and in implementing climate action plans.
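The three metrics reported above have standard definitions, sketched below for reference. The function names and the sample values are hypothetical. The mean absolute relative error is written in its common form (absolute error divided by the absolute true value), which is an assumption about the variant used in the study.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


def mean_absolute_relative_error(y_true, y_pred):
    """Average of |true - predicted| / |true|; true values must be non-zero."""
    if any(t == 0 for t in y_true):
        raise ValueError("relative error is undefined for zero true values")
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative values only; not data from the study.
y_true = [2.0, 4.0, 6.0]
y_pred = [2.0, 5.0, 5.0]
mse = mean_squared_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
mare = mean_absolute_relative_error(y_true, y_pred)
```

Lower is better for the two error metrics and higher is better for the coefficient of determination, so the reported average changes of 34.5 and 0.032 are reductions while the 0.14 change is an increase.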