Search Results - Inner Mongolia University Library

An XGBoost-based physical fitness evaluation model using advanced feature selection and Bayesian hyper-parameter optimization for wearable running monitoring

COMPUTER NETWORKS, 2019 (Mar. 14), Vol. 151, pp. 166-180

Authors: Guo, Junqi; Yang, Lan; Bie, Rongfang; Yu, Jiguo; Gao, Yuan; Shen, Yan; Kos, Anton. Affiliations: Beijing Normal Univ, Coll Informat Sci & Technol, Beijing, Peoples R China; Qufu Normal Univ, Sch Informat Sci & Engn, Beijing, Peoples R China; Acad Mil Sci PLA, Beijing, Peoples R China; Tsinghua Univ, State Key Lab Microwave & Digital Commun, Natl Lab Informat Sci & Technol, Beijing, Peoples R China; Beijing Normal Univ, Business Sch, Beijing, Peoples R China; Univ Ljubljana, Fac Elect Engn, Trzaska 25, Ljubljana 1000, Slovenia

Thanks to improvements in technologies such as the Internet of Things, bio-sensing and data mining, smart wearable technologies have recently received increasing attention for teenagers' sport and health monitoring. Despite the powerful data-acquisition ability of current wearable products on the market, they still perform poorly at extracting valuable knowledge because they lack accurate computational models and in-depth data analysis. To address this, this paper proposes a machine learning based physical fitness evaluation model oriented to wearable running monitoring for teenagers, in which a variant of the gradient boosting machine (GBM) is combined with advanced feature selection and Bayesian hyper-parameter optimization. To begin with, we design a special experimental paradigm for data acquisition based on a conventional running activity, in which a group of teenagers' photoplethysmography (PPG) signals in different testing stages are collected by a set of smartbands developed by ourselves. Next, the PPG signals are processed in four steps that correspond to the four modules of the proposed model: signal preprocessing, physiological data estimation, feature engineering and classification. Firstly, the signal preprocessing module aims at suppressing noise and removing baseline drift in PPG signals by using a smoothness prior approach (SPA) and a median filter (MF), respectively. Secondly, the physiological data estimation module converts PPG signals into physiological data such as heart rate (HR) and blood oxygen saturation (SpO2). Thirdly, the feature engineering module extracts from the physiological data a group of key features closely related to physical fitness status, and then implements a novel advanced feature selection scheme using Pearson correlation and importance score ranking based sequential forward search (PC-ISR-SFS). Fourthly, the classificati

Keywords: Smart wearables; Physical fitness evaluation model; PPG signal; Advanced feature selection; XGBoost; Bayesian hyper-parameter optimization
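
As a rough illustration of two ingredients named in the abstract above (median filtering of a sampled signal and Pearson-correlation screening of features), the following Python sketch uses only the standard library. It is not the authors' implementation: the window size, the 0.9 threshold, the feature names and the greedy drop rule are assumptions made for illustration, and the importance-score ranking and sequential forward search parts of PC-ISR-SFS are not shown.

```python
from math import sqrt
from statistics import mean, median


def median_filter(signal, window=5):
    """Sliding median with edge replication; `window` must be a positive odd number."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    # Repeat the first and last samples so every output has a full window.
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(signal))]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant sequence has no defined correlation; treat it as 0 here.
    return cov / (sx * sy) if sx and sy else 0.0


def drop_correlated(features, threshold=0.9):
    """Greedy screening (assumed rule): keep a feature only if its absolute
    correlation with every already-kept feature is at most `threshold`."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature table; names and values are made up for the example.
features = {
    "hr_mean": [1, 2, 3, 4],
    "hr_max": [2, 4, 6, 8],   # perfectly correlated with hr_mean
    "spo2_min": [4, 1, 3, 2],
}
selected = drop_correlated(features)
```

In this toy table `hr_max` is an exact multiple of `hr_mean`, so the screening step discards it and keeps the other two features.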

A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring

EXPERT SYSTEMS WITH APPLICATIONS, 2017, Vol. 78, pp. 225-241

Authors: Xia, Yufei; Liu, Chuanzhe; Li, YuYing; Liu, Nana. Affiliations: China Univ Min & Technol, Sch Management, Xuzhou 221116, Jiangsu, Peoples R China; China Univ Min & Technol, Sch Foreign Studies, Xuzhou 221116, Jiangsu, Peoples R China

Credit scoring is an effective tool that helps banks make profitable decisions on granting loans. Ensemble methods, which can be divided by structure into parallel and sequential ensembles, have recently been developed in the credit scoring domain. These methods have proven their superiority in discriminating borrowers accurately. However, among the ensemble models, little consideration has been given to the following: (1) tuning the hyper-parameters of the base learner, although this is critical to a well-performing ensemble model; (2) building sequential models (i.e., boosting, as most studies have focused on developing the same or different algorithms in parallel); and (3) the comprehensibility of models. This paper proposes a sequential ensemble credit scoring model based on a variant of the gradient boosting machine (i.e., extreme gradient boosting (XGBoost)). The model mainly comprises three steps. First, data pre-processing is employed to scale the data and handle missing values. Second, a model-based feature selection system based on relative feature importance scores is used to remove redundant variables. Third, the hyper-parameters of XGBoost are adaptively tuned with Bayesian hyper-parameter optimization and used to train the model with the selected feature subset. Several hyper-parameter optimization methods and baseline classifiers are considered as reference points in the experiment. Results demonstrate that Bayesian hyper-parameter optimization performs better than random search, grid search, and manual search. Moreover, the proposed model outperforms baseline models on average over four evaluation measures: accuracy, error rate, the area under the curve (AUC) H measure (AUC-H measure), and Brier score. The proposed model also provides feature importance scores and a decision chart, which enhance the interpretability of the credit scoring model. (C) 2017 Elsevier Ltd. All rights reserved.

Keywords: Credit scoring; Boosted decision tree; Bayesian hyper-parameter optimization
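
To make the evaluation measures listed in the abstract above more concrete, here is a minimal standard-library Python sketch of accuracy, error rate, Brier score and a plain rank-based AUC for a binary default/non-default problem. The function names, the 0.5 decision threshold and the toy labels are assumptions for illustration; the H measure is not implemented, and nothing here reproduces the paper's experiments.

```python
def accuracy(y_true, p_default, threshold=0.5):
    """Share of cases where the thresholded probability matches the label."""
    preds = [1 if p >= threshold else 0 for p in p_default]
    return sum(1 for t, y in zip(y_true, preds) if t == y) / len(y_true)


def error_rate(y_true, p_default, threshold=0.5):
    """Complement of accuracy."""
    return 1.0 - accuracy(y_true, p_default, threshold)


def brier_score(y_true, p_default):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, p_default)) / len(y_true)


def auc(y_true, p_default):
    """Probability that a random positive is scored above a random negative
    (ties count as one half), computed by comparing all pairs."""
    pos = [p for t, p in zip(y_true, p_default) if t == 1]
    neg = [p for t, p in zip(y_true, p_default) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy labels (1 = default) and predicted default probabilities, made up here.
labels = [1, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.6]
report = {
    "accuracy": accuracy(labels, scores),
    "error_rate": error_rate(labels, scores),
    "brier": brier_score(labels, scores),
    "auc": auc(labels, scores),
}
```

Lower is better for the error rate and the Brier score, while higher is better for accuracy and AUC, which is why comparisons across several measures are usually reported side by side.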

A data-driven approach for optimizing the utilization of photovoltaic based water pumping systems

ENERGY SYSTEMS-OPTIMIZATION MODELING SIMULATION AND ECONOMIC ASPECTS, 2023, pp. 1-23

Author: Tomar, Anuradha. Affiliation: Netaji Subhas Univ Technol, Dept Instrumentat & Control Engn, Sect 3, Delhi 110078, India

A photovoltaic based water pumping system (PWPS) is a promising application, specifically for farmers and people living in remote or rural regions that may have limited or no access to the utility grid. However, the wider application of PWPS is limited by the inefficient utilization of installed photovoltaic (PV) capacity, which results in a low return on investment. Further, farmers need assistance in deciding the operational status of PWPS because of PV intermittency. Therefore, optimizing PV utilization based on farmers' irrigation and water pumping requirements is essential. In this paper, a data-driven methodology is proposed to optimize PWPS utilization and help farmers make appropriate operational decisions based on water pumping needs and available PV power. As a first step, a tree ensemble supervised learning-based PV power prediction model has been developed. To enhance the performance of the PV power prediction model, a Bayesian hyper-parameter optimization algorithm has been applied. In the second step, the PV power prediction for the upcoming days serves as input for deciding the PWPS operation, in coordination with the farmer's observations regarding water pumping needs. Based on the predicted PV power availability and the irrigation/water pumping needs, the reference signal for motor pump operation is estimated. To validate the performance of the proposed methodology, a case study has been performed that considers different operational scenarios by means of five use cases. A close match between the predicted and actual PV power generation has been observed. Better PV utilization and farm irrigation have been observed compared with a conventional PWPS. Further, long-term test validation is required to analyse the stability and robustness of the developed methodology, specifically for remote/rural regions.

Keywords: Bayesian hyper-parameter optimization; Data-driven photovoltaic (PV) water pumping systems; PV irrigation; PV prediction; Machine learning; Tree-based ensemble models

Optimized stacking, a new method for constructing ensemble surrogate models applied to DNAPL-contaminated aquifer remediation

JOURNAL OF CONTAMINANT HYDROLOGY, 2021, Vol. 243, p. 103914

Authors: Shams, Reza; Alimohammadi, Saeed; Yazdi, Jafar. Affiliation: Shahid Beheshti Univ, Civil Water & Environm Engn Fac, POB 16765-1719, Bahar Blvd, Tehran ***, Iran

Surfactant-enhanced aquifer remediation (SEAR) is an appropriate method for DNAPL-contaminated aquifer remediation; however, due to the high cost of the SEAR method, finding the optimal remediation scenario is usually essential. Embedding numerical simulation models of DNAPL remediation within the optimization routines is computationally expensive, and in this situation, using surrogate models instead of numerical models is a proper alternative. Ensemble methods are also utilized to enhance the accuracy of surrogate models, and in this study, the Stacking ensemble method was applied and compared with conventional methods. First, six machine learning methods were used as surrogate models, various feature scaling techniques were employed, and their impact on the models' performance was evaluated. Bagging and Boosting homogeneous ensemble methods were also used to improve the base models' accuracy. A total of six stand-alone surrogate models and 12 homogeneous ensemble models were used as the base input models of the Stacking ensemble model. Due to the large size of the Stacking model, the Bayesian hyper-parameter optimization method was used to find its optimal hyper-parameters. The results showed that the Bayesian hyper-parameter optimization method performed better than common methods such as random search and grid search. The artificial neural network model, whose input data was scaled by the power transformer method, had the best performance with a cross-validation RMSE of 0.065. The Boosting method increased the base models' accuracy more than other homogeneous methods, and the best Boosting model had a test RMSE of 0.039. The Stacking ensemble method significantly increased the base models' accuracy and performed better than other ensemble methods. The best ensemble surrogate model constructed with Stacking had a cross-validation RMSE of 0.016. Finally, a differential evolution optimization model was used by substituting the Stacking ensemble model with t

Keywords: DNAPL; Surfactant-enhanced aquifer remediation; Ensemble surrogate model; Stacking; Bayesian hyper-parameter optimization; Feature scaling
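
The abstract above compares models by cross-validation RMSE. As a hedged illustration of how such a figure is commonly computed, the Python sketch below pools held-out squared errors over k contiguous folds. The fold layout, the two toy learners (`fit_mean`, `fit_line`) and the synthetic data are assumptions; the paper's surrogate models, Stacking ensemble and data are not reproduced.

```python
from math import sqrt


def kfold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_rmse(fit, xs, ys, k=5):
    """RMSE over all held-out predictions; `fit(train_x, train_y)` returns a predictor."""
    squared_errors = []
    for fold in kfold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        predict = fit(train_x, train_y)
        squared_errors.extend((predict(xs[i]) - ys[i]) ** 2 for i in fold)
    return sqrt(sum(squared_errors) / len(squared_errors))


def fit_mean(train_x, train_y):
    """Baseline learner: always predict the training mean."""
    avg = sum(train_y) / len(train_y)
    return lambda x: avg


def fit_line(train_x, train_y):
    """One-variable least-squares line."""
    n = len(train_x)
    mx, my = sum(train_x) / n, sum(train_y) / n
    sxx = sum((x - mx) ** 2 for x in train_x)
    slope = sum((x - mx) * (y - my) for x, y in zip(train_x, train_y)) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)


# Synthetic, noise-free data on a straight line, made up for the example.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
rmse_line = cross_val_rmse(fit_line, xs, ys, k=5)
rmse_mean = cross_val_rmse(fit_mean, xs, ys, k=5)
```

On this synthetic line the least-squares learner has a cross-validation RMSE near zero while the mean baseline does not, which is the kind of gap such a comparison is meant to expose.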

An Improved LightGBM Algorithm for Online Fault Detection of Wind Turbine Gearboxes

ENERGIES, 2020, Vol. 13, No. 4, p. 807

Authors: Tang, Mingzhu; Zhao, Qi; Ding, Steven X.; Wu, Huawei; Li, Linlin; Long, Wen; Huang, Bin. Affiliations: Changsha Univ Sci & Technol, Sch Energy & Power Engn, Changsha 410114, Peoples R China; Univ Duisburg Essen, Inst Automat Control & Complex Syst AKS, D-47057 Duisburg, Germany; Hubei Univ Arts & Sci, Hubei Key Lab Power Syst Design & Test Elect Vehi, Xiangyang 441053, Peoples R China; Guizhou Univ Finance & Econ, Guizhou Key Lab Econ Syst Simulat, Guiyang 550004, Peoples R China; Univ South Australia, Sch Engn, Adelaide, SA 5095, Australia

It is widely accepted that conventional boosting algorithms are of low efficiency and accuracy in dealing with big data collected from wind turbine operations. To address this issue, this paper is devoted to the application of an adaptive LightGBM method for wind turbine fault detection. To this end, feature selection for fault detection is first achieved by using the maximum information coefficient to analyze the correlation among features in the supervisory control and data acquisition (SCADA) data of wind turbines. After that, a performance evaluation criterion is proposed for the improved LightGBM model to support fault detection. In this scheme, by embedding the confusion matrix as a performance indicator, an improved LightGBM fault detection approach is then developed. Based on the adaptive LightGBM fault detection model, a fault detection strategy for wind turbine gearboxes is investigated. To demonstrate the applications of the proposed algorithms and methods, a case study is conducted with a three-year SCADA dataset obtained from a wind farm sited in Southern China. Results indicate that the proposed approaches establish a fault detection framework for wind turbine systems with either a lower false alarm rate or a lower missing detection rate.

Keywords: fault diagnosis; maximum information coefficient; Bayesian hyper-parameter optimization; gradient boosting algorithm; LightGBM
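
The abstract above uses the confusion matrix as a performance indicator and reports false alarm and missing detection rates. The Python sketch below shows one common way to derive those two rates from binary labels. The definitions used (false alarms over all normal samples, missed faults over all fault samples), the function names and the toy labels are assumptions, since the abstract does not spell out the paper's exact formulas.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 means 'fault'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def false_alarm_rate(y_true, y_pred):
    """Assumed definition: share of normal samples flagged as faults, FP / (FP + TN)."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred)
    return fp / (fp + tn) if (fp + tn) else 0.0


def missing_detection_rate(y_true, y_pred):
    """Assumed definition: share of fault samples that were not flagged, FN / (FN + TP)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return fn / (fn + tp) if (fn + tp) else 0.0


# Toy labels, made up for the example: 1 = gearbox fault, 0 = normal operation.
actual = [1, 1, 0, 0, 0, 1]
flagged = [1, 0, 0, 1, 0, 1]
counts = confusion_counts(actual, flagged)
far = false_alarm_rate(actual, flagged)
mdr = missing_detection_rate(actual, flagged)
```

The two rates usually trade off against each other as the decision threshold moves, which is consistent with the abstract reporting an improvement in one or the other rather than both at once.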
