Due to its outstanding performance in both strength and durability, ascertaining the compressive strength of Ultra-High-Performance Concrete (UHPC) holds critical significance. Recent trends reveal a shift towards using various Machine Learning (ML) and Deep Learning (DL) techniques to predict UHPC's compressive strength from multi-study datasets, though not all models effectively handle these diverse and heterogeneous datasets. To address this challenge, this study proposes a Natural Gradient Boosting (NGBoost) method optimized with Dynamic sequential model-based optimization (DSMBO), leveraging a probabilistic approach to manage heterogeneous datasets. A dataset of 920 instances of UHPC compressive strength from various studies was compiled and used in this research, considering 14 input features. A thorough comparative and sensitivity analysis was undertaken to evaluate NGBoost's performance with different base models and sequential model-based optimization techniques, culminating in the identification of the best-performing model, which was then benchmarked against a spectrum of ML and DL models. The findings reveal that the proposed model significantly outperforms the others, yielding an R2 of 0.9748 and an RMSE of 3.77 MPa. More importantly, this paper provides detailed insights into the contribution of each feature to model predictions through SHAP (SHapley Additive exPlanations) values to guide feature selection by users. Additionally, a user-friendly web-based interface was developed to promote the practical use of the proposed model in predicting UHPC's compressive strength.
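The reported evaluation metrics can be reproduced with a short helper; this is a generic sketch of R2 and RMSE, not code from the study (the sample strengths below are invented for illustration):

```python
import math

def r2_rmse(y_true, y_pred):
    """Return (R^2, RMSE) for a set of regression predictions."""
    n = len(y_true)
    mean = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)

# Hypothetical compressive strengths in MPa (not from the paper's dataset)
r2, rmse = r2_rmse([100, 120, 140, 160], [102, 118, 143, 158])
```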
ISBN:
(Print) 9783030732790; 9783030732806
Natural language processing (NLP) aims to analyze large amounts of natural language data. NLP systems process textual data through a set of processing elements connected sequentially into a data pipeline. Several candidate pipelines exist for a given set of textual data, each yielding a different model accuracy. Instead of trying all possible pipelines, as in random search or grid search, we use Bayesian optimization to search the space of hyperparameters. In this study, we propose a data pipeline selection method for NLP using sequential model-based optimization (SMBO). We implement SMBO for the NLP data pipeline using the Hyperopt (hyperparameter optimization) library, with the Tree of Parzen Estimators (TPE) and Adaptive Tree of Parzen Estimators (A-TPE) surrogate models and the expected improvement (EI) acquisition function.
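As a rough, pure-Python illustration of the TPE idea used here (a one-dimensional toy, not the actual Hyperopt implementation; the bandwidth, bounds, and candidate counts are arbitrary choices):

```python
import math
import random

def kde(points, x, bw=0.5):
    """Parzen estimate: average of Gaussian kernels centered at observed points."""
    norm = bw * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - p) / bw) ** 2) for p in points) / (len(points) * norm)

def tpe_suggest(history, lo=-3.0, hi=3.0, gamma=0.25, n_candidates=100):
    """Propose the next point by maximizing l(x)/g(x): l models the best
    gamma-fraction of past observations, g models the rest."""
    hist = sorted(history, key=lambda t: t[1])        # sort by observed loss
    n_good = max(1, int(gamma * len(hist)))
    good = [x for x, _ in hist[:n_good]]
    bad = [x for x, _ in hist[n_good:]] or good
    cands = [random.uniform(lo, hi) for _ in range(n_candidates)]
    return max(cands, key=lambda x: kde(good, x) / (kde(bad, x) + 1e-12))

def minimize(loss, n_init=10, n_iter=30, seed=0):
    random.seed(seed)
    history = []
    for _ in range(n_init):                           # random initial design
        x = random.uniform(-3.0, 3.0)
        history.append((x, loss(x)))
    for _ in range(n_iter):                           # SMBO loop
        x = tpe_suggest(history)
        history.append((x, loss(x)))
    return min(history, key=lambda t: t[1])[0]
```

For example, `minimize(lambda x: (x - 1.0) ** 2)` concentrates evaluations near the minimizer at 1.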
New methods to perform time series classification arise frequently, and multiple state-of-the-art approaches achieve high performance on benchmark datasets with respect to accuracy and computation time. However, the modeling procedures often do not include proper validation, relying instead on either an external test dataset or single-level cross-validation. ATSC-NEX is an automated procedure that employs sequential model-based optimization together with nested cross-validation to build an accurate and properly validated time series classification model. It aims to find an optimal pipeline configuration, including the selection of input type and settings as well as model type and hyperparameters. The results of a case study, in which a model for identifying diesel engine types is developed, show that the algorithm can efficiently find a well-performing pipeline configuration. A comparison between ATSC-NEX and several state-of-the-art methods on benchmark datasets shows that similar accuracy can be achieved.
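The nested cross-validation scheme relied on here reduces to index bookkeeping; a minimal, library-free sketch might look like this (fold assignment by striding is an arbitrary choice):

```python
def kfold(indices, k):
    """Split a list of indices into k roughly equal folds by striding."""
    return [indices[i::k] for i in range(k)]

def nested_cv_splits(n, outer_k=5, inner_k=3):
    """Return (outer_train, outer_test, inner_splits) triples.
    Hyperparameters are tuned only on inner_splits; each outer_test fold is
    touched once, for the final unbiased performance estimate."""
    idx = list(range(n))
    out = []
    for test in kfold(idx, outer_k):
        train = [i for i in idx if i not in set(test)]
        inner = [([i for i in train if i not in set(val)], val)
                 for val in kfold(train, inner_k)]
        out.append((train, test, inner))
    return out
```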
ISBN:
(Print) 9781450371285
Sequential model-based optimization (SMBO) approaches are algorithms for solving problems that require computationally or otherwise expensive function evaluations. The key design principle of SMBO is the substitution of the true objective function by a surrogate, which is used to propose the point(s) to be evaluated next. SMBO algorithms are intrinsically modular, leaving the user with many important design choices. Significant research effort goes into understanding which settings perform best for which types of problems. Most works, however, focus on the choice of the model, the acquisition function, and the strategy used to optimize the latter. The choice of the initial sampling strategy receives much less attention and, not surprisingly, quite diverging recommendations can be found in the literature. We analyze in this work how the size and the distribution of the initial sample influence the overall quality of the efficient global optimization (EGO) algorithm, a well-known SMBO approach. While, overall, small initial budgets using Halton sampling seem preferable, we also observe that the performance landscape is rather unstructured. We furthermore identify several situations in which EGO performs unfavorably compared with random sampling. Both observations indicate that an adaptive SMBO design could be beneficial, making SMBO an interesting test-bed for automated algorithm design.
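Two of the building blocks discussed here, Halton sampling for the initial design and the expected improvement criterion, are compact enough to sketch directly (a generic formulation, not the authors' code):

```python
import math

def halton(i, base):
    """i-th element (1-indexed) of the Halton sequence in the given base."""
    f, r = 1.0, 0.0
    while i > 0:
        f /= base
        r += f * (i % base)
        i //= base
    return r

def expected_improvement(mu, sigma, best):
    """EI of a candidate with surrogate mean mu and std sigma,
    for minimization against the incumbent value `best`."""
    if sigma <= 0.0:
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2)))       # standard normal CDF
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi) # standard normal PDF
    return (best - mu) * cdf + sigma * pdf

# A 2-D initial design: pair Halton sequences in coprime bases 2 and 3
design = [(halton(i, 2), halton(i, 3)) for i in range(1, 6)]
```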
Quantifying the wettability of shales is important for reservoir exploration and evaluation, as well as for CO2 storage. Conventional experimental measurements are time-consuming and costly, while novel machine learning (ML) algorithms struggle to predict shale wettability with both accuracy and generalizability. Therefore, this study introduces an advanced shale wettability prediction framework that fuses an ensemble learning algorithm with automatic hyperparameter optimization schemes (AHOS). Specifically, we first collected data from various gas geo-storage conditions reported in the literature. Then, an ensemble learning algorithm was used to extract the complex nonlinear relationships between the influential variables and shale wettability. During training, multiple optimization schemes, including random search (RS), Bayesian optimization (BO), and sequential model-based optimization (SMBO), were used and compared to automatically find the optimal hyperparameters of the wettability prediction model. The results show that AHOS can automatically and efficiently obtain the optimal hyperparameters, thus improving the model's generalization. In addition, the prediction model based on the AHOS process can accurately predict shale wettability. Through sensitivity analysis, we also revealed that TOC and gas type are the top two most important influencing factors, which provides insights for CO2 sequestration. Moreover, we validated that the CatBoost model outperforms other tree-based models in predicting shale wettability and that the SMBO method provides a more effective result. In summary, the proposed shale wettability prediction framework is promising and efficient; it contributes to the exploration, evaluation, and development of shale oil reservoirs and further benefits CO2 storage.
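The random search (RS) baseline among the compared schemes is straightforward to sketch; the toy objective and search space below are hypothetical stand-ins for the actual wettability model:

```python
import random

def random_search(objective, sampler, n_iter=50, seed=42):
    """Evaluate n_iter randomly sampled configurations and keep the best
    (lowest objective). Serves as the baseline that model-based tuning
    schemes such as BO and SMBO are compared against."""
    rng = random.Random(seed)
    best_cfg, best_val = None, float("inf")
    for _ in range(n_iter):
        cfg = sampler(rng)
        val = objective(cfg)
        if val < best_val:
            best_cfg, best_val = cfg, val
    return best_cfg, best_val

# Hypothetical two-hyperparameter tuning problem (invented, not from the study)
cfg, val = random_search(
    lambda c: (c["depth"] - 6) ** 2 + (c["lr"] - 0.1) ** 2,
    lambda rng: {"depth": rng.randint(2, 12), "lr": rng.uniform(0.01, 0.5)},
)
```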
Authors:
Bo, Yin; Liu, Quansheng; Huang, Xing; Pan, Yucong
Wuhan Univ, Key Lab Geotech & Struct Engn Safety Hubei Prov, Sch Civil Engn, Wuhan 430072, Hubei, Peoples R China
Wuhan Univ, State Key Lab Water Resources & Hydropower Engn S, Wuhan 430072, Peoples R China
Chinese Acad Sci, Inst Rock & Soil Mech, State Key Lab Geomech & Geotech Engn, Wuhan 430071, Hubei, Peoples R China
Timely perception of changing geological conditions is crucial for safe and efficient TBM tunneling. Precisely detecting or predicting the rock mass qualities ahead of the tunnel face can forewarn of geological disasters (e.g., bursting or squeezing behaviors of the surrounding rock mass). A novel hybridized model based on CatBoost and sequential model-based optimization (SMBO) is proposed in this study. Firstly, a database incorporating 4464 samples acquired from the Songhua River Water Diversion Project is established using the capping method. Drawing on SMBO's different surrogate types (GP, RF, and GBRT) and performance validation, comparisons among the three variants of SMBO-CatBoost and six other hybridized models (SMBO-XGBoost, SMBO-AdaBoost, SMBO-RF, SMBO-SVM, SMBO-KNN, and SMBO-LR) are carried out in turn. As a result, in terms of optimization speed, performance, and sensitivity to poor geological conditions, SMBO(RF)-CatBoost is the most suitable model for rock mass class prediction; furthermore, it achieves the best performance (ACC = 0.9207 and F1 = 0.9178) among the seven hybridized models. Next, scientific feature selection methods (i.e., filter and embedded) are used to reduce the model's complexity (i.e., feature dimensions) step by step to increase its on-site practicality. With the ten influential features thus determined, the model's ACC and F1 remain greater than 0.85, declining by only 5.4% and 5.6%, respectively, relative to the original performance. Subsequently, to explore the importance of the first-hand features and the second-hand (i.e., composite) features, a new method for more accurately calculating the rock mass boreability indices (regarded as second-hand features) is proposed based on big data sampled at a relatively high frequency of 1 Hz; this newly proposed method makes these indices more meaningful under complex geological conditions. With the SHAP technique, the modified torque penetr
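A filter-style feature selection step of the kind mentioned above can be sketched as correlation ranking; this is a generic illustration, not the paper's procedure (the toy feature matrix is invented):

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def filter_select(X, y, k):
    """Rank feature columns by |Pearson correlation| with the target and
    keep the indices of the top k (a simple filter-method selection step)."""
    n_features = len(X[0])
    cols = [[row[j] for row in X] for j in range(n_features)]
    scores = [(j, abs(pearson(col, y))) for j, col in enumerate(cols)]
    scores.sort(key=lambda t: -t[1])
    return [j for j, _ in scores[:k]]
```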
The dynamic nature of renewable energy production and customer demand necessitates a flexible approach to designing Feed-in Tariff (FiT) schemes to ensure equity and fairness. This research presents a comprehensive data-driven framework for determining FiT rates by analyzing trends in demand, renewable energy generation, and temperature over time. The proposed method calculates FiT rates that adapt dynamically to evolving scenarios by incorporating both historical and projected trends. To optimize FiT values and offer affordable tariffs beneficial to both energy providers and customers, the proposed approach employs sequential model-based optimization (SMBO). Case studies using real-world microgrid data showcase the model's adaptability and confirm its reliability by ensuring that the optimized FiT values remain within Australian government-set tariff limits. The SMBO method can decrease computational time by as much as 90%, achieving a Root Mean Square Error of 2.839. Additionally, the dynamic FiT model enhances financial sustainability by shortening the payback period for various prosumers by 17%-22% compared to a fixed FiT. The dynamic FiT adjusts rates based on historical and projected trends, incentivizing prosumers to export energy during peak demand. This method supports sustainable energy usage and offers a flexible, efficient pricing mechanism that adapts to the changing energy landscape.
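The payback-period improvement can be illustrated with simple arithmetic; the dollar figures below are hypothetical, chosen only so that the reduction lands in the reported 17%-22% range:

```python
def payback_years(capital_cost, annual_cashflow):
    """Simple (undiscounted) payback period in years."""
    return capital_cost / annual_cashflow

# Hypothetical prosumer: $10,000 system; a dynamic FiT that raises annual
# export revenue from $1,000 to $1,250 shortens payback from 10 to 8 years,
# i.e., a 20% reduction.
fixed = payback_years(10_000, 1_000)
dynamic = payback_years(10_000, 1_250)
reduction = (fixed - dynamic) / fixed
```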
Adders are the primary components in the data-path logic of a microprocessor, and thus adder design has always been a critical issue in the very large-scale integration (VLSI) industry. However, it is infeasible for designers to obtain an optimal adder architecture by exhaustively running the EDA flow due to the extremely large design space. Prior work has proposed machine learning-based frameworks to explore the design space. Nevertheless, they yield suboptimal results due to a two-stage learning flow and feature representations of prefix adder structures that are neither efficient nor effective. In this article, we first integrate a variational graph autoencoder and a neural process (NP) into an end-to-end, multibranch framework, termed the graph neural process. The former performs automatic feature learning of prefix adder structures, while the latter is designed as an alternative to the Gaussian process. Then, we propose a sequential optimization framework with the graph NP as the surrogate model to explore Pareto-optimal prefix adder structures that trade off Quality-of-Result (QoR) metrics such as power, area, and delay. The experimental results show that, compared with state-of-the-art methodologies, our framework can achieve a much better Pareto frontier in multiple QoR metric spaces with fewer design-flow evaluations.
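Extracting a Pareto frontier over QoR metrics (all minimized, e.g. power, area, delay) reduces to a dominance check; a minimal generic sketch:

```python
def dominates(a, b):
    """a dominates b if a is no worse in every objective and strictly
    better in at least one (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the non-dominated points (the Pareto frontier)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For example, over hypothetical (power, delay) pairs, `pareto_front([(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)])` keeps only the non-dominated designs.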
Algorithm selection as well as hyperparameter optimization are tedious tasks that have to be dealt with when applying machine learning to real-world problems. Sequential model-based optimization (SMBO), based on so-called "surrogate models", has been employed to allow for faster and more direct hyperparameter optimization. A surrogate model is a machine learning regression model trained on meta-level instances to predict the performance of an algorithm on a specific data set given the hyperparameter settings and data set descriptors. Gaussian processes, for example, make good surrogate models as they provide probability distributions over labels. Recent work on SMBO also includes meta-data, i.e. observed hyperparameter performances on other data sets, in the process of hyperparameter optimization. This can, for example, be accomplished by learning transfer surrogate models on all available instances of meta-knowledge; however, the increasing amount of meta-information can make Gaussian processes infeasible, as they require the inversion of a large covariance matrix that grows with the number of instances. Consequently, instead of learning a joint surrogate model on all of the meta-data, we propose to learn individual surrogate models on the observations of each data set and then combine all surrogates into a joint one using ensembling techniques. The final surrogate is a weighted sum of all data set-specific surrogates plus an additional surrogate that is learned solely on the target observations. Within our framework, any surrogate model can be used; we explore Gaussian processes in this scenario. We present two different strategies for finding the weights used in the ensemble: the first is based on a probabilistic product-of-experts approach, and the second is based on kernel regression. Additionally, we extend the framework to directly estimate the acquisition function in the same setting, using a novel technique which we name the "transfer a
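The weighted-sum combination of per-dataset surrogates can be sketched generically, with each surrogate reduced to a plain callable (the toy surrogates and weights below are invented, and the weight-finding strategies from the abstract are not reproduced):

```python
def combine_surrogates(surrogates, weights):
    """Build a joint surrogate as a normalized weighted sum of per-dataset
    surrogates, each a callable x -> predicted performance. The last entry
    would typically be the surrogate trained on the target observations."""
    total = sum(weights)
    return lambda x: sum(w * s(x) for s, w in zip(surrogates, weights)) / total

# Two toy per-dataset surrogates, the second weighted three times as heavily
joint = combine_surrogates([lambda x: x, lambda x: 2 * x], [1.0, 3.0])
```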
The performance of many machine learning algorithms depends crucially on the hyperparameter settings, especially in Deep Learning. Manually tuning the hyperparameters is laborious and time-consuming. To address this issue, Bayesian optimization (BO) methods and their extensions have been proposed to optimize the hyperparameters automatically. However, they still suffer from high computational expense when applied to deep generative models (DGMs) due to their black-box treatment of the objective function. This paper provides a new hyperparameter optimization procedure for the pre-training phase of DGMs, where we avoid combining all layers into one black-box function by taking advantage of the layer-by-layer learning strategy. Following this procedure, we are able to optimize multiple hyperparameters adaptively using a Gaussian process. In contrast to traditional BO methods, which mainly focus on supervised models, the pre-training procedure is unsupervised, so no validation error is available. To alleviate this problem, this paper proposes a new holdout loss, the free energy gap, which accounts for both model fitting and over-fitting. The empirical evaluations demonstrate that our method not only speeds up the process of hyperparameter optimization, but also significantly improves the performance of DGMs in both supervised and unsupervised learning tasks. (C) 2017 Elsevier B.V. All rights reserved.
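The layer-by-layer strategy can be sketched as a greedy loop over per-layer hyperparameter spaces; for illustration, the Gaussian-process model is replaced by exhaustive scoring of a tiny discrete space (toy values, not from the paper):

```python
def layerwise_tune(layer_spaces, score):
    """Greedily pick hyperparameters one layer at a time: layer k is tuned
    with layers 0..k-1 already frozen, instead of searching the joint space
    of all layers at once (lower is better for `score`)."""
    chosen = []
    for space in layer_spaces:
        best = min(space, key=lambda h: score(chosen + [h]))
        chosen.append(best)
    return chosen

# Toy two-layer problem: pick per-layer values whose sum approaches 22
result = layerwise_tune([[1, 2, 3], [10, 20]], lambda cfg: abs(sum(cfg) - 22))
```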