In order to improve the accuracy of semiconductor wafer virtual metrology and overcome the physical metrology delay of the wafer acceptance test, a virtual physical vapor deposition (PVD) metrology method based on a combination of tree-based ensemble models is proposed to conduct online virtual metrology of semiconductor wafer electrical parameters, using hyperparameter optimization to tune the model and raise real-time alarms on process deviation. The combined model brings together Bagging, Boosting, and Stacking techniques. First, four types of base learners (Random Forest, Extra-Trees, XGBoost, and LightGBM) perform preliminary virtual metrology on the wafer PVD process; their predictions are then assembled into a meta-feature vector that serves as the input of a LightGBM meta-learner for further virtual metrology. Sequential model-based optimization is used to improve the accuracy of virtual metrology: the initial hyperparameters are drawn by random sampling, the combined model is approximated by a tree-structured Parzen estimator (TPE) surrogate, and recommended hyperparameters are obtained by maximizing Expected Improvement (EI), yielding the optimized combined model. Finally, the superiority of the proposed method is verified by comparing its results with common virtual metrology methods on the PVD process. The experiments show that resistivity metrology using the combination of tree-based ensemble models on the PVD process is significantly better than LASSO regression, partial least squares regression (PLSR), support vector regression (SVR), Gaussian process regression (GPR), and artificial neural network (ANN) regression. (C) 2020 ISA. Published by Elsevier Ltd. All rights reserved.
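A minimal sketch of the stacking arrangement the abstract describes (the data, split sizes, and hyperparameter values are illustrative assumptions, not the paper's): four tree ensembles produce out-of-fold predictions that become the meta-feature vector for a LightGBM meta-learner.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, ExtraTreesRegressor
from sklearn.model_selection import cross_val_predict
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor

X = np.random.rand(500, 20)   # stand-in for PVD process features
y = np.random.rand(500)       # stand-in for measured resistivity

base_learners = [
    RandomForestRegressor(n_estimators=200, random_state=0),
    ExtraTreesRegressor(n_estimators=200, random_state=0),
    XGBRegressor(n_estimators=200, random_state=0),
    LGBMRegressor(n_estimators=200, random_state=0),
]

# Out-of-fold predictions of the four base learners form the meta-feature
# vector, so the meta-learner never sees leaky in-sample predictions.
meta_features = np.column_stack([
    cross_val_predict(m, X, y, cv=5) for m in base_learners
])

meta_learner = LGBMRegressor(n_estimators=100, random_state=0)
meta_learner.fit(meta_features, y)
```

The paper's TPE/EI hyperparameter search would sit on top of this loop, tuning the base and meta learners; that outer loop is omitted here.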
Training agents over sequences of tasks is often employed in deep reinforcement learning to let agents progress more quickly towards better behaviours. This problem, known as curriculum learning, has mainly been tackled in the literature by numerical methods based on enumeration strategies, which, however, can handle only small-size problems. In this work, we define a new optimization perspective on the curriculum learning problem with the aim of developing efficient solution methods for complex reinforcement learning tasks. Specifically, we show how the curriculum learning problem can be viewed as an optimization problem with a nonsmooth, nonconvex objective function and an integer feasible region. We reformulate it by defining a grey-box function that includes a suitable scheduling problem. Numerical results on a benchmark environment from the reinforcement learning community show the effectiveness of the proposed approaches in reaching better performance, also on large problems.
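To make the integer-optimization view concrete, here is a toy sketch (the task set, the stand-in objective, and the use of plain random sampling are our assumptions, not the paper's grey-box method): a curriculum is an integer vector of task indices, and the objective is the final return after training through that sequence.

```python
import random

TASKS = [0, 1, 2, 3]   # hypothetical task ids

def train_through(curriculum):
    """Toy stand-in for the expensive grey-box: train an agent through the
    task sequence and return final performance on the target task."""
    return -abs(sum(curriculum) - 5) + random.random()

def random_search(n_trials=20, length=3):
    """Baseline search over the integer feasible region (curricula)."""
    best_score, best_curriculum = float("-inf"), None
    for _ in range(n_trials):
        curriculum = random.choices(TASKS, k=length)  # an integer feasible point
        score = train_through(curriculum)
        if score > best_score:
            best_score, best_curriculum = score, curriculum
    return best_curriculum, best_score

print(random_search())
```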
BayesOpt is a library with state-of-the-art Bayesian optimization methods for solving nonlinear optimization, stochastic bandit, and sequential experimental design problems. Bayesian optimization is characterized by being sample efficient, as it builds a posterior distribution to capture the evidence and prior knowledge about the target function. Built in standard C++, the library is extremely efficient while being portable and flexible. It includes a common interface for C, C++, Python, Matlab and Octave.
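For readers who want the shape of such a loop without committing to BayesOpt's own interface (which is not reproduced here), this sketch runs the same kind of sample-efficient search with scikit-optimize; the objective is an arbitrary toy function.

```python
# Bayesian optimization in a few lines with scikit-optimize (not BayesOpt's
# API): a GP posterior over the objective guides where to evaluate next.
from skopt import gp_minimize

def objective(x):
    """Toy expensive black-box function of one variable."""
    return (x[0] - 0.3) ** 2 + 0.1

result = gp_minimize(objective, [(-1.0, 1.0)], n_calls=20, random_state=0)
print(result.x, result.fun)   # best input found and its objective value
```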
ISBN (print): 9783319235257; 9783319235240
In machine learning, hyperparameter optimization is a challenging task that is usually approached by experienced practitioners or in a computationally expensive brute-force manner such as grid search. Therefore, recent research proposes to use observed hyperparameter performance on already solved problems (i.e. data sets) in order to speed up the search for promising hyperparameter configurations in the sequential model-based optimization framework. In this paper, we propose multilayer perceptrons as surrogate models, as they are able to model highly nonlinear hyperparameter response surfaces. However, since interactions of hyperparameters, data sets and meta-features are only implicitly learned in the subsequent layers, we improve the performance of multilayer perceptrons by means of an explicit factorization of the interaction weights and call the resulting model a factorized multilayer perceptron. Additionally, we evaluate different ways of obtaining predictive uncertainty, which is a key ingredient for a decent trade-off between exploration and exploitation. Our experimental results on two public meta data sets demonstrate the efficiency of our approach compared to a variety of published baselines. For reproduction purposes, we make our data sets and all the program code publicly available on our supplementary webpage.
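The "explicit factorization of the interaction weights" can be sketched with the classic factorization-machine identity; the dimensions and the way this term would feed the hidden layers are our assumptions, not taken from the paper.

```python
import numpy as np

def factorized_first_layer(x, w0, w, V):
    """FM-style score for input x (hyperparameters plus meta-features):
    linear terms plus all pairwise interactions through factorized weights
    V (d x k), computed in O(d*k) via
    0.5 * sum_f ((V^T x)_f^2 - ((V^2)^T x^2)_f)."""
    xv = x @ V                     # (k,) mixed factor activations
    x2v2 = (x ** 2) @ (V ** 2)     # (k,) self-interaction correction
    return w0 + w @ x + 0.5 * np.sum(xv ** 2 - x2v2)

d, k = 10, 4
rng = np.random.default_rng(0)
x = rng.random(d)
print(factorized_first_layer(x, 0.0, rng.random(d), rng.random((d, k))))
```

Roughly speaking, the factorized MLP replaces the standard first layer with activations of this form, and the remaining layers stay ordinary.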
ISBN (digital): 9783031080111
ISBN (print): 9783031080111; 9783031080104
Many advanced solving algorithms for constraint programming problems are highly configurable. The research area of algorithm configuration investigates ways of automatically configuring these solvers in the best manner possible. In this paper, we specifically focus on algorithm configuration in which the objective is to decrease the time it takes the solver to find an optimal solution. In this setting, adaptive capping is a popular technique which reduces the overall runtime of the search for good configurations by adaptively setting the solver's timeout to the best runtime found so far. Additionally, sequential model-based optimization (SMBO), in which one iteratively learns a surrogate model that can predict the runtime of unseen configurations, has proven to be a successful paradigm. Unfortunately, adaptive capping and SMBO have thus far remained incompatible: with adaptive capping, one cannot observe the true runtime of runs that time out, precluding the typical use of SMBO. To marry adaptive capping and SMBO, we instead use SMBO to model the probability that a configuration will improve on the best runtime achieved so far, for which we propose several decomposed models. These models also allow defining prior probabilities for each hyperparameter. The experimental results show that our DeCaprio method speeds up hyperparameter search compared to random search and the seminal adaptive capping approach of ParamILS.
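The key move, modeling the probability of improvement instead of the runtime, can be sketched as follows (the random-forest surrogate, the candidate sampling, and the toy run_with_cap are our illustrative assumptions, not the paper's decomposed models):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def sample_config():
    return list(np.random.rand(3))   # three numeric hyperparameters (toy)

def run_with_cap(config, cap):
    """Toy stand-in for running the solver with timeout=cap; a real run
    would return the measured runtime, or None when the cap is hit."""
    runtime = 1.0 + sum((c - 0.5) ** 2 for c in config)
    return runtime if runtime < cap else None

best, X, y = np.inf, [], []
for _ in range(50):
    if len(X) > 10 and len(set(y)) > 1:
        clf = RandomForestClassifier(random_state=0).fit(X, y)
        cands = [sample_config() for _ in range(200)]
        config = cands[int(np.argmax(clf.predict_proba(cands)[:, 1]))]
    else:
        config = sample_config()              # bootstrap with random configs
    runtime = run_with_cap(config, cap=best)  # adaptive capping at incumbent
    improved = runtime is not None and runtime < best
    X.append(config); y.append(int(improved))  # timed-out runs are negatives
    if improved:
        best = runtime
print("best runtime found:", best)
```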
ISBN (print): 9781509001637
Recent work has demonstrated that hyperparameter optimization within the sequential model-based optimization (SMBO) framework is generally possible. This approach replaces the expensive-to-evaluate function that maps hyperparameters to the performance of a learned model on validation data with a surrogate model that is much cheaper to evaluate. The current state of the art in hyperparameter optimization learns these surrogate models across a variety of solved data sets on which a grid search has already been employed. In this way, surrogate models are learned across data sets and are thus able to generalize better. However, meta-features that describe characteristics of a data set are usually needed in order for the surrogate model to differentiate between the same hyperparameter configurations on different data sets. A closely related research area focuses on model choice, i.e. picking the right model for a given task, which is also a problem that many practitioners face in machine learning. In this paper, we aim to solve both of these problems with a unified surrogate model that learns across different data sets, different classifiers and their respective hyperparameters. We employ factorized multilayer perceptrons, a surrogate model that consists of a multilayer perceptron architecture but offers the prediction of a factorization machine in the first layer. In this way, data sets, models and hyperparameters are represented in a joint lower-dimensional latent feature space. Experiments on a publicly available meta data set containing 59 individual data sets and 19 prediction models demonstrate the efficiency of our approach.
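One way to picture such a unified surrogate is the sketch below, under our own assumptions: plain embeddings concatenated into an MLP, whereas the paper places a factorization-machine prediction in the first layer; all layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class UnifiedSurrogate(nn.Module):
    """Embed dataset id, model id, and hyperparameters in one latent space,
    so a single network predicts performance across data sets and models."""
    def __init__(self, n_datasets, n_models, hp_dim, k=8):
        super().__init__()
        self.ds_emb = nn.Embedding(n_datasets, k)
        self.model_emb = nn.Embedding(n_models, k)
        self.hp_proj = nn.Linear(hp_dim, k)
        self.mlp = nn.Sequential(nn.Linear(3 * k, 32), nn.ReLU(),
                                 nn.Linear(32, 1))

    def forward(self, ds_id, model_id, hp):
        z = torch.cat([self.ds_emb(ds_id), self.model_emb(model_id),
                       self.hp_proj(hp)], dim=-1)
        return self.mlp(z).squeeze(-1)

# Usage with the meta data set sizes from the abstract (hp_dim is assumed):
model = UnifiedSurrogate(n_datasets=59, n_models=19, hp_dim=5)
pred = model(torch.tensor([3]), torch.tensor([7]), torch.rand(1, 5))
```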
The random forest (RF) algorithm has several hyperparameters that have to be set by the user, for example, the number of observations drawn randomly for each tree and whether they are drawn with or without replacement, the number of variables drawn randomly for each split, the splitting rule, the minimum number of samples that a node must contain, and the number of trees. In this paper, we first provide a literature review on the parameters' influence on the prediction performance and on variable importance measures. It is well known that in most cases RF works reasonably well with the default values of the hyperparameters specified in software packages. Nevertheless, tuning the hyperparameters can improve the performance of RF. In the second part of this paper, after presenting a brief overview of tuning strategies, we demonstrate the application of one of the most established tuning strategies, model-based optimization (MBO). To make it easier to use, we provide the tuneRanger R package that tunes RF with MBO automatically. In a benchmark study on several datasets, we compare the prediction performance and runtime of tuneRanger with other tuning implementations in R and with RF using default hyperparameters. This article is categorized under: Algorithmic Development > Biological Data Mining; Algorithmic Development > Statistics; Algorithmic Development > Hierarchies and Trees; Technologies > Machine Learning.
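An illustrative Python analogue of what tuneRanger does in R (a sketch, not the package itself): tune the hyperparameters the abstract lists, such as the sample fraction, the number of variables per split, and the node size, with model-based optimization via scikit-optimize.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from skopt import gp_minimize
from skopt.space import Integer, Real

X, y = load_diabetes(return_X_y=True)
space = [
    Integer(1, X.shape[1], name="max_features"),  # variables per split (mtry)
    Integer(2, 50, name="min_samples_leaf"),      # minimum node size
    Real(0.3, 1.0, name="max_samples"),           # observations drawn per tree
]

def objective(params):
    mtry, leaf, frac = params
    rf = RandomForestRegressor(n_estimators=200, max_features=int(mtry),
                               min_samples_leaf=int(leaf), max_samples=frac,
                               bootstrap=True, random_state=0)
    return -cross_val_score(rf, X, y, cv=3, scoring="r2").mean()

result = gp_minimize(objective, space, n_calls=30, random_state=0)
print("best hyperparameters:", result.x)
```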
Although machine learning models have been employed to predict the compressive strength (CS) of cement-based mortar containing metakaolin, it is difficult to understand how they work due to their "black-box" nature. In order to explain the involved mechanism, a Categorical Gradient Boosting (CatBoost) model with feature importance, feature interaction, partial dependence plots (PDP) and SHapley Additive exPlanations (SHAP) is proposed in this paper. A dataset consisting of 424 samples with six input variables is used to build the CatBoost model, which achieves optimal performance by tuning a set of seven hyperparameters using sequential model-based optimization. Five quantitative measures (R², MAE, RMSE, a10-index, a20-index) are employed to evaluate the accuracy, and the obtained results are superior to the previous study. Feature importance shows that the most significant input variable for the CS is the water-to-binder ratio, followed by the age of the specimen and the cement grade. The strongest feature interaction is between the water-to-binder ratio and metakaolin. A comprehensive parametric study is carried out via SHAP and PDP to investigate the effects of all input variables on the CS of cement-based mortar.
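A minimal sketch of the pipeline the abstract describes (the feature names and data below are hypothetical placeholders, not the study's dataset): fit a CatBoost regressor on mix-design features, then explain it with SHAP values.

```python
import numpy as np
import shap
from catboost import CatBoostRegressor

rng = np.random.default_rng(0)
features = ["w_b_ratio", "metakaolin", "age_days",
            "cement_grade", "sand", "superplasticizer"]
X = rng.random((424, len(features)))
y = 60 - 40 * X[:, 0] + 5 * X[:, 2] + rng.normal(0, 2, 424)  # toy CS target

model = CatBoostRegressor(depth=6, learning_rate=0.1, iterations=500,
                          verbose=False)
model.fit(X, y)

explainer = shap.TreeExplainer(model)   # exact SHAP values for tree ensembles
shap_values = explainer.shap_values(X)
shap.summary_plot(shap_values, X, feature_names=features)
```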
Recurrent reinforcement learning (RRL) techniques have been used to optimize asset trading systems and have achieved outstanding results. However, the majority of previous work has been dedicated to systems with discrete action spaces. To address the challenge of continuous action and multi-dimensional state spaces, we propose the so-called Stacked Deep Dynamic Recurrent Reinforcement Learning (SDDRRL) architecture to construct a real-time optimal portfolio. The algorithm captures up-to-date market conditions and rebalances the portfolio accordingly. Under this general vision, the Sharpe ratio, one of the most widely accepted measures of risk-adjusted return, has been used as the performance metric. Additionally, the performance of most machine learning algorithms depends highly on their hyperparameter settings. Therefore, we equipped SDDRRL with the ability to find the best possible architecture topology using an automated Gaussian Process (GP) with Expected Improvement (EI) as an acquisition function. This allows us to select the best architecture that maximizes the total return while respecting the cardinality constraints. Finally, our system was trained and tested in an online manner for 20 successive rounds on data for ten selected stocks from different sectors of the S&P 500 from January 1st, 2013 to July 31st, 2017. The experiments reveal that the proposed SDDRRL achieves superior performance compared to three benchmarks: the rolling-horizon Mean-Variance Optimization (MVO) model, the rolling-horizon risk parity model, and the uniform buy-and-hold (UBAH) index. (C) 2019 Elsevier Ltd. All rights reserved.
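A sketch of the tuning idea only (the search space and the train_and_backtest routine are hypothetical placeholders for the paper's RRL training): score a candidate architecture by the Sharpe ratio of its out-of-sample returns and let a GP with Expected Improvement propose the next architecture to try.

```python
import numpy as np
from skopt import gp_minimize
from skopt.space import Integer

def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio of a return series (risk-free rate taken as zero)."""
    return np.sqrt(periods_per_year) * returns.mean() / returns.std()

def evaluate(arch):
    n_layers, hidden = arch
    returns = train_and_backtest(n_layers, hidden)  # placeholder for the RRL run
    return -sharpe_ratio(returns)                   # minimize negative Sharpe

space = [Integer(1, 4, name="n_layers"), Integer(8, 128, name="hidden_units")]
# With a real train_and_backtest in place, the GP-EI search would be:
# result = gp_minimize(evaluate, space, acq_func="EI", n_calls=25)
```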
ISBN (print): 9781509001644
Recent work has demonstrated that hyperparameter optimization within the sequential model-based optimization (SMBO) framework is generally possible. This approach replaces the expensive-to-evaluate function that maps hyperparameters to the performance of a learned model on validation data with a surrogate model that is much cheaper to evaluate. The current state of the art in hyperparameter optimization learns these surrogate models across a variety of solved data sets on which a grid search has already been employed. In this way, surrogate models are learned across data sets and are thus able to generalize better. However, meta-features that describe characteristics of a data set are usually needed in order for the surrogate model to differentiate between the same hyperparameter configurations on different data sets. A closely related research area focuses on model choice, i.e. picking the right model for a given task, which is also a problem that many practitioners face in machine learning. In this paper, we aim to solve both of these problems with a unified surrogate model that learns across different data sets, different classifiers and their respective hyperparameters. We employ factorized multilayer perceptrons, a surrogate model that consists of a multilayer perceptron architecture but offers the prediction of a factorization machine in the first layer. In this way, data sets, models and hyperparameters are represented in a joint lower-dimensional latent feature space. Experiments on a publicly available meta data set containing 59 individual data sets and 19 prediction models demonstrate the efficiency of our approach.