In machine learning (ML), hyperparameter optimization (HPO) is the process of choosing a tuple of values that ensures efficient deployment and training of an AI model. In practice, HPO applies not only to ML tuning but also to complex numerical simulations. In this context, a numerical model of a given object is created for use in realistic simulations. This model is defined by a set of values describing properties such as the geometry of the object or other unknown parameters related to physical quantities. While HPO for ML usually requires finding a few parameters, a numerical model can involve tuning more than a hundred parameters. As a consequence, a large number of tuples have to be explored and evaluated before a relevant solution is found, raising new challenges in high-performance computing for efficiently driving the optimization. In this work we rely on the Optuna HPO framework, primarily designed for ML tasks and including state-of-the-art sampling and pruning algorithms. We report on its use to optimize a complex numerical model on a 1024-core machine. We suggest 1.5M tuples and evaluate 5M simulations using different Optuna-distributed layouts, building several trade-offs between performance and energy-consumption metrics. To further scale the optimization process onto larger resources, we introduce OptunaP2P, an extension of Optuna based on the peer-to-peer paradigm, which removes any bottleneck in the management of the knowledge shared between optimization processes. With OptunaP2P, we computed up to 3 times faster than the regular Optuna-distributed implementation and obtained results of close-to-similar quality in this reduced time frame.
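For orientation, a minimal sketch of a distributed Optuna study of the kind the abstract describes: every worker process attaches to a shared relational storage so trials are coordinated through a common backend. The toy objective, parameter names, and storage URL below are illustrative assumptions, not the paper's actual numerical model.

```python
import optuna

# Toy stand-in for one expensive numerical simulation; the real objective
# in the paper evaluates a physics model with over a hundred parameters.
def objective(trial):
    x = trial.suggest_float("x", -10.0, 10.0)
    y = trial.suggest_float("y", -10.0, 10.0)
    return (x - 2.0) ** 2 + (y + 1.0) ** 2

# Every worker runs this same script; the shared storage (SQLite here,
# a networked RDB in practice) serializes access to the study state,
# which is exactly the bottleneck a peer-to-peer design aims to remove.
study = optuna.create_study(
    study_name="numerical-model-hpo",
    storage="sqlite:///hpo.db",
    load_if_exists=True,
    direction="minimize",
)
study.optimize(objective, n_trials=100)
print(study.best_params, study.best_value)
```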
Because of its good performance, the convolutional neural network (CNN) has been used extensively in many fields, such as image, speech, and text processing. However, it is highly sensitive to its hyperparameters, and configuring them effectively within a reasonable time to improve CNN performance has long been a difficult problem. To solve this problem, this paper proposes a method to automatically optimize CNN hyperparameters based on the local autonomous competitive harmony search (LACHS) algorithm. To avoid the algorithm's performance being harmed by complicated manual parameter adjustment, a dynamic parameter-adjustment strategy is adopted, in which the pitch-adjustment probability PAR and step factor BW adjust dynamically according to the search state. To strengthen the fine search of the neighborhood space and reduce the chance of remaining stuck in local optima, an autonomous decision-making search strategy based on the optimal state is designed. To help the algorithm escape local fitting, a local competition mechanism is proposed that makes each new harmony compete with the worst harmony of a local selection. In addition, an evaluation function is proposed that integrates training time and recognition accuracy; to save computational cost without affecting the search result, it makes the training time for each model depend on the learning rate and batch size. To demonstrate the feasibility of the LACHS algorithm for configuring CNN hyperparameters, classification experiments are run on the Fashion-MNIST and CIFAR10 datasets, comparing CNNs with empirically configured hyperparameters against CNNs tuned automatically by classical algorithms. The results show that the performance of the CNN optimized by the LACHS algorithm improves effectively, so this algorithm has certain advantages in hyperparameter optimization. In addition, this p…
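As background, a generic harmony-search loop with the kind of dynamic PAR/BW schedule the abstract mentions (linearly increasing PAR, exponentially decaying BW) might look as follows. This is a textbook HS sketch over a continuous toy objective, not the LACHS algorithm itself; all constants and schedules are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sphere(x):
    # Toy objective standing in for CNN validation error.
    return float(np.sum(x ** 2))

DIM, HMS, HMCR, ITERS = 5, 20, 0.9, 2000
LOW, HIGH = -5.0, 5.0
PAR_MIN, PAR_MAX = 0.2, 0.9          # pitch-adjustment probability range
BW_MAX, BW_MIN = 1.0, 1e-3           # step-factor (bandwidth) range

# Harmony memory: HMS random candidate vectors and their fitness.
hm = rng.uniform(LOW, HIGH, size=(HMS, DIM))
fit = np.array([sphere(h) for h in hm])

for t in range(ITERS):
    # Dynamic schedules: PAR grows linearly, BW shrinks exponentially.
    par = PAR_MIN + (PAR_MAX - PAR_MIN) * t / ITERS
    bw = BW_MAX * np.exp(np.log(BW_MIN / BW_MAX) * t / ITERS)

    new = np.empty(DIM)
    for d in range(DIM):
        if rng.random() < HMCR:                 # memory consideration
            new[d] = hm[rng.integers(HMS), d]
            if rng.random() < par:              # pitch adjustment
                new[d] += bw * rng.uniform(-1, 1)
        else:                                   # random re-initialization
            new[d] = rng.uniform(LOW, HIGH)
    new = np.clip(new, LOW, HIGH)

    # The new harmony replaces the worst one only if it is better.
    worst = int(np.argmax(fit))
    f_new = sphere(new)
    if f_new < fit[worst]:
        hm[worst], fit[worst] = new, f_new

print("best:", fit.min())
```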
ISBN (print): 9798350316971
Context: Federated Learning (FL) has emerged as a promising, massively distributed way to train a joint deep model across numerous edge devices, ensuring user data privacy by retaining the data on the device. In FL, hyperparameters (HP) significantly affect the training overhead in terms of computation and transmission time, computation and transmission load, as well as model accuracy. This paper presents a novel approach in which hyperparameter optimization (HPO) is used to optimize the performance of the FL model for a Speech Emotion Recognition (SER) application. To solve this problem, both single-objective optimization (SOO) and multi-objective optimization (MOO) models are developed and evaluated. The optimization model includes two objectives: accuracy and total execution time. Numerical results show that optimal HP settings improve both the accuracy of the model and its computation time. The proposed method assists FL system designers in finding an optimal parameter setup, allowing them to carry out model design and development efficiently according to their goals.
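The MOO part boils down to keeping the non-dominated (accuracy, execution-time) pairs. A minimal Pareto-front filter over assumed measurement tuples might look like the sketch below; the sample data and hyperparameter names are invented for illustration.

```python
# Minimal Pareto-front filter for (accuracy, total execution time) pairs.
# Accuracy is maximized, time is minimized; the tuples are made up.
candidates = [
    {"hp": {"lr": 1e-3, "rounds": 50},  "acc": 0.81, "time_s": 620.0},
    {"hp": {"lr": 1e-2, "rounds": 30},  "acc": 0.78, "time_s": 350.0},
    {"hp": {"lr": 1e-3, "rounds": 100}, "acc": 0.83, "time_s": 1240.0},
    {"hp": {"lr": 1e-2, "rounds": 50},  "acc": 0.79, "time_s": 700.0},
]

def dominates(a, b):
    # a dominates b if it is no worse on both objectives and better on one.
    return (a["acc"] >= b["acc"] and a["time_s"] <= b["time_s"]
            and (a["acc"] > b["acc"] or a["time_s"] < b["time_s"]))

pareto = [c for c in candidates
          if not any(dominates(o, c) for o in candidates if o is not c)]
for p in pareto:  # the last candidate is dominated and filtered out
    print(p["hp"], p["acc"], p["time_s"])
```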
The use of convolutional neural networks involves hyperparameter optimization. Gaussian-process-based Bayesian optimization (GPEI) has proven to be an effective algorithm for optimizing several hyperparameters. The deep networks for global optimization (DNGO) algorithm, which uses a neural network as an alternative to the Gaussian process, was then proposed to optimize more hyperparameters. This paper presents a new algorithm that combines multiscale and multilevel evolutionary optimization (MSMLEO) with GPEI to optimize dozens of hyperparameters. These hyperparameters are divided into two groups. The first group, related to the sizes of layers and kernels, consists of discrete integers; the second group, related to learning rates and similar quantities, consists of continuous floating-point numbers. The combinations of the first group correspond to combinations of grid points on multiscale grids, and MSMLEO launches GPEI to optimize the second group of hyperparameters while the first group is held fixed. The output of the convolutional network configured with the two groups of optimized hyperparameters is used as the fitness of MSMLEO. MSMLEO alternates with GPEI to search for the optimal hyperparameters from the coarsest scale to the finest scale. Experimental results show that the algorithm has better performance and adaptability when optimizing dozens of hyperparameters of neural networks with a variety of numerical types. (C) 2019 Published by Elsevier B.V.
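As a rough sketch of the GPEI inner loop (not the paper's MSMLEO code), one can fit a Gaussian process to the trials seen so far and pick the next continuous candidate by expected improvement. The toy objective, bounds, and candidate-pool size below are assumptions.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(1)

def objective(x):
    # Stand-in for training a CNN at a fixed discrete configuration and
    # returning its validation error for continuous hyperparameters x.
    return float(np.sin(3 * x[0]) + x[0] ** 2 - 0.7 * x[0])

LOW, HIGH = -1.0, 2.0
X = rng.uniform(LOW, HIGH, size=(4, 1))           # initial design
y = np.array([objective(x) for x in X])

for _ in range(20):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(X, y)
    cand = rng.uniform(LOW, HIGH, size=(256, 1))  # random candidate pool
    mu, sigma = gp.predict(cand, return_std=True)
    best = y.min()
    # Expected improvement for minimization.
    z = (best - mu) / np.maximum(sigma, 1e-12)
    ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
    x_next = cand[int(np.argmax(ei))]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next))

print("best value:", y.min())
```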
Hyperparameter optimization plays a significant role in the overall performance of machine learning algorithms. However, the computational cost of evaluation can be extremely high for a complex algorithm or a large dataset. In this paper, we propose a model-based reinforcement learning method with an experience variable and meta-learning optimization to speed up hyperparameter optimization. Specifically, an RL agent is employed to select hyperparameters, and the k-fold cross-validation result is treated as a reward signal to update the agent. To guide the agent's policy update, we design an embedding representation called the "experience variable" and update it dynamically during training. In addition, we employ a predictive model to predict the performance of the machine learning algorithm with the selected hyperparameters and limit model rollouts to a short horizon to reduce the impact of model inaccuracy. Finally, we use meta-learning to pre-train the model so that it adapts quickly to a new task. To demonstrate the advantages of our method, we conduct experiments on 25 real HPO tasks; the results show that, under limited computational resources, the proposed method outperforms state-of-the-art Bayesian methods and an evolutionary method.
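To make the reward signal concrete: treating mean k-fold cross-validation accuracy as the reward for a chosen configuration can be sketched with a simple epsilon-greedy bandit, a far simpler selector than the paper's model-based RL agent. The candidate grid, estimator, and epsilon are assumptions.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X, y = load_digits(return_X_y=True)

# Assumed candidate configurations; a real agent would act in a richer space.
arms = [{"n_estimators": n, "max_depth": d}
        for n in (50, 100, 200) for d in (4, 8, None)]
counts = np.zeros(len(arms))
values = np.zeros(len(arms))       # running mean CV reward per arm

for step in range(30):
    # Epsilon-greedy action selection over hyperparameter "arms".
    if rng.random() < 0.2 or counts.sum() == 0:
        a = int(rng.integers(len(arms)))
    else:
        a = int(np.argmax(values))
    # Reward = mean 5-fold cross-validation accuracy.
    model = RandomForestClassifier(random_state=0, **arms[a])
    reward = cross_val_score(model, X, y, cv=5).mean()
    counts[a] += 1
    values[a] += (reward - values[a]) / counts[a]   # incremental mean

print("best arm:", arms[int(np.argmax(values))], values.max())
```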
The performance of a convolutional neural network (CNN) heavily depends on its hyperparameters. However, finding a suitable hyperparameter configuration is difficult, challenging, and computationally expensive due to three issues: 1) the mixed-variable problem of different types of hyperparameters; 2) the large-scale search space for finding optimal hyperparameters; and 3) the expensive computational cost of evaluating candidate configurations. This article therefore focuses on these three issues and proposes a novel estimation of distribution algorithm (EDA) for efficient hyperparameter optimization, with three major contributions in the algorithm design. First, a hybrid-model EDA is proposed to deal efficiently with the mixed-variable difficulty: it uses a mixed-variable encoding scheme for the hyperparameters and adopts an adaptive hybrid-model learning (AHL) strategy to optimize the mixed variables efficiently. Second, an orthogonal initialization (OI) strategy is proposed to deal with the challenge of the large-scale search space. Third, a surrogate-assisted multi-level evaluation (SME) method is proposed to reduce the expensive computational cost. Based on the above, the proposed algorithm is named surrogate-assisted hybrid-model EDA (SHEDA). In the experimental studies, SHEDA is verified on widely used classification benchmark problems and compared with various state-of-the-art methods. Moreover, a case study on aortic dissection (AD) diagnosis is carried out to evaluate its performance. Experimental results show that SHEDA is very effective and efficient for hyperparameter optimization, finding satisfactory configurations for CIFAR10, CIFAR100, and AD diagnosis in only 0.58, 0.97, and 1.18 GPU days, respectively.
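A bare-bones EDA over a mixed space (a categorical distribution for discrete choices, a Gaussian for continuous ones) illustrates the core sample-select-reestimate loop. This is a generic sketch, not SHEDA; the toy fitness and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
POP, ELITE, GENS = 40, 10, 30
KERNELS = np.array([3, 5, 7])                      # discrete hyperparameter choices

# Distribution parameters the EDA maintains and updates.
p_kernel = np.ones(len(KERNELS)) / len(KERNELS)    # categorical probabilities
mu_lr, sigma_lr = -3.0, 1.0                        # Gaussian over log10(lr)

def fitness(kernel, log_lr):
    # Toy stand-in for validation accuracy; prefers kernel 5, lr ~ 10**-2.5.
    return -((kernel - 5) ** 2) - (log_lr + 2.5) ** 2

for g in range(GENS):
    ks = rng.choice(len(KERNELS), size=POP, p=p_kernel)
    lrs = rng.normal(mu_lr, sigma_lr, size=POP)
    scores = np.array([fitness(KERNELS[k], lr) for k, lr in zip(ks, lrs)])
    elite = np.argsort(scores)[-ELITE:]            # best individuals
    # Re-estimate the distribution from the elites (with smoothing).
    cnt = np.bincount(ks[elite], minlength=len(KERNELS)) + 1.0
    p_kernel = cnt / cnt.sum()
    mu_lr = lrs[elite].mean()
    sigma_lr = max(lrs[elite].std(), 0.05)         # floor keeps exploration alive

print("kernel probs:", np.round(p_kernel, 2), "lr mode: 1e%.2f" % mu_lr)
```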
Learning processes play an important role in enhancing the understanding and analysis of real phenomena. Most of these methodologies revolve around solving penalized optimization problems. A significant challenge arises in the choice of the penalty hyperparameter, which is typically user-specified or determined through grid-search approaches; automated tuning procedures for estimating these hyperparameters are lacking, particularly in unsupervised learning scenarios. In this paper, we focus on the unsupervised context and propose a bilevel strategy to address the tuning of the penalty hyperparameter. We establish suitable conditions for the existence of a minimizer in an infinite-dimensional Hilbert space and present some theoretical considerations; these results apply in situations where obtaining an exact minimizer is unfeasible. Working on the estimation of the hyperparameter with gradient-based methods, we also introduce a modified version of Ekeland's principle as a stopping criterion for these methods. Our approach is distinguished from conventional techniques by its reduced reliance on random or black-box strategies, resulting in stronger mathematical generalization.
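In a supervised analogue (the paper itself works in an unsupervised, infinite-dimensional setting), the bilevel idea can be sketched as gradient descent on a validation loss with respect to log λ, where the inner problem is a penalized fit. The finite-difference hypergradient, the Ridge inner solver, and the synthetic data below are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 20))
w_true = rng.normal(size=20)
y = X @ w_true + 0.5 * rng.normal(size=200)
Xtr, ytr, Xval, yval = X[:150], y[:150], X[150:], y[150:]

def outer_loss(log_lam):
    # Inner problem: penalized fit at penalty exp(log_lam);
    # outer objective: validation error of the resulting minimizer.
    model = Ridge(alpha=np.exp(log_lam)).fit(Xtr, ytr)
    return mean_squared_error(yval, model.predict(Xval))

log_lam, lr, eps = 0.0, 0.5, 1e-3
for _ in range(50):
    # Central finite-difference hypergradient w.r.t. log(lambda).
    g = (outer_loss(log_lam + eps) - outer_loss(log_lam - eps)) / (2 * eps)
    log_lam -= lr * g
    if abs(g) < 1e-6:   # crude stand-in for a principled stopping rule
        break

print("tuned lambda: %.4g, val MSE: %.4f" % (np.exp(log_lam), outer_loss(log_lam)))
```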
ISBN (print): 9781450367486
Due to difficulties such as multiple local optima and a flat landscape, it is advisable to use global optimization techniques to discover the global optimum of the auxiliary optimization problem of finding good Gaussian process (GP) hyperparameters. We investigated the performance of genetic algorithms (GA), particle swarm optimization (PSO), differential evolution (DE), and the covariance matrix adaptation evolution strategy (CMA-ES) for optimizing the hyperparameters of GPs. The study was performed on two artificial problems and one real-world problem. From the results, we observe that PSO, CMA-ES, and DE/local-to-best/1 consistently outperformed two variants of GA and DE/rand/1 with per-generation dither on all problems. In particular, CMA-ES is an attractive method since it is quasi-parameter-free and also demonstrates good exploitative and explorative power when optimizing the hyperparameters.
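A plausible sketch of that setup with off-the-shelf tools: maximize the GP log marginal likelihood over log-hyperparameters with CMA-ES via the `cma` package and scikit-learn's GP. The kernel choice, synthetic data, and initial point are assumptions, not the study's actual configuration.

```python
import numpy as np
import cma  # pip install cma
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(5)
X = rng.uniform(-3, 3, size=(40, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=40)

# Fit once with the internal optimizer disabled; the (log-)hyperparameter
# search is driven by CMA-ES instead of the default gradient method.
gpr = GaussianProcessRegressor(
    kernel=ConstantKernel(1.0) * RBF(1.0), optimizer=None
).fit(X, y)

def neg_lml(theta):
    # CMA-ES minimizes, so return the negative log marginal likelihood.
    return -gpr.log_marginal_likelihood(np.asarray(theta))

es = cma.CMAEvolutionStrategy([0.0, 0.0], 0.5, {"maxiter": 50, "verb_disp": 0})
while not es.stop():
    thetas = es.ask()
    es.tell(thetas, [neg_lml(t) for t in thetas])

print("best log-hyperparameters:", es.result.xbest)
```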
Smart transportation is an essential component of the smart city, and traffic prediction is an important issue within it. Graph convolutional networks (GCN) are an effective approach for traffic prediction. However, the GCN faces challenges in traffic prediction, such as the stability of its prediction precision and its computational cost, and its hyperparameters significantly affect its performance. We conduct a regression analysis between the hyperparameters and GCN performance. Our simulation results show that there is a clear optimal point for the hyperparameters, and we give empirical suggestions for adjusting them based on the simulation results.
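The "clear optimal point" finding corresponds to fitting a response curve to (hyperparameter, performance) measurements and reading off its extremum. A minimal quadratic-regression sketch, with invented measurements standing in for averaged GCN training runs, might look like this.

```python
import numpy as np

# Invented (hidden_units, prediction accuracy) measurements; a real study
# would average repeated GCN training runs per hyperparameter setting.
hidden = np.array([16, 32, 64, 128, 256], dtype=float)
acc = np.array([0.82, 0.88, 0.91, 0.90, 0.85])

# Quadratic regression on log2(units); the vertex estimates the optimum.
x = np.log2(hidden)
a, b, c = np.polyfit(x, acc, deg=2)
x_opt = -b / (2 * a)                   # vertex of the fitted parabola
print("suggested hidden units: ~%d" % round(2 ** x_opt))
```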
Deep neural networks (DNNs) have recently been widely applied to synthetic aperture radar (SAR) image detection and classification, but various adversarial attacks from malicious adversaries, together with the hidden vulnerabilities of DNNs, may lead to serious security threats. State-of-the-art DNN-based SAR image detection models are designed manually, considering only test accuracy on clean datasets and neglecting the models' adversarial robustness under various types of adversarial attack. To obtain the best trade-off between clean accuracy and adversarial robustness in robust convolutional neural network (CNN)-based SAR image classification models, this work makes the first attempt to develop a multi-objective adversarially robust CNN, called MoAR-CNN. In the MoAR-CNN, we propose a multi-objective automatic design method for cell-based neural architectures and critical hyperparameters such as the optimizer type and learning rate. A Squeeze-and-Excitation (SE) layer is introduced after each cell to improve computational efficiency and robustness. Experiments on the FUSAR-Ship and OpenSARShip datasets against seven types of adversarial attack demonstrate the superiority of the proposed MoAR-CNN over six classical manually designed CNNs and four robust neural architecture search methods in terms of clean accuracy, adversarial accuracy, and model size. Furthermore, we also demonstrate through experiments the advantages of using the SE layer in MoAR-CNN, the transferability of MoAR-CNN, its search costs, adversarial training, and the NSGA-II developed within MoAR-CNN.
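The SE layer inserted after each cell is the standard squeeze-and-excitation block. A typical PyTorch rendering follows; the reduction ratio and shapes are conventional choices, not values taken from the paper.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: global-pool to a channel descriptor,
    pass it through a small bottleneck MLP, and rescale the channels."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)   # squeeze: B,C,H,W -> B,C,1,1
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                     # per-channel gates in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                          # excitation: rescale channels

# Quick shape check on a dummy feature map.
feats = torch.randn(2, 64, 32, 32)
print(SEBlock(64)(feats).shape)   # torch.Size([2, 64, 32, 32])
```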