Several AutoML tools aim to improve the usability of machine learning algorithms by automatically recommending algorithms using techniques such as meta-learning, grid search, and genetic programming. However, the preprocessing step is usually not well handled by these tools. In this work, we therefore present a systematic review of preprocessing algorithm selection with meta-learning, aiming to establish the state of the art in this field. To perform this task, we acquired 450 references, of which we selected 37 to be evaluated and analyzed according to a set of previously defined questions. We thereby identified: what has been published on the subject; the topics most often addressed in those works; the most frequently recommended preprocessing algorithms; the features most often used to extract information for meta-learning; the machine learning algorithms employed as meta-learners and base-learners in those works; and the performance metrics chosen as the target of the applications.
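To make the recurring setup in these works concrete, the following is a minimal sketch of meta-learning for preprocessing recommendation, assuming scikit-learn; the meta-feature names, values, and labels are purely illustrative, not taken from any surveyed paper:

```python
# Minimal meta-learning sketch for preprocessing-algorithm recommendation.
# Assumes a table of per-dataset meta-features (e.g., n_rows, n_features,
# class entropy) labeled with the preprocessing algorithm that worked best.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical meta-dataset: one row per historical dataset.
meta_features = np.array([
    [1000, 20, 0.9],    # n_rows, n_features, class entropy
    [500, 300, 0.4],
    [20000, 10, 0.7],
])
best_preprocessor = np.array(["standard_scaler", "pca", "minmax_scaler"])

# The meta-learner is an ordinary classifier over meta-features.
meta_learner = RandomForestClassifier(n_estimators=100, random_state=0)
meta_learner.fit(meta_features, best_preprocessor)

# Recommend a preprocessor for a new, unseen dataset.
new_dataset_meta = np.array([[800, 50, 0.6]])
print(meta_learner.predict(new_dataset_meta))
```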
We propose an algorithm selection approach and an instance space analysis for the well-known curriculum-based course timetabling problem (CB-CTT), an important problem owing to its application in higher education. Several state-of-the-art algorithms exist, including both exact and metaheuristic methods. Results of these algorithms on existing instances in the literature show that no single algorithm outperforms the others. Therefore, a deep analysis of the strengths and weaknesses of these algorithms, depending on the instance, is an important research question. In this work, a detailed analysis of the instance space for CB-CTT is performed, charting the regions where these algorithms perform best. We further investigate the application of machine learning methods to automated algorithm selection for CB-CTT, strengthening the insights gained through the instance space analysis. For our research, we contribute new real-life instances and extend the generation of synthetic instances to better correspond to these new instances. Finally, this work shows how instance space analysis and the application of algorithm selection complement each other, underlining the value of both approaches in understanding algorithm performance.
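As a rough illustration of the instance-space step, the sketch below projects instances into 2D and summarizes where each solver wins. The paper's actual projection and CB-CTT features are more elaborate; all names and the PCA projection here are placeholders:

```python
# Instance space analysis sketch: project instances into 2D and record
# which algorithm wins where, approximating algorithm "footprints".
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
features = rng.random((100, 8))                      # hypothetical instance features
winners = rng.choice(["exact", "metaheuristic"], size=100)

coords = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(features))
for label in set(winners):
    region = coords[winners == label]                # instances this solver won
    print(label, "footprint centroid:", region.mean(axis=0))
```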
The process of identifying the most suitable optimization algorithm for a specific problem, referred to as algorithm selection (AS), entails training models that leverage problem landscape features to forecast algorithm performance. A significant challenge in this domain is ensuring that AS models can generalize effectively to novel, unseen problems. This study evaluates the generalizability of AS models based on different problem representations in the context of single-objective continuous optimization. In particular, it considers the most widely used Exploratory Landscape Analysis features, as well as recently proposed Topological Landscape Analysis features, and features based on deep learning, such as DeepELA, TransOptAS, and Doe2Vec. Our results indicate that when presented with out-of-distribution evaluation data, none of the feature-based AS models outperform a simple baseline model, i.e., a Single Best Solver.
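The Single Best Solver baseline referenced above is simple enough to state in a few lines; this sketch assumes a performance matrix over training problems with illustrative values:

```python
# Single Best Solver (SBS) baseline sketch: pick the one algorithm with the
# best average performance on the training problems and always predict it.
import numpy as np

# perf[i, j]: performance of algorithm j on training problem i (lower = better).
perf = np.array([
    [0.12, 0.30, 0.25],
    [0.40, 0.22, 0.35],
    [0.18, 0.28, 0.31],
])
algorithms = ["CMA-ES", "DE", "PSO"]   # illustrative solver names

sbs_index = perf.mean(axis=0).argmin()
print("Single Best Solver:", algorithms[sbs_index])
# A feature-based AS model must beat this constant predictor on unseen
# problems to justify its added complexity -- which, per the abstract,
# none of the tested representations managed out of distribution.
```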
Algorithm selection and hyperparameter tuning are critical steps in both academic and applied machine learning (ML). These steps are becoming increasingly delicate due to the extensive rise in the number, diversity, and distributed nature of ML resources. Multi-agent systems, when applied to the design of ML platforms, bring about several distinctive characteristics, such as scalability, flexibility, and robustness, to name a few. This article proposes a fully automatic and collaborative agent-based mechanism for selecting distributed ML algorithms and simultaneously tuning their hyperparameters. Our method builds upon an existing agent-based hierarchical ML platform and augments its query structure to support the aforementioned functionalities without being limited to specific learning, selection, and tuning mechanisms. We have conducted theoretical assessments, formal verification, and analytical study to demonstrate the correctness, resource utilization, and computational efficiency of our technique. According to the results, our solution is algorithmically correct and exhibits linear time and space complexity in relation to the size of available resources. To further verify its correctness and demonstrate its effectiveness and flexibility across a range of algorithmic options and datasets, the article also presents a series of empirical results on a system composed of 24 algorithms and 9 datasets. The findings not only highlight the efficiency and scalability of the proposed approach, but also show its flexibility and openness in responding to the dynamic and distributed ML ecosystem.
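The abstract does not spell out the query protocol, but a heavily simplified, hypothetical sketch of the collaborative idea might look as follows: each agent tunes its own algorithm locally, and a coordinator takes one linear pass over the replies, consistent with the linear complexity reported above. Everything here, including the scoring stub, is invented for illustration:

```python
# Collaborative agent sketch (illustrative, not the paper's protocol): each
# agent owns one ML algorithm, tunes it locally, and replies with its best
# score; the coordinator keeps the best reply in a single linear pass.
from dataclasses import dataclass

@dataclass
class Reply:
    algorithm: str
    hyperparams: dict
    score: float

def local_score(algorithm: str, hp: dict) -> float:
    # Stub standing in for actual local training and validation.
    return hash((algorithm, tuple(sorted(hp.items())))) % 100 / 100

def agent(algorithm: str, grid: list) -> Reply:
    # Hypothetical local tuning: each agent scores its own candidates.
    scored = [(hp, local_score(algorithm, hp)) for hp in grid]
    hp, score = max(scored, key=lambda t: t[1])
    return Reply(algorithm, hp, score)

replies = [agent("svm", [{"C": 1.0}, {"C": 10.0}]),
           agent("rf", [{"n_estimators": 50}, {"n_estimators": 200}])]
best = max(replies, key=lambda r: r.score)   # coordinator's linear pass
print(best.algorithm, best.hyperparams, best.score)
```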
Training a large set of machine learning algorithms to convergence in order to select the best-performing algorithm for a dataset is computationally wasteful. Moreover, in a budget-limited scenario, it is crucial to carefully select an algorithm candidate and allocate a budget for training it, ensuring that the limited budget is optimally distributed to favor the most promising candidates. Casting this problem as a Markov Decision Process, we propose a novel framework in which an agent must select the most promising algorithm while it is still learning, without waiting until it is fully trained. At each time step, given an observation of partial learning curves of algorithms, the agent must decide whether to allocate resources to further train the most promising algorithm (exploitation), to wake up another algorithm previously put to sleep, or to start training a new algorithm (exploration). In addition, our framework allows the agent to meta-learn from learning curves on past datasets along with dataset meta-features and algorithm hyperparameters. By incorporating meta-learning, we aim to avoid myopic decisions based solely on premature learning curves on the dataset at hand. We introduce two benchmarks of learning curves that served in international competitions at WCCI'22 and AutoML-conf'22, of which we analyze the results. Our findings show that both meta-learning and the progression of learning curves enhance the algorithm selection process, as evidenced by the methods of winning teams and our DDQN baseline, compared to heuristic baselines or a random search. Interestingly, our cost-effective baseline, which selects the best-performing algorithm w.r.t. a small budget, can perform decently when learning curves do not intersect frequently.
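The cost-effective baseline mentioned at the end lends itself to a short sketch; the learning-curve stub and budget values below are illustrative, not the benchmark's actual curves:

```python
# Sketch of the cost-effective baseline idea: spend a small probe budget on
# every algorithm, then commit the remaining budget to the early leader.
import math

def train(algo, budget):
    # Stub learning curve: each algorithm approaches its own asymptote.
    asymptote = {"mlp": 0.92, "xgb": 0.90, "svm": 0.85}[algo]
    return asymptote * (1 - math.exp(-budget / 10))

def cost_effective_baseline(algorithms, probe_budget, total_budget):
    # Probe every candidate with a small budget...
    probes = {a: train(a, probe_budget) for a in algorithms}
    best = max(probes, key=probes.get)
    # ...then commit the leftover budget to the early leader.
    remaining = total_budget - probe_budget * len(algorithms)
    return best, train(best, probe_budget + remaining)

print(cost_effective_baseline(["mlp", "xgb", "svm"], probe_budget=2, total_budget=30))
```

This heuristic works well precisely when learning curves rarely intersect: the early leader stays the leader, so no further exploration is needed.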
Evolutionary algorithms, such as differential evolution, excel in solving real-parameter optimization challenges. However, the effectiveness of a single algorithm varies across different problem instances, necessitating considerable effort in algorithm selection or configuration. This article aims to address this limitation by leveraging the complementary strengths of a group of algorithms and dynamically scheduling them throughout the optimization process for specific problems. We propose a deep reinforcement learning-based dynamic algorithm selection framework to accomplish this task. Our approach models dynamic algorithm selection as a Markov decision process, training an agent in a policy-gradient manner to select the most suitable algorithm according to the features observed during the optimization process. To empower the agent with the necessary information, our framework incorporates a thoughtful design of landscape and algorithmic features. Meanwhile, we employ a sophisticated deep neural network model to infer the optimal action, ensuring informed algorithm selections. Additionally, an algorithm context restoration mechanism is embedded to facilitate smooth switching among different algorithms. These mechanisms together enable our framework to seamlessly select and switch algorithms in a dynamic online fashion. Notably, the proposed framework is simple and generic, offering potential improvements across a broad spectrum of evolutionary algorithms. As a proof-of-principle study, we apply this framework to a group of differential evolution algorithms. The experimental results showcase the remarkable effectiveness of the proposed framework, not only enhancing the overall optimization performance but also demonstrating favorable generalization across different problem classes.
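A skeleton of such a dynamic selection loop, with stubs standing in for the trained policy network and the feature extractors, might look like this; the DE variant names and context contents are illustrative:

```python
# Skeleton of a dynamic algorithm selection loop: at each epoch, a policy
# picks one algorithm from a portfolio based on observed features, and a
# saved per-algorithm context is restored so switching stays smooth.
import random

ALGORITHMS = ["DE/rand/1", "DE/best/2", "JADE"]   # illustrative DE variants

def observe_features(population):
    # Placeholder for landscape/algorithmic feature extraction.
    return [sum(population) / len(population)]

def policy(features):
    # Stub for the trained policy network (paper: policy gradients).
    return random.randrange(len(ALGORITHMS))

population = [random.random() for _ in range(20)]
contexts = {}
for epoch in range(5):
    choice = policy(observe_features(population))
    # Context restoration: reload the chosen algorithm's saved state
    # (e.g., adaptive control parameters) before running it.
    state = contexts.get(choice, {"F": 0.5, "CR": 0.9})
    # ... one optimization epoch with ALGORITHMS[choice] would run here ...
    contexts[choice] = state   # persist updated context for future switches
    print(f"epoch {epoch}: running {ALGORITHMS[choice]} with context {state}")
```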
Classic automated algorithm selection (AS) for (combinatorial) optimization problems heavily relies on so-called instance features, i.e., numerical characteristics of the problem at hand, ideally extracted with computationally low-demanding routines. For the traveling salesperson problem (TSP) a plethora of features have been suggested. Most of these features are, if normalized at all, normalized only imprecisely, raising the issue that feature values are strongly affected by instance size. Such artifacts may have detrimental effects on algorithm selection models. We propose a normalization for two feature groups which stood out in multiple AS studies on the TSP: (a) features based on a minimum spanning tree (MST) and (b) nearest neighbor relationships of the input instance. To this end, we theoretically derive minimum and maximum values for properties of MSTs and k-nearest neighbor graphs (NNG) of Euclidean graphs. We analyze the differences in feature space between normalized versions of these features and their unnormalized counterparts. Our empirical investigations on various TSP benchmark sets point out that the feature scaling succeeds in eliminating the effect of instance size. A proof-of-concept AS study shows promising results: models trained with normalized features tend to outperform those trained with the respective vanilla features.
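The normalization itself reduces to min-max scaling against the derived bounds; in this sketch the bounds function is a stub standing in for the paper's analytical results:

```python
# Normalization sketch: rescale a raw feature by its theoretically derived
# minimum and maximum so the value becomes comparable across instance sizes.
def normalize(value, theoretical_min, theoretical_max):
    return (value - theoretical_min) / (theoretical_max - theoretical_min)

def bounds_for_instance_size(n):
    # Stub standing in for the paper's analytical bounds for MST / k-NNG
    # properties of Euclidean instances, which depend on n.
    return 0.0, n * 0.5

# Example: an MST-based feature whose raw range grows with instance size n.
n = 500
raw_mst_feature = 120.0
lo, hi = bounds_for_instance_size(n)
print(normalize(raw_mst_feature, lo, hi))   # now in [0, 1] regardless of n
```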
Algorithm selection (AS) tasks are dedicated to finding the optimal algorithm for an unseen problem instance. With knowledge of problem instances' meta-features and algorithms' landmark performances, Machine Learning (ML) approaches are applied to solve AS problems. However, the standard training process of benchmark ML approaches in AS either needs to train a model separately for every algorithm or relies on a sparse one-hot encoding as the algorithms' representation. To avoid these intermediate steps and form the mapping function directly, we borrow the learning-to-rank framework from Recommender Systems (RS) and embed bi-linear factorization to model the algorithms' performances in AS. This Bi-linear Learning to Rank (BLR) has proven to work competently in some AS scenarios and is therefore also proposed as a benchmark approach. From the evaluation perspective in modern AS challenges, precisely predicting performance is usually the measuring goal. Although an approach's inference time also needs to be counted in the running-time cost, it is usually overlooked in the evaluation process. The multi-objective evaluation metric Adjusted Ratio of Root Ratios (A3R) is therefore advocated in this paper to balance the trade-off between accuracy and inference time in AS. Concerning A3R, BLR outperforms the other benchmarks when expanding the candidate range to the TOP-3. The improvement from this candidate expansion results from the cumulative optimum performance during the AS process. We take a further step in the experiments to demonstrate the advantage of such TOP-K expansion, and illustrate that it can be considered a supplement to the convention of TOP-1 selection during the evaluation process.
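A minimal sketch of the bi-linear scoring idea follows, with random stand-ins for the learned parameters: one interaction matrix and one embedding per algorithm replace per-algorithm models and one-hot encodings. All dimensions and values are illustrative:

```python
# Bi-linear scoring sketch: an instance's meta-features and an algorithm's
# latent vector interact through a learned matrix W, so one model scores
# every (instance, algorithm) pair.
import numpy as np

rng = np.random.default_rng(0)
n_meta, n_latent, n_algos = 6, 4, 5

W = rng.normal(size=(n_meta, n_latent))    # learned interaction matrix
A = rng.normal(size=(n_algos, n_latent))   # learned algorithm embeddings

instance_meta = rng.normal(size=n_meta)    # meta-features of one instance
scores = (instance_meta @ W) @ A.T         # predicted performance per algorithm

top3 = np.argsort(scores)[::-1][:3]        # TOP-3 candidate expansion
print("ranked candidates:", top3)
```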
We propose a novel technique for algorithm selection, applicable to optimisation domains in which there is implicit sequential information encapsulated in the data, e.g., in online bin-packing. Specifically, we train two types of recurrent neural networks to predict a packing heuristic in online bin-packing, selecting from four well-known heuristics. As input, the RNN methods use only the sequence of item sizes. This contrasts with typical approaches to algorithm selection, which require a model to be trained using domain-specific instance features that must first be derived from the input data. The RNN approaches are shown to be capable of achieving within 5% of the oracle performance on between 80.88% and 97.63% of the instances, depending on the dataset. They are also shown to outperform classical machine learning models trained using derived features. Finally, we hypothesise that the proposed methods perform well when the instances exhibit some implicit structure that results in discriminatory performance with respect to a set of heuristics. We test this hypothesis by generating fourteen new datasets with increasing levels of structure, and show that there is a critical threshold of structure required before algorithm selection delivers benefit.
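A sketch of such a sequence-based selector, assuming PyTorch; the architecture and shapes are illustrative rather than the paper's exact configuration:

```python
# Sequence-based selector sketch: an LSTM reads the raw item-size sequence
# and predicts which of four packing heuristics to apply -- no hand-derived
# instance features are needed.
import torch
import torch.nn as nn

HEURISTICS = ["first-fit", "best-fit", "next-fit", "worst-fit"]

class HeuristicSelector(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, len(HEURISTICS))

    def forward(self, item_sizes):            # (batch, seq_len, 1)
        _, (h, _) = self.lstm(item_sizes)
        return self.head(h[-1])               # logits over the four heuristics

model = HeuristicSelector()
items = torch.rand(8, 50, 1)                  # 8 instances, 50 item sizes each
logits = model(items)
print(logits.argmax(dim=1))                   # predicted heuristic per instance
```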
ISBN (print): 9798400705052
The recommender systems algorithm selection problem for ranking prediction on implicit feedback datasets is under-explored. Traditional approaches in recommender systems algorithm selection focus predominantly on rating prediction on explicit feedback datasets, leaving a research gap for ranking prediction on implicit feedback datasets. Algorithm selection is a critical challenge for nearly every practitioner in recommender systems. In this work, we take the first steps toward addressing this research gap. We evaluate the NDCG@10 of 24 recommender systems algorithms, each with two hyperparameter configurations, on 72 recommender systems datasets. We train four optimized machine-learning meta-models and one automated machine-learning meta-model with three different settings on the resulting meta-dataset. Our results show that the predictions of all tested meta-models exhibit a median Spearman correlation ranging from 0.857 to 0.918 with the ground truth. We show that the median Spearman correlation between meta-model predictions and the ground truth increases by an average of 0.124 when the meta-model is optimized to predict the ranking of algorithms instead of their performance. Furthermore, in terms of predicting the best algorithm for an unknown dataset, we demonstrate that the best optimized traditional meta-model, i.e., XGBoost, achieves a recall of 48.6%, outperforming the best tested automated machine learning meta-model, i.e., AutoGluon, which achieves a recall of 47.2%.
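The evaluation protocol can be illustrated in a few lines; the scores below are made up, but the two computed quantities mirror the abstract's Spearman correlation and best-algorithm recall:

```python
# Evaluation sketch: compare a meta-model's predicted algorithm scores
# against the ground-truth NDCG@10 values on one held-out dataset.
from scipy.stats import spearmanr

ground_truth_ndcg = [0.41, 0.35, 0.52, 0.30, 0.47]   # per-algorithm NDCG@10
predicted_scores = [0.39, 0.36, 0.49, 0.28, 0.50]    # meta-model output

rho, _ = spearmanr(ground_truth_ndcg, predicted_scores)
print(f"Spearman correlation: {rho:.3f}")

# Best-algorithm "recall": does the top-ranked prediction match the truly
# best algorithm on this dataset? Averaged over datasets, this corresponds
# to the 48.6% / 47.2% figures reported above.
best_hit = max(range(5), key=predicted_scores.__getitem__) == \
           max(range(5), key=ground_truth_ndcg.__getitem__)
print("best-algorithm hit:", best_hit)
```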