The representation of individuals in geneticprogramming (GP) has a large impact on the evolutionary process. In previous work, we investigated the evolutionary process of three grammar-guided GP (GGGP) methods, Conte...
详细信息
The representation of individuals in geneticprogramming (GP) has a large impact on the evolutionary process. In previous work, we investigated the evolutionary process of three grammar-guided GP (GGGP) methods, Context-Free grammars GP (CFG-GP), Grammatical Evolution (GE) and Structured Grammatical Evolution (SGE), in the context of the complex, real-world problem of predicting the glucose level of people with diabetes two hours ahead of time. We concluded that representation choice is more impactful with a higher maximum depth, and that CFG-GP better explores the search space for deeper trees, achieving better results. Furthermore, we find that CFG-GP relies more on feature construction, whereas GE and SGE rely more on feature selection. Additionally, we altered the GGGP methods in two ways: using & varepsilon;\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\epsilon$$\end{document}-lexicase selection, which solved the overfitting problem of CFG-GP and helps it to adapt to patients with high glucose variability;and with a penalization of complex trees, to create more interpretable trees. Combining & varepsilon;\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\epsilon$$\end{document}-lexicase selection with CFG-GP performed best. In this work, we extend on the previous work and evaluated the impact of initialization methods in the quality of solutions. We found that they have no significant impact, even when the change of representation has.
genetic variation operators in grammar-guided genetic programming are fundamental to guide the evolutionary process in search and optimization problems. However, they show some limitations, mainly derived from an unba...
详细信息
genetic variation operators in grammar-guided genetic programming are fundamental to guide the evolutionary process in search and optimization problems. However, they show some limitations, mainly derived from an unbalanced exploration and local-search trade-off. This paper presents an estimation of distribution algorithm for grammar-guided genetic programming to overcome this difficulty and thus increase the performance of the evolutionary algorithm. Our proposal employs an extended dynamic stochastic context-free grammar to encode and calculate the estimation of the distribution of the search space from some promising individuals in the population. Unlike traditional estimation of distribution algorithms, the proposed approach improves exploratory behavior by smoothing the estimated distribution model. Therefore, this algorithm is referred to as SEDA, smoothed estimation of distribution algorithm. Experiments have been conducted to compare overall performance using a typical geneticprogramming crossover operator, an incremental estimation of distribution algorithm, and the proposed approach after tuning their hyperparameters. These experiments involve challenging problems to test the local search and exploration features of the three evolutionary systems. The results show that grammar-guided genetic programming with SEDA achieves the most accurate solutions with an intermediate convergence speed.
Multi-label classification has been used to solve a wide range of problems where each example in the dataset may be related either to one class (as in traditional classification problems) or to several class labels at...
详细信息
Multi-label classification has been used to solve a wide range of problems where each example in the dataset may be related either to one class (as in traditional classification problems) or to several class labels at the same time. Many ensemble-based approaches have been proposed in the literature, aiming to improve the performance of traditional multi-label classification algorithms. However, most of them do not consider the data characteristics to build the ensemble, and those that consider them need to tune many parameters to maximize their performance. In this paper, we propose an Auto-adaptive algorithm based on grammar-guided genetic programming to generate Ensembles of Multi-Label Classifiers based on projections of k labels (AG3P-kEMLC). It creates a tree-shaped ensemble, where each leaf is a multi-label classifier focused on a subset of k labels. Unlike other methods in the literature, our proposal can deal with different values of k in the same ensemble, instead of fixing one specific value. It also includes an auto-adaptive process to reduce the number of hyper-parameters to tune, prevent overfitting and reduce the runtime required to execute it. Three versions of the algorithm are proposed. The first, fixed, uses the same value of k for all multi-label classifiers in the ensemble. The remaining two deal with different k values in the ensemble: uniform gives the same probability to choose each available value of k, and gaussian favors the selection of smaller values of k. The experimental study carried out considering twenty reference datasets and five evaluation metrics, compared with eleven ensemble methods demonstrates that our proposal performs significantly better than the state-of-the-art methods.
The representation of individuals in geneticprogramming (GP) has a large impact on the evolutionary process. In this work, we investigate the evolutionary process of three grammar-guided GP (GGGP) methods, Context-Fr...
详细信息
ISBN:
(纸本)9798400701207
The representation of individuals in geneticprogramming (GP) has a large impact on the evolutionary process. In this work, we investigate the evolutionary process of three grammar-guided GP (GGGP) methods, Context-Free grammars GP (CFG-GP), Grammatical Evolution (GE) and Structured Grammatical Evolution (SGE), in the context of the complex, real-world problem of predicting the glucose level of people with diabetes two hours ahead of time. Our analysis differs from previous analyses by (1) comparing all three methods on a complex benchmark, (2) implementing the methods in the same framework, allowing a fairer comparison, and (3) analyzing the evolutionary process outside of performance. We conclude that representation choice is more impactful with a higher maximum depth, and that CFG-GP better explores the search space for deeper trees, achieving better results. Furthermore, we find that CFG-GP relies more on feature construction, whereas GE and SGE rely more on feature selection. Finally, we altered the GGGP methods in two ways: using is an element of-lexicase selection, which solved the overfitting problem of CFG-GP;and with a penalization of complex trees, to create more interpretable trees. Combining is an element of-lexicase selection with CFG-GP performed best.
We formulate a formal grammar to generate Full Approximation Scheme multigrid solvers. Then, using grammar-guided genetic programming we perform a multiobjective optimization to find optimal instances of such solvers ...
详细信息
ISBN:
(纸本)9798400701207
We formulate a formal grammar to generate Full Approximation Scheme multigrid solvers. Then, using grammar-guided genetic programming we perform a multiobjective optimization to find optimal instances of such solvers for a given nonlinear system of equations. This approach is evaluated for a two-dimensional Poisson problem with added nonlinearities. We observe that the evolved solvers outperform the baseline methods by having a faster runtime and a higher convergence rate.
Feature Learning (FL) is key to well-performing machine learning models. However, the most popular FL methods lack interpretability, which is becoming a critical requirement of Machine Learning. We propose to incorpor...
详细信息
ISBN:
(数字)9783031295737
ISBN:
(纸本)9783031295720;9783031295737
Feature Learning (FL) is key to well-performing machine learning models. However, the most popular FL methods lack interpretability, which is becoming a critical requirement of Machine Learning. We propose to incorporate information from the problem domain in the structure of programs on top of the existing M3GP approach. This technique, named Domain-Knowledge M3GP, works by defining the possible feature transformations using a grammar through grammar-guided genetic programming. While requiring the user to specify the domain knowledge, this approach has the advantage of limiting the search space, excluding programs that make no sense to humans. We extend this approach with the possibility of introducing complex, aggregating queries over historic data. This extension allows to expand the search space to include relevant programs that were not possible before. We evaluate our methods on performance and interpretability in 6 use cases, showing promising results in both areas. We conclude that performance and interpretability of FL methods can benefit from domain-knowledge incorporation and aggregation, and give guidelines on when to use them.
Since geneticprogramming (GP) has been proposed, several flavors of GP have arisen, each with their own strengths and limitations. grammar-guided and Strongly-Typed GP (GGGP and STGP, respectively) are two popular fl...
详细信息
ISBN:
(纸本)9798400701191
Since geneticprogramming (GP) has been proposed, several flavors of GP have arisen, each with their own strengths and limitations. grammar-guided and Strongly-Typed GP (GGGP and STGP, respectively) are two popular flavors that have the advantage of allowing the practitioner to impose syntactic and semantic restrictions on the generated programs. GGGP makes use of (traditionally contextfree) grammars to restrict the generation of (and the application of genetic operators on) individuals. By guiding this generation according to a grammar, i.e. a set of rules, GGGP improves performance by searching for an good-enough solution on a subset of the search space. This approach has been extended with Attribute grammars to encode semantic restrictions, while ContextFree grammars would only encode syntactic restrictions. STGP is also able to restrict the shape of the generated programs using a very simple grammar together with a type system. In this work, we address the question of which approach has more expressive power. We demonstrate that STGP has higher expressive power than Context-Free GGGP and less expressive power than Attribute Grammatical Evolution.
geneticprogramming (GP) is an heuristic method that can be applied to many Machine Learning, Optimization and Engineering problems. In particular, it has been widely used in Software Engineering for Test-case generat...
详细信息
ISBN:
(纸本)9781450399203
geneticprogramming (GP) is an heuristic method that can be applied to many Machine Learning, Optimization and Engineering problems. In particular, it has been widely used in Software Engineering for Test-case generation, Program Synthesis and Improvement of Software (GI). grammar-guided genetic programming (GGGP) approaches allow the user to refine the domain of valid program solutions. Backus Normal Form is the most popular interface for describing Context-Free grammars (CFG) for GGGP. BNF and its derivatives have the disadvantage of interleaving the grammar language and the target language of the program. We propose to embed the grammar as an internal Domain-Specific Language in the host language of the framework. This approach has the same expressive power as BNF and EBNF while using the host language type-system to take advantage of all the existing tooling: linters, formatters, type-checkers, autocomplete, and legacy code support. These tools have a practical utility in designing software in general, and GP systems in particular. We also present Meta-Handlers, user-defined overrides of the tree-generation system. This technique extends our object-oriented encoding with more practicability and expressive power than existing CFG approaches, achieving the same expressive power of Attribute grammars, but without the grammar vs target language duality. Furthermore, we evidence that this approach is feasible, showing an example Python implementation as proof. We also compare our approach against textual BNF-representations w.r.t. expressive power and ergonomics. These advantages do not come at the cost of performance, as shown by our empirical evaluation on 5 benchmarks of our example implementation against PonyGE2. We conclude that our approach has better ergonomics with the same expressive power and performance of textual BNF-based grammar encodings.
grammar-guided genetic programming is widely recognised as one of the most successful approaches for program synthesis, i.e., the task of automatically discovering an executable piece of code given user intent. Gramma...
详细信息
ISBN:
(纸本)9783031220388;9783031220395
grammar-guided genetic programming is widely recognised as one of the most successful approaches for program synthesis, i.e., the task of automatically discovering an executable piece of code given user intent. grammar-guided genetic programming has been shown capable of successfully evolving programs in arbitrary languages that solve several program synthesis problems based only on a set of input-output examples. Despite its success, the restriction on the evolutionary system to only leverage input/output error rate during its assessment of the programs it derives limits its scalability to larger and more complex program synthesis problems. With the growing number and size of open software repositories and generative artificial intelligence approaches, there is a sizeable and growing number of approaches for retrieving/generating source code based on textual problem descriptions. Therefore, it is now, more than ever, time to introduce G3P to other means of user intent (particularly textual problem descriptions). In this paper, we would like to assess the potential for G3P to evolve programs based on their similarity to particular target codes of interest (obtained using some code retrieval/generative approach). We particularly assess 4 similarity measures from various fields: text processing (i.e., FuzzyWuzzy), natural language processing (i.e., Cosine Similarity based on term frequency), software clone detection (i.e., CCFinder), plagiarism detector(i.e., SIM). Through our experimental evaluation on a well-known program synthesis benchmark, we have shown that G3P successfully manages to evolve some of the desired programs with three of the used similarity measures. However, in its default configuration, G3P is not as successful with similarity measures as with the classical input/output error rate at evolving solving program synthesis problems.
grammar-guided genetic programming (G3P) is widely recognised as one of the most successful approaches for program synthesis, i.e., the task of automatically discovering an executable piece of code given user intent. ...
详细信息
ISBN:
(数字)9781665467087
ISBN:
(纸本)9781665467087
grammar-guided genetic programming (G3P) is widely recognised as one of the most successful approaches for program synthesis, i.e., the task of automatically discovering an executable piece of code given user intent. G3P has been shown capable of successfully evolving programs in arbitrary languages that solve several program synthesis problems based only on a set of input/output examples. Despite its success, the restriction on the evolutionary system to only leverage input/output error rate during its assessment of the programs it derives limits its scalability to larger and more complex program synthesis problems. With the growing number and size of open software repositories and generative artificial intelligence approaches, there is a sizeable and growing number of approaches for retrieving/generating source code (potentially several partial snippets) based on textual problem descriptions. Therefore, it is now, more than ever, time to introduce G3P to other means of user intent (particularly textual problem descriptions). In this paper, we would like to assess the potential for G3P to evolve programs based on their similarity to particular target codes of interest (obtained using some code retrieval/generative approach). Through our experimental evaluation on a well-known program synthesis benchmark, we have shown that G3P successfully manages to evolve some of the desired programs with all four considered similarity measures. However, in its default configuration, G3P is not as successful with similarity measures as it is with the classical input/output error rate when solving program synthesis problems. Therefore, we propose a novel multi-objective G3P approach that combines the similarity to the target program and the traditional input/output error rate. Our experiments show that compared to the error-based G3P, the multi-objective G3P approach could improve the success rate of specific problems and has great potential to improve on the traditional G3P syste
暂无评论