Abstract
Representation learning is the area concerned with learning data representations that make it easier for machine learning algorithms to extract useful information from them. Deep learning currently provides the most effective methods for this task and can learn distributed representations, also known as embeddings, that capture different properties of the data and the relationships between them. In this direction, this paper introduces a new way to look at tree-like GP individuals for symbolic regression. Given a set of predefined operators and a sufficiently large number of solutions sampled from the search space, we train a transformer to learn an encoding/decoding function. By transforming a tree representation into a distributed representation, we can measure distances between trees much more efficiently and, more importantly, open the possibility for these representations to capture semantics. We show that the distance computed over the embeddings produces results very similar to those of a tree-edit distance, which reflects the syntactic similarity between trees. Although the model as it stands does not yet capture semantics, we show its potential by using the learned tree-representation model in a simple task: measuring distances between trees in a fitness-sharing scenario.
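To make the fitness-sharing use case concrete, the sketch below computes shared fitness from pairwise distances between tree embeddings. It is a minimal sketch under stated assumptions, not the paper's implementation: the function and parameter names (shared_fitness, sigma, alpha), the choice of Euclidean distance over the embeddings, and the toy vectors are illustrative only; the sharing kernel is the classic one from the fitness-sharing literature.

```python
import numpy as np

def pairwise_distances(embeddings: np.ndarray) -> np.ndarray:
    """Euclidean distance between every pair of tree embeddings."""
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    return np.linalg.norm(diffs, axis=-1)

def shared_fitness(raw_fitness: np.ndarray,
                   embeddings: np.ndarray,
                   sigma: float = 1.0,
                   alpha: float = 1.0) -> np.ndarray:
    """Classic fitness sharing: divide each raw fitness by its niche count,
    where the niche count sums a sharing kernel over embedding distances."""
    d = pairwise_distances(embeddings)
    sh = np.where(d < sigma, 1.0 - (d / sigma) ** alpha, 0.0)
    niche_counts = sh.sum(axis=1)  # includes self (d = 0 gives sh = 1)
    return raw_fitness / niche_counts

# Toy usage: four individuals with three-dimensional embeddings.
emb = np.array([[0.0, 0.1, 0.20],
                [0.0, 0.1, 0.25],   # close to the first, so they share a niche
                [1.0, -0.5, 0.30],
                [2.0, 0.0, -1.00]])
fit = np.array([0.9, 0.85, 0.7, 0.6])
print(shared_fitness(fit, emb, sigma=0.5))
```

In this scheme, individuals whose embeddings fall within sigma of each other penalize one another's fitness, so crowded regions of the representation space lose selective pressure; any distance computed on the learned embeddings could be substituted for the Euclidean one used here.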
Cite
Caetano, V., Teixeira, M. C., & Pappa, G. L. (2023). Symbolic regression trees as embedded representations. In GECCO 2023 - Proceedings of the 2023 Genetic and Evolutionary Computation Conference (pp. 411–419). Association for Computing Machinery, Inc. https://doi.org/10.1145/3583131.3590423