In recent years, different approaches have been proposed to introduce semantic information into genetic programming. In particular, geometric semantic genetic programming (GSGP) and the interesting properties of its evolutionary operators have attracted the attention of the community. This paper focuses on the use of GSGP to solve symbolic regression problems, where the semantics of an individual is defined by the set of outputs it produces when applied to the training cases. In this scenario, both the mutation and crossover operators defined for a fitness function based on the Manhattan distance use randomly built functions to generate offspring. However, the outputs of these random functions are not guaranteed to be uniformly distributed in the semantic space, because the functions are generated in the syntactic space. We hypothesize that the non-uniformity of the semantics of these functions may bias the search, and we propose three standard normalization techniques to improve the distribution of the outputs of these random functions over the semantic space. The results are compared with a popular strategy that uses a logistic function as a wrapper for the outputs, and show that the tested strategies can improve on the results of that method. The experimental analysis also indicates that a more uniform distribution of the semantics of these functions does not necessarily imply better results in terms of test error.
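The role of the random functions can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the usual GSGP operator forms (mutation as parent + step * (R1 - R2), crossover as a per-case convex combination weighted by a random function with outputs in [0, 1]) and works directly on semantic vectors, i.e. lists of outputs on the training cases. The names `logistic`, `min_max`, `gs_mutation`, and `gs_crossover` are hypothetical, and min-max scaling is used only as one example of a standard normalization; the abstract does not state which three techniques the paper evaluates.

```python
import math

def logistic(values):
    # Logistic wrapper: squashes each raw output into (0, 1).
    # Large-magnitude outputs saturate near 0 or 1.
    return [1.0 / (1.0 + math.exp(-v)) for v in values]

def min_max(values):
    # Illustrative normalization (an assumption, not necessarily one of the
    # paper's three): rescale outputs linearly to [0, 1] using the observed
    # minimum and maximum over the training cases.
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]  # constant function: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]

def gs_mutation(parent, r1, r2, step, wrap):
    # Offspring semantics: parent + step * (wrap(R1) - wrap(R2)), case by case.
    a, b = wrap(r1), wrap(r2)
    return [p + step * (x - y) for p, x, y in zip(parent, a, b)]

def gs_crossover(p1, p2, r, wrap):
    # Offspring semantics: per-case convex combination of the two parents,
    # weighted by the wrapped random function output in [0, 1].
    w = wrap(r)
    return [wi * a + (1.0 - wi) * b for wi, a, b in zip(w, p1, p2)]

# Hypothetical raw outputs of one random function on five training cases.
raw = [-50.0, -40.0, 0.1, 30.0, 60.0]
print(logistic(raw))  # values pile up near the ends of the interval
print(min_max(raw))   # values are spread linearly over [0, 1]
```

Passing `logistic` or `min_max` as the `wrap` argument lets the same operators be compared under the two wrapping strategies. As the abstract notes, a more uniform spread of these outputs does not by itself guarantee lower test error.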
CITATION
Oliveira, L. O. V. B., Casadei, F., & Pappa, G. L. (2017). Strategies for improving the distribution of random function outputs in GSGP. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10196 LNCS, pp. 164–177). Springer Verlag. https://doi.org/10.1007/978-3-319-55696-3_11