An information-theoretic framework is presented for the development and analysis of the ensemble learning approach of genetic programming. As evolution proceeds, this approach suggests that the mutual information between the target and models should: (i) not decrease in the population; (ii) concentrate in fewer individuals; and (iii) be distilled from the inputs, eliminating excess entropy. Normalised information-theoretic indices are developed to measure fitness and diversity of ensembles, without a priori knowledge of how the multiple constituent models might be composed into a single model. With the use of these indices for reproductive and survival selection, building blocks are less likely to be lost and more likely to be recombined. Price's Theorem is generalised to pair selection, from which it follows that the heritability of information should be stronger than the heritability of error, improving evolvability. We support these arguments with simulations using a logic function benchmark and a time series application. For a chaotic time series prediction problem, for instance, the proposed approach avoids familiar difficulties (premature convergence, deception, poor scaling, and early loss of needed building blocks) of standard GP symbolic regression systems; information-based fitness functions showed strong intergenerational correlations, as required by Price's Theorem.
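To make the central idea concrete, here is a minimal sketch (not the authors' code) of a normalised mutual-information fitness for a model's discrete outputs against a target, of the kind a logic-function benchmark would use. The function names (`entropy`, `mutual_information`, `nmi_fitness`) and the normalisation by the target's entropy are illustrative assumptions, not the paper's exact indices.

```python
from collections import Counter
from math import log2

def entropy(xs):
    """Shannon entropy (in bits) of a sequence of discrete symbols."""
    n = len(xs)
    return -sum((c / n) * log2(c / n) for c in Counter(xs).values())

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for paired discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def nmi_fitness(target, outputs):
    """Normalised MI in [0, 1]: the fraction of the target's entropy
    captured by the model's outputs. A value of 1.0 means the target
    is fully determined by the outputs (up to relabelling), so such a
    fitness rewards informative building blocks even before the
    constituent models are composed into a single predictor."""
    h_t = entropy(target)
    return mutual_information(target, outputs) / h_t if h_t else 0.0

# XOR illustration: the XOR of two inputs carries full information
# about itself, while either input alone carries none about the XOR.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
y = [a ^ b for a, b in zip(x1, x2)]
print(nmi_fitness(y, y))   # 1.0
print(nmi_fitness(y, x1))  # 0.0
```

The XOR case illustrates why such an index can preserve building blocks that error-based fitness discards: a sub-expression with zero accuracy may still carry mutual information with the target in combination with others.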
Card, S. W., & Mohan, C. K. (2007). Towards an Information Theoretic Framework for Genetic Programming. In Genetic Programming Theory and Practice V (pp. 87–106). Springer US. https://doi.org/10.1007/978-0-387-76308-8_6