We propose a novel methodology for binary and multiclass classification that uses genetic programming to construct features for a nearest centroid classifier. The method, coined M4GP, improves upon earlier approaches in this vein (M2GP and M3GP) by simplifying the program encoding, using advanced selection methods, and archiving solutions during the run. In our recent paper, we test this strategy against traditional GP formulations of the classification problem, showing that this framework outperforms boolean and floating point encodings. In comparison to several machine learning techniques, M4GP achieves the best overall ranking on benchmark problems. We then compare our algorithm against state-of-the-art machine learning approaches to the task of disease classification using simulated genetics datasets with up to 5000 features. The results suggest that our proposed approach performs on par with the best results in the literature while requiring less computation time and producing simpler models.
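To make the core idea concrete, the following is a minimal sketch of how a set of evolved programs could be scored as features for a nearest centroid classifier. It is not the M4GP implementation: the function names (`transform`, `fit_centroids`, `predict`, `fitness`) are hypothetical, the programs are hand-written stand-ins for what genetic programming would evolve, and Euclidean distance is an assumption, since the abstract does not specify the distance metric. The search itself (encoding, selection, archiving) is omitted.

```python
from math import dist  # Euclidean distance, Python 3.8+ (assumed metric)

def transform(programs, X):
    """Map each raw sample to the feature space defined by the programs."""
    return [[p(x) for p in programs] for x in X]

def fit_centroids(Z, y):
    """Compute the mean transformed vector of each class."""
    sums, counts = {}, {}
    for z, label in zip(Z, y):
        if label not in sums:
            sums[label] = [0.0] * len(z)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], z)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def predict(centroids, Z):
    """Assign each transformed sample to the class with the nearest centroid."""
    return [min(centroids, key=lambda c: dist(centroids[c], z)) for z in Z]

def fitness(programs, X, y):
    """Training accuracy of the nearest centroid classifier on the new features."""
    Z = transform(programs, X)
    centroids = fit_centroids(Z, y)
    preds = predict(centroids, Z)
    return sum(p == t for p, t in zip(preds, y)) / len(y)

# Toy XOR-like data: the class depends on the interaction of two features,
# not on either feature alone.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]

# Raw features: both class centroids coincide, so they cannot be separated.
raw = [lambda x: x[0], lambda x: x[1]]

# A single constructed feature (hypothetical output of the GP search).
constructed = [lambda x: (x[0] - x[1]) ** 2]

raw_score = fitness(raw, X, y)
constructed_score = fitness(constructed, X, y)
```

On this toy data the raw features give identical class centroids, while the single constructed feature separates the classes. That is the intuition behind searching for a feature transformation rather than evolving a classifier directly, particularly when class membership depends on feature interactions.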
CITATION STYLE
Cava, W. L., Spector, L., Silva, S., Vanneschi, L., Danai, K., & Moore, J. H. (2018). A multidimensional genetic programming approach for identifying epistatic gene interactions. In GECCO 2018 Companion - Proceedings of the 2018 Genetic and Evolutionary Computation Conference Companion (pp. 23–24). Association for Computing Machinery, Inc. https://doi.org/10.1145/3205651.3208217