Decision trees (DTs) are a popular technique in the field of eXplainable Artificial Intelligence. Despite their effectiveness in solving various classification problems, they are not well suited to modern biological data generated with high-throughput technologies. This work aims to combine evolutionarily induced DTs with a recently developed concept designed directly for gene expression data, called Relative eXpression Analysis (RXA). We propose a new solution, termed Evolutionary Heterogeneous Decision Tree (EvoHDTree), which uses both classical univariate tests and bivariate tests that focus on the relative ordering and weight relationships between genes in the splitting nodes. The search for the decision tree structure, node representation, and splits is performed globally by the evolutionary algorithm. To meet the huge computational demands, we enriched our solution with more than a dozen specialized variants of recombination operators, GPU-computed local search components, OpenMP parallelization, and built-in gene ranking to improve evolutionary convergence. Experiments performed on cancer-related gene expression-based data show that the proposed solution finds accurate and much simpler interactions between genes. Importantly, the patterns discovered by EvoHDTree are easy to understand and, to some extent, supported by biological evidence in the literature.
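To make the two kinds of splitting tests concrete, here is a minimal Python sketch of a univariate test and a bivariate relative-expression test. It is an illustration under stated assumptions, not the EvoHDTree implementation: the class names, the `goes_left` method, the `route` helper, the left/right convention, and the exact form of the weighted comparison are all hypothetical and chosen only to mirror the description above.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class UnivariateTest:
    """Classical split: one gene's expression compared with a threshold."""
    gene: int          # index of the gene in the sample vector (hypothetical layout)
    threshold: float

    def goes_left(self, sample: Sequence[float]) -> bool:
        return sample[self.gene] < self.threshold


@dataclass(frozen=True)
class RelativeExpressionTest:
    """Bivariate split: compares two genes within the same sample.

    With weight == 1.0 only the relative ordering of the two genes matters.
    Any other weight asks whether gene_a is below `weight` times gene_b
    (an assumed form of the "weight relationship" between genes).
    """
    gene_a: int
    gene_b: int
    weight: float = 1.0

    def goes_left(self, sample: Sequence[float]) -> bool:
        return sample[self.gene_a] < self.weight * sample[self.gene_b]


def route(tests, sample):
    """Evaluate each test on one sample and report the branch it selects."""
    return ["left" if t.goes_left(sample) else "right" for t in tests]


# Toy expression values for three genes (made-up numbers, not real data).
sample = [2.0, 5.0, 1.5]
tests = [
    UnivariateTest(gene=0, threshold=3.0),
    RelativeExpressionTest(gene_a=0, gene_b=1),
    RelativeExpressionTest(gene_a=1, gene_b=2),
    RelativeExpressionTest(gene_a=1, gene_b=2, weight=4.0),
]
decisions = route(tests, sample)

# Multiplying every value in the sample by a positive constant leaves the
# pairwise tests unchanged, while the fixed-threshold test can flip.
scaled_decisions = route(tests, [10.0 * x for x in sample])
```

In this sketch a node holding either kind of test exposes the same `goes_left` interface, which is one simple way a tree could mix test types across nodes. The scaled sample shows why pairwise comparisons are attractive for expression data: they depend only on the relationship between two genes within one sample, not on the absolute measurement scale.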
CITATION STYLE
Czajkowski, M., Jurczuk, K., & Kretowski, M. (2021). Accelerated evolutionary induction of heterogeneous decision trees for gene expression-based classification. In GECCO 2021 - Proceedings of the 2021 Genetic and Evolutionary Computation Conference (pp. 946–954). Association for Computing Machinery, Inc. https://doi.org/10.1145/3449639.3459376