In most data mining systems, decision trees are induced in a top-down manner. This greedy method is fast but can fail on certain classification problems. As an alternative, a global approach based on evolutionary algorithms (EAs) can be applied. We developed the Global Decision Tree (GDT) system, which learns the tree structure and the tests in a single run of the EA. Specialized genetic operators allow the system to exchange parts of trees, generate new sub-trees, prune existing ones, and change the node type and the tests. The system is able to induce univariate, oblique, and mixed decision trees. In this paper, we investigate how the GDT system can profit from parallelization on a compute cluster. Both a parallel implementation and a distributed version of the induction are considered, and significant speedups are obtained. Preliminary experimental results show that, at least for certain problems, the distributed version of the GDT system is more accurate than its panmictic predecessor. © 2008 Springer-Verlag Berlin Heidelberg.
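The contrast between a panmictic population and distributed evolution can be illustrated with a minimal island-model sketch. This is not the GDT implementation: the bit-string individuals, the `fitness` function (a toy stand-in for tree quality), the ring migration scheme, and all parameter names and values are assumptions chosen only for illustration. The islands are also stepped serially here, whereas a distributed version would presumably place each sub-population on its own cluster node.

```python
import random

def fitness(bits):
    # Toy stand-in for tree quality (assumption): number of 1-bits.
    return sum(bits)

def mutate(bits, rng, rate=0.05):
    # Flip each bit independently with probability `rate`.
    return [b ^ 1 if rng.random() < rate else b for b in bits]

def step(pop, rng):
    # One generation: keep the best individual (elitism), then fill the
    # population with mutated winners of binary tournaments.
    children = [max(pop, key=fitness)]
    while len(children) < len(pop):
        a, b = rng.sample(pop, 2)
        children.append(mutate(max(a, b, key=fitness), rng))
    return children

def migrate(islands):
    # Ring topology (assumption): a copy of each island's best individual
    # replaces the worst individual of the next island.
    bests = [max(pop, key=fitness) for pop in islands]
    for i, pop in enumerate(islands):
        worst = min(range(len(pop)), key=lambda j: fitness(pop[j]))
        pop[worst] = list(bests[i - 1])

def evolve_islands(n_islands=4, pop_size=10, n_bits=20, generations=40,
                   migration_interval=5, seed=0):
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(n_bits)]
                for _ in range(pop_size)] for _ in range(n_islands)]
    for g in range(1, generations + 1):
        # Each island evolves in isolation between migrations.
        islands = [step(pop, rng) for pop in islands]
        if g % migration_interval == 0:
            migrate(islands)
    return max((ind for pop in islands for ind in pop), key=fitness)
```

Setting `n_islands=1` reduces the sketch to a single panmictic population, which makes the two settings easy to compare on the same toy problem.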
CITATION STYLE
Krȩtowski, M., & Popczyński, P. (2008). Global induction of decision trees: From parallel implementation to distributed evolution. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5097 LNAI, pp. 426–437). https://doi.org/10.1007/978-3-540-69731-2_42