Ensemble methods based on trees, such as Random Forests, AdaBoost and gradient boosting, are widely recognized as among the best off-the-shelf classifiers: they typically achieve state-of-the-art accuracy on many problems with little hyperparameter tuning, and they are often used in applications, possibly combined with other methods such as neural nets. While many variations of forest methods exist, using different diversity mechanisms (such as bagging, feature sampling or boosting), nearly all rely on training individual trees in a highly suboptimal way, via greedy top-down tree induction algorithms such as CART or C5.0. We study forests where each tree is trained on a bootstrapped or random sample, but with the recently proposed tree alternating optimization (TAO) algorithm, which is able to learn trees that have both fewer nodes and lower error. The better optimization of individual trees translates into forests that achieve higher accuracy while using fewer, smaller trees with oblique nodes. We demonstrate this on a range of datasets and with a careful study of the complementary effect of optimization and diversity in the construction of the forest. These bagged TAO trees improve consistently and by a considerable margin over Random Forests, AdaBoost, gradient boosting and other forest algorithms on every single dataset we tried.
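To make the construction concrete, the following is a minimal sketch of the bagging scheme the abstract describes: each tree is fit on an independent bootstrap sample and predictions are combined by majority vote. TAO is not available in standard libraries, so `DecisionTreeClassifier` stands in here for a TAO-trained oblique tree purely to make the loop runnable; the function names and parameters are hypothetical, not from the paper.

```python
# Hedged sketch of a bagged tree ensemble (bootstrap + majority vote).
# DecisionTreeClassifier is a stand-in for a TAO oblique tree.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def fit_bagged_forest(X, y, n_trees=30, seed=0):
    """Train n_trees learners, each on an independent bootstrap sample."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    trees = []
    for _ in range(n_trees):
        idx = rng.integers(0, n, size=n)       # bootstrap: sample rows with replacement
        tree = DecisionTreeClassifier()        # stand-in for a TAO-trained tree
        tree.fit(X[idx], y[idx])
        trees.append(tree)
    return trees


def predict_bagged_forest(trees, X):
    """Combine the ensemble by per-sample majority vote over integer labels."""
    votes = np.stack([t.predict(X) for t in trees])          # shape: (n_trees, n_samples)
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)


X, y = load_breast_cancer(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
forest = fit_bagged_forest(Xtr, ytr, n_trees=30, seed=0)
print("test accuracy:", np.mean(predict_bagged_forest(forest, Xte) == yte))
```

In the paper's setting, the stand-in learner would be replaced by TAO, which refines an initial tree by alternating optimization over its (oblique) node parameters; the bagging loop itself is unchanged.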
Citation
Carreira-Perpiñán, M. A., & Zharmagambetov, A. (2020). Ensembles of bagged TAO trees consistently improve over Random Forests, AdaBoost and gradient boosting. In FODS 2020: Proceedings of the 2020 ACM-IMS Foundations of Data Science Conference (pp. 35–46). Association for Computing Machinery. https://doi.org/10.1145/3412815.3416882