Shattering Inequalities for Learning Optimal Decision Trees

Abstract

Recently, mixed-integer programming (MIP) techniques have been applied to learn optimal decision trees. Empirical research has shown that optimal trees typically have better out-of-sample performance than heuristic approaches such as CART. However, the underlying MIP formulations often suffer from slow runtimes due to weak linear programming (LP) relaxations. In this paper, we first propose a new MIP formulation for learning optimal decision trees with multivariate branching rules and no assumptions on the feature types. Our formulation crucially employs binary variables expressing how each observation is routed throughout the entire tree. We then introduce a new class of valid inequalities for learning optimal multivariate decision trees. Each inequality encodes an inclusion-minimal set of points that cannot be shattered by a multivariate split, and in the context of a MIP formulation, the inequalities are sparse, involving at most the number of features plus two variables. We leverage these valid inequalities within a Benders-like decomposition, where the master problem determines how to route each observation to a leaf node to minimize misclassification error, and the subproblem checks whether, for each branch node of the decision tree, it is possible to construct a multivariate split that realizes the given routing of observations; if not, the subproblem adds at least one of our valid inequalities to the master problem. We demonstrate through numerical experiments that our MIP approach outperforms (in terms of training accuracy, testing accuracy, solution time, and relative gap) two other popular MIP formulations, and is able to improve both in- and out-of-sample performance while remaining competitive in solution time with a wide range of popular approaches from the literature.
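The subproblem described above must decide, for each branch node, whether the observations routed left and right can be realized by a single multivariate split w'x + b. That decision is a linear-separability feasibility check, which can be posed as an LP. The following is a minimal sketch of such a check, not the authors' implementation; the function name `linearly_separable` and the unit-margin normalization are illustrative assumptions.

```python
# Illustrative sketch (not the paper's code): test whether two point sets can
# be shattered by one multivariate split via an LP feasibility problem.
import numpy as np
from scipy.optimize import linprog

def linearly_separable(left, right):
    """Return True iff some hyperplane w'x + b sends every point in `left`
    to w'x + b <= -1 and every point in `right` to w'x + b >= +1
    (a unit-margin normalization of strict separability)."""
    L, R = np.asarray(left, float), np.asarray(right, float)
    d = L.shape[1]
    # Zero objective: we only care about feasibility. Variables are (w, b).
    A_ub = np.vstack([
        np.hstack([L, np.ones((len(L), 1))]),    #  w'a + b <= -1  for a in left
        np.hstack([-R, -np.ones((len(R), 1))]),  # -(w'r + b) <= -1 for r in right
    ])
    b_ub = -np.ones(len(L) + len(R))
    res = linprog(c=np.zeros(d + 1), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * (d + 1))
    return res.status == 0  # 0 = feasible; infeasible means "not shatterable"

# The XOR configuration cannot be shattered by one multivariate split, so a
# Benders-style subproblem would reject this routing and add a cut:
print(linearly_separable([(0, 0)], [(1, 1)]))                  # separable
print(linearly_separable([(0, 0), (1, 1)], [(0, 1), (1, 0)]))  # not separable
```

When the check fails, the inclusion-minimal unshatterable subset of the routed points is what the paper's valid inequality encodes, which is why the resulting cuts are sparse.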

Citation (APA)

Boutilier, J. J., Michini, C., & Zhou, Z. (2022). Shattering Inequalities for Learning Optimal Decision Trees. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13292 LNCS, pp. 74–90). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-08011-1_7
