Pruning decision trees via max-heap projection

Abstract

The decision tree model is popular in both academia and industry because it can learn highly non-linear decision boundaries while preserving interpretability, which usually translates into transparent decision-making. Learning robust decision trees, however, has been a longstanding challenge: the learning process is usually sensitive to the data, and the heuristic, greedy nature of many existing tree learning algorithms leads to overfitted tree structures. Pruning is therefore usually applied as an ad-hoc post-processing step, guided by intuitive statistical justification rather than by a rigorous optimization formulation. Motivated by recent developments in sparse learning, we propose a novel formulation that recognizes a connection between decision tree post-pruning and sparse learning: the tree structure is embedded in the sparse learning framework through a max-heap constraint together with a sparsity constraint. The formulation leads to a non-convex optimization problem that can be solved by an iterative shrinkage algorithm whose proximal operator is computed by an efficient max-heap projection algorithm. We further propose a stability selection method that enables robust model selection in practice and guarantees that the selected nodes preserve the tree structure. Extensive experiments show that the proposed method achieves better predictive performance than many existing benchmark methods across a wide range of real-world datasets.
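
To illustrate why a max-heap constraint can encode tree structure, the following minimal Python sketch assumes one nonnegative weight per tree node and requires each node's weight to be no larger than its parent's. Under that assumption, the nodes with nonzero weight always form a subtree containing the root, which is what a pruned tree looks like. The function names, the parent-array encoding, and the example weights are hypothetical and are not taken from the paper; the sketch does not implement the paper's projection algorithm.

```python
from typing import List, Optional, Set


def satisfies_max_heap(weights: List[float], parent: List[Optional[int]]) -> bool:
    """Check the assumed constraint: weights are nonnegative and
    no node has a larger weight than its parent."""
    return all(
        w >= 0 and (p is None or weights[p] >= w)
        for w, p in zip(weights, parent)
    )


def support(weights: List[float], tol: float = 0.0) -> Set[int]:
    """Indices of nodes whose weight exceeds tol (the nodes that are kept)."""
    return {i for i, w in enumerate(weights) if w > tol}


def is_rooted_subtree(nodes: Set[int], parent: List[Optional[int]]) -> bool:
    """True if every kept node's parent is also kept."""
    return all(parent[i] is None or parent[i] in nodes for i in nodes)


# Hypothetical 7-node binary tree; parent[i] is the parent of node i.
parent = [None, 0, 0, 1, 1, 2, 2]

# Weights that respect the constraint: the kept nodes form a pruned tree.
heap_weights = [1.0, 0.8, 0.3, 0.8, 0.0, 0.0, 0.0]
kept = support(heap_weights)

# Weights that violate it: node 3 is kept although its parent (node 1) is not.
bad_weights = [1.0, 0.0, 0.3, 0.5, 0.0, 0.0, 0.0]
bad_kept = support(bad_weights)
```

In this toy setting, sparsity alone (as in `bad_weights`) can select a node whose parent was dropped, while adding the max-heap ordering rules that out.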

Cite

Nie, Z., Lin, B., Huang, S., Ramakrishnan, N., Fan, W., & Ye, J. (2017). Pruning decision trees via max-heap projection. In Proceedings of the 17th SIAM International Conference on Data Mining, SDM 2017 (pp. 10–18). Society for Industrial and Applied Mathematics Publications. https://doi.org/10.1137/1.9781611974973.2
