Decision trees: More theoretical justification for practical algorithms

Amos Fiat; Dmitry Pechyony

Conference Proceedings

Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (2004) 3244 156-170

DOI: 10.1007/978-3-540-30215-5_13

Abstract

We study impurity-based decision tree algorithms such as CART and C4.5 in order to better understand their theoretical underpinnings. We consider such algorithms on special forms of functions and distributions: specifically, the uniform distribution and functions that can be described as Boolean linear threshold functions or read-once DNF. We show that for Boolean linear threshold functions and read-once DNF, maximal purity gain and maximal influence are logically equivalent. This leads us to the exact identification of these classes of functions by impurity-based algorithms given sufficiently many noise-free examples. We show that the decision tree resulting from these algorithms has minimal size and height among all decision trees representing the function. Based on the statistical query learning model, we introduce a noise-tolerant version of practical decision tree algorithms. We show that if the input examples have small classification noise and are uniformly distributed, then all our results for practical noise-free impurity-based algorithms also hold for their noise-tolerant version. © Springer-Verlag Berlin Heidelberg 2004.
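The claimed agreement between maximal purity gain and maximal influence can be checked by brute force on a small example. The sketch below is illustrative only and is not taken from the paper: it assumes the Gini index as the impurity function (the one CART uses), takes the influence of a variable to be the probability, under the uniform distribution, that flipping that variable changes the function value, and uses a hypothetical three-variable threshold function. All helper names are made up for this example.

```python
from itertools import product


def gini(p):
    """Gini impurity of a node whose fraction of positive examples is p."""
    return 2.0 * p * (1.0 - p)


def influence(f, n, i):
    """Probability over uniform x in {0,1}^n that flipping bit i changes f(x)."""
    changed = 0
    for x in product((0, 1), repeat=n):
        y = list(x)
        y[i] ^= 1
        if f(x) != f(tuple(y)):
            changed += 1
    return changed / 2 ** n


def purity_gain(f, n, i, impurity=gini):
    """Impurity drop from splitting on bit i, computed on the full truth table."""
    points = list(product((0, 1), repeat=n))

    def frac_pos(subset):
        return sum(f(x) for x in subset) / len(subset)

    zeros = [x for x in points if x[i] == 0]
    ones = [x for x in points if x[i] == 1]
    # Under the uniform distribution each branch has probability 1/2.
    children = 0.5 * impurity(frac_pos(zeros)) + 0.5 * impurity(frac_pos(ones))
    return impurity(frac_pos(points)) - children


def f(x):
    """Hypothetical example: the threshold function 3*x0 + 2*x1 + x2 >= 3.

    It equals x0 OR (x1 AND x2), so it is also a read-once DNF.
    """
    return int(3 * x[0] + 2 * x[1] + x[2] >= 3)


n = 3
influences = [influence(f, n, i) for i in range(n)]
gains = [purity_gain(f, n, i) for i in range(n)]
best_by_influence = max(range(n), key=lambda i: influences[i])
best_by_gain = max(range(n), key=lambda i: gains[i])

print("influences:", influences)
print("purity gains:", gains)
print("same split variable:", best_by_influence == best_by_gain)
```

On this function both criteria select x0 as the root split. Enumerating the truth table stands in for the "sufficiently many noise-free examples" of the abstract; a practical algorithm would estimate the same quantities from a sample.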

Cite

Fiat, A., & Pechyony, D. (2004). Decision trees: More theoretical justification for practical algorithms. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 3244, pp. 156–170). Springer Verlag. https://doi.org/10.1007/978-3-540-30215-5_13
