Decision trees: More theoretical justification for practical algorithms

Abstract

We study impurity-based decision tree algorithms, such as CART and C4.5, to better understand their theoretical underpinnings. We analyze these algorithms on restricted classes of functions and distributions: the uniform distribution, and functions that can be represented as a Boolean linear threshold function or a read-once DNF. We show that for Boolean linear threshold functions and read-once DNF, maximal purity gain and maximal influence are logically equivalent. This leads to the exact identification of these classes of functions by impurity-based algorithms, given sufficiently many noise-free examples. We show that the decision tree produced by these algorithms has minimal size and height among all decision trees representing the function. Based on the statistical query learning model, we introduce a noise-tolerant version of practical decision tree algorithms. We show that if the input examples are uniformly distributed and have small classification noise, then all our results for practical noise-free impurity-based algorithms also hold for their noise-tolerant version. © Springer-Verlag Berlin Heidelberg 2004.
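The link between purity gain and influence can be illustrated on a toy case. The following Python sketch is not taken from the paper: the function names, the weights (3, 2, 1), the threshold 3, and the choice of Gini impurity are all assumptions made for illustration. It enumerates every input of a small Boolean linear threshold function under the uniform distribution and checks that the variable with the largest influence is also the one with the largest purity gain.

```python
from itertools import product


def threshold_fn(weights, theta):
    """Return f(x) = 1 if sum(w_i * x_i) >= theta, else 0 (hypothetical example function)."""
    def f(x):
        return int(sum(w * xi for w, xi in zip(weights, x)) >= theta)
    return f


def gini(p):
    """Gini impurity of a binary label with P[label = 1] = p (one common impurity choice)."""
    return 2.0 * p * (1.0 - p)


def influence(f, n, i):
    """Fraction of uniformly drawn inputs on which flipping bit i changes f."""
    flips = 0
    for x in product((0, 1), repeat=n):
        y = list(x)
        y[i] ^= 1
        flips += f(x) != f(tuple(y))
    return flips / 2 ** n


def purity_gain(f, n, i):
    """Impurity at the root minus the weighted impurity after splitting on x_i."""
    points = list(product((0, 1), repeat=n))
    p_root = sum(f(x) for x in points) / len(points)
    gain = gini(p_root)
    for b in (0, 1):
        side = [x for x in points if x[i] == b]
        p_side = sum(f(x) for x in side) / len(side)
        gain -= (len(side) / len(points)) * gini(p_side)
    return gain


def argmax(values):
    """Index of the first maximal entry."""
    return max(range(len(values)), key=values.__getitem__)


n = 3
f = threshold_fn((3, 2, 1), 3)  # assumed weights and threshold
influences = [influence(f, n, i) for i in range(n)]
gains = [purity_gain(f, n, i) for i in range(n)]
print("influence:  ", influences)
print("purity gain:", gains)
print("same best variable:", argmax(influences) == argmax(gains))
```

Exhaustive enumeration is only feasible for small n; it stands in here for the exact probabilities that a learner would otherwise estimate from uniformly distributed examples.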

Fiat, A., & Pechyony, D. (2004). Decision trees: More theoretical justification for practical algorithms. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 3244, pp. 156–170). Springer Verlag. https://doi.org/10.1007/978-3-540-30215-5_13