Improving decision tree performance through induction- and cluster-based stratified sampling

Abstract

It is generally recognised that recursive partitioning, as used in the construction of classification trees, is inherently unstable, particularly for small data sets. Classification accuracy and, by implication, tree structure are sensitive to changes in the training data. Successful approaches to counteracting this effect include multiple-classifier methods such as boosting, bagging and windowing. The downside of these multiple classification models, however, is the plethora of trees that result, which often makes it difficult to interpret the classifier in a meaningful manner. We show that, by using some very weak knowledge in the sampling stage, when the data set is partitioned into the training and test sets, a single decision tree classifier achieves more consistent and improved performance.
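
To make the sampling idea concrete, the following is a minimal sketch of the cluster-based variant, assuming scikit-learn: the data are first clustered, and the train/test split is then stratified on the cluster labels so that each region of the feature space is proportionally represented in both partitions. The choice of KMeans, the cluster count, and the toy dataset are illustrative assumptions, not details taken from the paper.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Toy data standing in for a small data set (an assumption; the
# paper's benchmark data sets are not reproduced here).
X, y = load_iris(return_X_y=True)

# "Weak knowledge": cluster the feature space, then treat the
# cluster labels as strata for the train/test split so every
# region of the space is proportionally represented in both sets.
clusters = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=clusters, random_state=0
)

# A single decision tree trained on the stratified partition.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print(f"Test accuracy: {tree.score(X_test, y_test):.3f}")
```

The induction-based variant named in the title would presumably derive the strata from a preliminary tree induction rather than from clustering, but the abstract does not spell out that procedure, so it is not sketched here.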

Citation (APA)

Gill, A. A., Smith, G. D., & Bagnall, A. J. (2004). Improving decision tree performance through induction- and cluster-based stratified sampling. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3177, 339–344. https://doi.org/10.1007/978-3-540-28651-6_50
