Classifying very-high-dimensional data with random forests of oblique decision trees


Abstract

The random forests method is one of the most successful ensemble methods. However, random forests perform poorly on very-high-dimensional data in the presence of dependencies: in this case one can expect many combinations among the variables, and the usual random forests method does not exploit this situation effectively. We investigate a new approach to supervised classification with a huge number of numerical attributes. We propose a random oblique decision tree method, which randomly chooses a subset of predictive attributes and uses an SVM as the split function over these attributes. On 25 datasets, we compare the effectiveness, under classical measures (e.g. precision, recall, F1-measure and accuracy), of random forests of random oblique decision trees against SVMs and random forests of C4.5. Our proposal performs significantly better on very-high-dimensional datasets, with slightly better results on lower-dimensional datasets. © 2010 Springer-Verlag Berlin Heidelberg.
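The abstract's core idea — at each tree node, draw a random subset of attributes and train a linear SVM on them to obtain an oblique (multivariate) split, then bag many such trees — can be sketched as follows. This is an illustrative reconstruction only, not the authors' implementation: class names, stopping rules (depth/sample thresholds), and the choice of scikit-learn's `LinearSVC` as the split classifier are all assumptions, and the sketch handles binary labels only.

```python
# Hypothetical sketch of a random forest of oblique decision trees:
# each internal node fits a linear SVM on a random feature subset and
# routes samples by the sign of its decision function. Binary labels
# (0/1) assumed; hyperparameters are illustrative, not from the paper.
import numpy as np
from sklearn.svm import LinearSVC

class ObliqueTree:
    def __init__(self, n_sub_features, max_depth=5, min_samples=10, rng=None):
        self.n_sub = n_sub_features
        self.max_depth = max_depth
        self.min_samples = min_samples
        self.rng = rng if rng is not None else np.random.default_rng()

    def fit(self, X, y):
        self.root = self._build(X, y, depth=0)
        return self

    def _build(self, X, y, depth):
        classes, counts = np.unique(y, return_counts=True)
        majority = int(classes[np.argmax(counts)])
        # Stop on purity, depth limit, or too few samples.
        if len(classes) == 1 or depth >= self.max_depth or len(y) < self.min_samples:
            return {"leaf": majority}
        # Random subset of attributes, as in the paper's proposal.
        feats = self.rng.choice(X.shape[1], size=min(self.n_sub, X.shape[1]),
                                replace=False)
        svm = LinearSVC(dual=False, max_iter=2000).fit(X[:, feats], y)
        side = svm.decision_function(X[:, feats]) > 0  # oblique split
        if side.all() or (~side).all():  # degenerate split: make a leaf
            return {"leaf": majority}
        return {"svm": svm, "feats": feats,
                "left": self._build(X[~side], y[~side], depth + 1),
                "right": self._build(X[side], y[side], depth + 1)}

    def predict(self, X):
        return np.array([self._predict_one(x) for x in X])

    def _predict_one(self, x):
        node = self.root
        while "leaf" not in node:
            val = node["svm"].decision_function(x[node["feats"]].reshape(1, -1))[0]
            node = node["right"] if val > 0 else node["left"]
        return node["leaf"]

class ObliqueForest:
    def __init__(self, n_trees=20, n_sub_features=10, seed=0):
        self.n_trees = n_trees
        self.n_sub = n_sub_features
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        n = len(y)
        self.trees = []
        for _ in range(self.n_trees):
            idx = self.rng.integers(0, n, size=n)  # bootstrap sample
            self.trees.append(ObliqueTree(self.n_sub, rng=self.rng)
                              .fit(X[idx], y[idx]))
        return self

    def predict(self, X):
        votes = np.stack([t.predict(X) for t in self.trees])
        # Majority vote across trees for each sample.
        return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```

Compared with C4.5's axis-parallel tests on single attributes, each SVM split here is a hyperplane over several attributes at once, which is what lets the forest capture the variable combinations the abstract highlights.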

Citation (APA):

Do, T. N., Lenca, P., Lallich, S., & Pham, N. K. (2010). Classifying very-high-dimensional data with random forests of oblique decision trees. In Studies in Computational Intelligence (Vol. 292, pp. 39–55). https://doi.org/10.1007/978-3-642-00580-0_3
