Selecting the training set in classification problems with rare events

Abstract

Binary classification algorithms are often used in situations in which one of the two classes is extremely rare. A common practice is to oversample units of the rare class when forming the training set. For some classification algorithms, such as logistic classification, there are theoretical results that justify this approach. Similar results are not available for other popular classification algorithms, such as classification trees. This paper discusses the use of balanced datasets for tree classifiers and boosting algorithms when dealing with rare classes, and reports results from analyzing a real dataset and a simulated dataset.
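As an illustration of the practice described in the abstract, the following is a minimal sketch of one way to form a balanced training set: every rare-class unit is kept and an equally sized random sample is drawn from the common class, so the rare class is over-represented relative to the population. The function name `balanced_training_set`, the 0/1 label coding, and the toy data are assumptions made for this example; the sketch does not reproduce the authors' procedure or their datasets.

```python
import random


def balanced_training_set(labels, rare_label=1, seed=0):
    """Return indices of a training set with equal class counts.

    Hypothetical helper: keeps every rare-class unit and draws an
    equally sized random sample (without replacement) from the
    common class.
    """
    rng = random.Random(seed)
    rare = [i for i, y in enumerate(labels) if y == rare_label]
    common = [i for i, y in enumerate(labels) if y != rare_label]
    if len(rare) > len(common):
        raise ValueError("rare_label is not the minority class")
    sampled_common = rng.sample(common, len(rare))
    indices = rare + sampled_common
    rng.shuffle(indices)  # avoid ordering the training set by class
    return indices


# Toy data (assumed): 1,000 units, 2% of which belong to the rare class.
labels = [1] * 20 + [0] * 980
train_idx = balanced_training_set(labels)
n_rare = sum(labels[i] for i in train_idx)
print(len(train_idx), n_rare)  # 40 20
```

The resulting index list can be used to subset features and labels before fitting any classifier. Because the class proportions in such a sample differ from those in the population, predicted probabilities from a model trained on it generally need to be interpreted with that in mind.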

Cite

Scarpa, B., & Torelli, N. (2005). Selecting the training set in classification problems with rare events. In Studies in Classification, Data Analysis, and Knowledge Organization (pp. 39–46). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/3-540-27373-5_5