Sampling methods are a direct approach to tackling class imbalance. They resample a data set to alter its class distribution, usually to obtain a more balanced one. An open question is which class distribution, if any, yields the best results. In this work we conduct a broad empirical study to provide more insight into this question. Our results suggest that altering the class distribution can improve classification performance when AUC is used as the performance metric. Furthermore, as a general recommendation, random over-sampling to balance the class distribution is a good starting point for dealing with class imbalance. © 2008 International Federation for Information Processing.
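As an illustration of the random over-sampling recommended above, the following is a minimal sketch, not the authors' implementation. It assumes that balancing means duplicating randomly chosen examples (with replacement) from each smaller class until every class matches the size of the largest one. The function name `random_oversample` and its `seed` parameter are hypothetical, introduced only for this example.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Return copies of samples/labels in which every class has as many
    examples as the largest class (illustrative helper, assumed design)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class

    # Start from the original data; over-sampling only adds examples.
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == cls]
        # Draw the missing examples at random, with replacement.
        extra = rng.choices(pool, k=target - n)
        out_samples.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_samples, out_labels


# Toy data: 8 examples of class 0 and 2 examples of class 1.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

In an evaluation like the one described, such resampling would normally be applied to the training data only, so that performance is measured on test data with the original class distribution.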
Prati, R. C., Batista, G. E. A. P. A., & Monard, M. C. (2008). A study with class imbalance and random sampling for a decision tree learning system. In IFIP International Federation for Information Processing (Vol. 276, pp. 131–140). https://doi.org/10.1007/978-0-387-09695-7_13