Many machine learning applications in finance, medicine, and risk management suffer from class imbalance: the cases of interest occur rarely. Further complicating these applications, the training and testing samples may differ significantly in their respective class distributions. Sampling has been shown to be a strong solution to imbalance, and it additionally offers a rich parameter space from which to select classifiers. This paper examines the interaction between Probability Estimation Trees (PETs) [1], sampling, and performance metrics as testing distributions fluctuate substantially. A comprehensive set of analyses is presented that anticipates classifier performance across widely varying testing distributions. © 2008 Springer-Verlag Berlin Heidelberg.
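The two ingredients the abstract names can be sketched concretely. Below is a minimal, hedged illustration (not the paper's actual experimental setup): random undersampling rebalances a skewed training set, and a PET-style leaf reports a Laplace-smoothed class probability, (k + 1) / (n + 2) for k positives among n examples, rather than a hard label. The synthetic data, the threshold, and the single-split "tree" are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic imbalanced data: the positive class is rare (~5%).
n = 2000
X = rng.normal(size=(n, 1))
y = (X[:, 0] > 1.6).astype(int)

def undersample(X, y, rng):
    """Randomly drop majority (negative) examples until classes balance."""
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == 0)
    keep = rng.choice(neg, size=len(pos), replace=False)
    idx = np.concatenate([pos, keep])
    return X[idx], y[idx]

def laplace_estimate(labels):
    """Laplace-smoothed positive-class probability at a tree leaf:
    (k + 1) / (n + 2), where k positives occur among n examples."""
    k, m = labels.sum(), len(labels)
    return (k + 1) / (m + 2)

Xb, yb = undersample(X, y, rng)

# One illustrative PET split at threshold 1.6; each side acts as a leaf
# that emits a smoothed probability instead of a 0/1 prediction.
left = yb[Xb[:, 0] <= 1.6]
right = yb[Xb[:, 0] > 1.6]
print(laplace_estimate(left), laplace_estimate(right))
```

Smoothing keeps leaf estimates away from degenerate 0/1 values, which matters when a model trained on a rebalanced sample is later scored under a different testing class distribution.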
CITATION STYLE
Cieslak, D., & Chawla, N. (2008). Analyzing PETs on imbalanced datasets when training and testing class distributions differ. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5012 LNAI, pp. 519–526). https://doi.org/10.1007/978-3-540-68125-0_46