Many machine learning applications in finance, medicine, and risk management suffer from class imbalance: the cases of interest occur rarely. Further complicating these applications, the training and testing samples may differ significantly in their respective class distributions. Sampling has been shown to be a strong solution to imbalance, and it additionally offers a rich parameter space from which to select classifiers. This paper examines the interaction between Probability Estimation Trees (PETs) [1], sampling, and performance metrics as testing distributions fluctuate substantially. A comprehensive set of analyses is presented that anticipates classifier performance across widely varying testing distributions. © 2008 Springer-Verlag Berlin Heidelberg.
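The two ingredients the abstract names can be sketched concretely. Below is a minimal, hedged illustration (not the paper's actual experimental setup): random undersampling rebalances a skewed training set, and a PET-style leaf reports a Laplace-smoothed class probability, (k + 1) / (n + 2) for k positives among n examples, rather than a hard label. The synthetic data, the threshold, and the single-split "tree" are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic imbalanced data: the positive class is rare (~5%).
n = 2000
X = rng.normal(size=(n, 1))
y = (X[:, 0] > 1.6).astype(int)

def undersample(X, y, rng):
    """Randomly drop majority (negative) examples until classes balance."""
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == 0)
    keep = rng.choice(neg, size=len(pos), replace=False)
    idx = np.concatenate([pos, keep])
    return X[idx], y[idx]

def laplace_estimate(labels):
    """Laplace-smoothed positive-class probability at a tree leaf:
    (k + 1) / (n + 2), where k positives occur among n examples."""
    k, m = labels.sum(), len(labels)
    return (k + 1) / (m + 2)

Xb, yb = undersample(X, y, rng)

# One illustrative PET split at threshold 1.6; each side acts as a leaf
# that emits a smoothed probability instead of a 0/1 prediction.
left = yb[Xb[:, 0] <= 1.6]
right = yb[Xb[:, 0] > 1.6]
print(laplace_estimate(left), laplace_estimate(right))
```

Smoothing keeps leaf estimates away from degenerate 0/1 values, which matters when a model trained on a rebalanced sample is later scored under a different testing class distribution.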
CITATION STYLE
Cieslak, D., & Chawla, N. (2008). Analyzing PETs on imbalanced datasets when training and testing class distributions differ. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5012 LNAI, pp. 519–526). https://doi.org/10.1007/978-3-540-68125-0_46