A Relabeling Approach to Handling the Class Imbalance Problem for Logistic Regression

Abstract

Logistic regression is a standard procedure for real-world classification problems. The challenge of class imbalance arises in two-class classification problems when the minority class is observed much less often than the majority class. This characteristic is endemic in many domains. Work by Owen has shown that cluster structure among the minority class may be a specific problem in highly imbalanced logistic regression. In this article, we propose a novel relabeling approach to handle the class imbalance problem when using logistic regression, which essentially assigns new labels to the minority class observations. An expectation–maximization algorithm is formalized to serve as a tool for efficiently computing this relabeling. Modeling on such relabeled data can lead to improved predictive performance. We demonstrate the effectiveness of this approach with detailed experiments on real datasets. Supplemental materials for the article are available online.
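To make the class imbalance problem described in the abstract concrete, the following is a minimal sketch on synthetic data. It is not the authors' relabeling method or their EM algorithm, which the abstract does not specify. The helper names (`sigmoid`, `fit_logistic`), the two-Gaussian data, and the roughly 1% minority rate are all assumptions chosen for illustration. The sketch fits plain logistic regression by Newton-Raphson and measures how many minority observations receive a predicted probability of at least 0.5.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, with the input clipped to avoid overflow in exp."""
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def fit_logistic(X, y, n_iter=50, tol=1e-8, jitter=1e-6):
    """Fit unpenalized logistic regression by Newton-Raphson.

    Returns a coefficient vector whose first entry is the intercept.
    `jitter` only stabilizes the Hessian solve; it is not a penalty term.
    """
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ beta)
        grad = Xb.T @ (y - p)                    # score vector
        w = p * (1.0 - p)                        # per-observation weights
        hess = (Xb * w[:, None]).T @ Xb + jitter * np.eye(Xb.shape[1])
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Synthetic, illustrative data: 5000 majority vs 50 minority observations.
rng = np.random.default_rng(0)
n_major, n_minor = 5000, 50
X = np.vstack([
    rng.normal(0.0, 1.0, size=(n_major, 2)),     # majority class (label 0)
    rng.normal(1.5, 1.0, size=(n_minor, 2)),     # minority class (label 1)
])
y = np.concatenate([np.zeros(n_major), np.ones(n_minor)])

beta = fit_logistic(X, y)
p = sigmoid(np.column_stack([np.ones(len(X)), X]) @ beta)

# Share of minority observations classified as minority at threshold 0.5.
recall = float(np.mean(p[y == 1] >= 0.5))
print("intercept:", beta[0])
print("mean predicted probability:", p.mean(), "minority rate:", y.mean())
print("minority recall at 0.5:", recall)
```

In this sketch the fitted probabilities average out to the minority rate, so most minority observations fall below the 0.5 threshold even though the two classes are reasonably well separated. This is the kind of behavior that motivates imbalance-aware approaches such as the relabeling proposed in the article.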

Citation

Li, Y., Adams, N., & Bellotti, T. (2022). A Relabeling Approach to Handling the Class Imbalance Problem for Logistic Regression. Journal of Computational and Graphical Statistics, 31(1), 241–253. https://doi.org/10.1080/10618600.2021.1978470
