Abstract
Logistic regression (LR) is one of the most popular classifiers. However, LR does not perform effectively on imbalanced data. There are two approaches to imbalanced data for LR: resampling techniques and modifications to the log-likelihood function. These approaches improve performance measures of LR in some cases, but their effectiveness is not robust in general. In this paper, we propose a classifier called F-measure-oriented Lasso-Logistic Regression (F-LLR) to deal with imbalanced data. The base learner of F-LLR is Lasso-Logistic regression (LLR), which imposes a prior on the magnitude of the parameters through a hyper-parameter λ. The optimal λ is determined by an adjusted cross-validation procedure that aims for the highest F-measure instead of the highest accuracy. F-LLR addresses imbalanced data by selectively combining Under-sampling and the Synthetic Minority Oversampling Technique (SMOTE) based on the scores of the training data. The empirical study shows that F-LLR increases F-measure and KS compared with LLR and the traditional balancing methods, such as the resampling techniques (Random Under-sampling, Random Over-sampling, and SMOTE) and the modifications to the log-likelihood function (Ridge and Weighted likelihood estimation).
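To illustrate why the selection criterion matters, the following is a minimal sketch of choosing λ by mean F-measure rather than mean accuracy across validation folds. It is not the authors' implementation: the helper names (`accuracy`, `f_measure`, `select_lambda`), the candidate λ values, and the toy predictions in `toy_folds` are all hypothetical and invented for illustration, and no model is actually fitted here.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Fold = Tuple[Sequence[int], Sequence[int]]  # (true labels, predicted labels)


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of observations classified correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f_measure(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_lambda(
    fold_predictions: Dict[float, List[Fold]],
    metric: Callable[[Sequence[int], Sequence[int]], float],
) -> float:
    """Return the candidate lambda with the highest mean metric over folds."""

    def mean_score(lam: float) -> float:
        folds = fold_predictions[lam]
        return sum(metric(y, p) for y, p in folds) / len(folds)

    return max(fold_predictions, key=mean_score)


# Hypothetical validation fold: 2 positives among 10 observations.
y_val = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# Hypothetical predictions for two candidate lambdas (illustrative only).
toy_folds: Dict[float, List[Fold]] = {
    # Strong penalty: the model predicts the majority class everywhere.
    1.0: [(y_val, [0, 0, 0, 0, 0, 0, 0, 0, 0, 0])],
    # Weaker penalty: both positives are found, at the cost of 3 false alarms.
    0.1: [(y_val, [1, 1, 1, 1, 1, 0, 0, 0, 0, 0])],
}

best_by_accuracy = select_lambda(toy_folds, accuracy)
best_by_f = select_lambda(toy_folds, f_measure)
```

In this toy setting, accuracy prefers the candidate that never predicts the minority class (0.8 versus 0.7), while F-measure prefers the candidate that recovers both positives (4/7 versus 0), which is the kind of disagreement an F-measure-oriented criterion is meant to resolve.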
Cite
My, B. T. T., & Ta, B. Q. (2023). A modification of logistic regression with imbalanced data: F-measure-oriented Lasso-logistic regression. ScienceAsia, 49(S1), 68–77. https://doi.org/10.2306/scienceasia1513-1874.2023.s003