Design Thinking for Class Imbalance Problems Using Compound Techniques


Abstract

One of the most commonly occurring phenomena in ML methods is class imbalance, wherein one class dominates the class distribution in terms of frequency. Over time, many methods have been proposed to deal with class imbalance. Primarily, these methods address it through one or a combination of the following: (1) sampling procedures and (2) algorithms. This paper analyzes the design options across both in combination and provides guidance on suitable algorithms to use for a minority class scenario, in conjunction with the right sampling technique, to improve the accuracy of prediction. While the results have been obtained with the objective of optimizing the F1 score, the paper also analyzes the pattern of precision and recall values for each of the algorithms under various sampling techniques. Later, the paper explores a few loss functions for tree-based algorithms and the corresponding variations in validation measures. We use the open-source Credit Card Fraud dataset, hosted on Kaggle [1]. We use the F1 metric as the model evaluation criterion, as it is generally suited to class imbalance problems and captures the trade-off between precision and recall [2]. We also report our insights on precision and recall in this study. In addition, we explore a semi-supervised approach using autoencoders and compare its performance with other (more traditional) machine learning methods. The credit card fraud dataset used here contains transactions made with credit cards in September 2013 by European cardholders. It covers transactions that occurred over 2 days, with 492 frauds out of 284,807 transactions. The dataset is highly unbalanced; the positive class (frauds) accounts for only 0.172% of all transactions.
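To illustrate why F1 is preferred over plain accuracy at this level of imbalance, here is a minimal sketch in Python. It uses only the class counts stated in the abstract; the helper name `f1_score` and the "always predict not fraud" baseline are illustrative assumptions and are not taken from the paper.

```python
# Minimal sketch (illustrative, not the authors' code): accuracy vs. F1
# on the class counts quoted in the abstract.

def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

N_TOTAL = 284_807   # transactions (from the abstract)
N_FRAUD = 492       # fraud cases (from the abstract)

# Share of the positive (fraud) class in the dataset.
fraud_share = N_FRAUD / N_TOTAL

# Hypothetical baseline: a model that labels every transaction "not fraud".
# It gets every legitimate transaction right and misses every fraud.
baseline_accuracy = (N_TOTAL - N_FRAUD) / N_TOTAL
baseline_f1 = f1_score(tp=0, fp=0, fn=N_FRAUD)

print(f"fraud share:       {fraud_share:.4%}")
print(f"baseline accuracy: {baseline_accuracy:.4%}")
print(f"baseline F1:       {baseline_f1:.3f}")
```

The baseline is almost perfectly accurate yet detects no fraud at all, so its F1 is zero. This is the kind of gap that motivates optimizing F1 and inspecting precision and recall separately.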

Cite

Tiwari, R., Sen, A., & Dey, K. (2021). Design Thinking for Class Imbalance Problems Using Compound Techniques. In Advances in Intelligent Systems and Computing (Vol. 1175, pp. 379–400). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-15-5619-7_27
