A Survey Study on Proposed Solutions for Imbalanced Big Data

Shaymaa Ahmed Razoqi; Ghayda A.A. Al-Talib

Journal Article, Open Access


Iraqi Journal of Science (2024) 65(3) 1648-1662

DOI: 10.24996/ijs.2024.65.3.37


Abstract

Learning from imbalanced data has been studied and continuously developed for more than two decades. Training data is considered imbalanced when the positive (minority) class is overshadowed by the much larger negative (majority) class, a problem compounded by skewed class distributions in binary tasks. The emergence of big data brings new problems and challenges to the imbalance problem. The challenges of big data are summarized as the 5Vs: volume, velocity, veracity, value, and variety. This study divides solutions to the data imbalance problem into three levels: data-level, algorithm-level, and hybrid approaches. It first outlines the standard solutions proposed for this problem and identifies the most important metrics adopted for measuring classification performance on imbalanced data. The survey then reviews 27 studies published during the period 2015–2022, grouped according to the level at which they treat the imbalance problem. It also reviews the performance metrics used in these studies and the sources of the datasets to which the solutions were applied. The study makes it easier for researchers and scholars to survey the solutions to the data imbalance problem, including the hybrid approaches recently used for it, and to take advantage of them in improving the classification process.


Cite


Razoqi, S. A., & Al-Talib, G. A. A. (2024). A Survey Study on Proposed Solutions for Imbalanced Big Data. Iraqi Journal of Science, 65(3), 1648–1662. https://doi.org/10.24996/ijs.2024.65.3.37
