RDPVR: Random Data Partitioning with Voting Rule for Machine Learning from Class-Imbalanced Datasets

Abstract

Class imbalance is a challenging problem in machine learning because most classifiers are biased toward the dominant class. The most popular remedies are oversampling minority examples and undersampling majority examples. Oversampling may increase the probability of overfitting, whereas undersampling discards examples that may be crucial to the learning process. To address both concerns, we present a linear-time resampling method based on random data partitioning and a majority voting rule: an imbalanced dataset is partitioned into a number of small subdatasets, each of which must be class balanced. A separate classifier is then trained on each subdataset, and the final classification is obtained by applying the majority voting rule to the outputs of all trained models. We compared the proposed method with some of the best-known oversampling and undersampling methods, using a range of classifiers, on 33 class-imbalanced benchmark datasets. Classifiers trained on the data generated by the proposed method produced results comparable to those of most of the resampling methods tested, with the exception of SMOTEFUNA, an oversampling method that increases the probability of overfitting. The proposed method also produced results comparable to those of the Easy Ensemble (EE) undersampling method. We therefore advocate using either EE or our method for machine learning from class-imbalanced datasets.
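The partition-then-vote idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the partitions are formed, so the sketch assumes a binary problem, splits the shuffled majority examples into chunks of roughly minority size, and pairs each chunk with all minority examples. The names `balanced_partitions`, `NearestCentroid`, `fit_ensemble`, and `predict_vote` are hypothetical, and the nearest-centroid base learner is only a stand-in for whichever classifier is actually used.

```python
import random
from collections import Counter


def balanced_partitions(X, y, minority_label, seed=0):
    """Assumed partitioning scheme (not taken from the paper): shuffle the
    majority examples, deal them into chunks of roughly minority size, and
    pair every chunk with all minority examples."""
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == minority_label]
    majority = [i for i, label in enumerate(y) if label != minority_label]
    rng.shuffle(majority)
    n_parts = max(1, len(majority) // len(minority))
    parts = []
    for p in range(n_parts):
        idx = majority[p::n_parts] + minority  # one balanced subdataset
        parts.append(([X[i] for i in idx], [y[i] for i in idx]))
    return parts


class NearestCentroid:
    """Stand-in base learner: predicts the class with the closest mean."""

    def fit(self, X, y):
        counts = Counter(y)
        sums = {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for j, value in enumerate(row):
                acc[j] += value
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict_one(self, row):
        def dist2(c):
            return sum((a - b) ** 2 for a, b in zip(row, self.centroids[c]))
        return min(self.centroids, key=dist2)


def fit_ensemble(X, y, minority_label, seed=0):
    """Train one base learner per balanced subdataset."""
    parts = balanced_partitions(X, y, minority_label, seed)
    return [NearestCentroid().fit(Xp, yp) for Xp, yp in parts]


def predict_vote(models, row):
    """Majority voting rule over the predictions of all trained models."""
    votes = Counter(m.predict_one(row) for m in models)
    return votes.most_common(1)[0][0]


# Toy imbalanced data: 90 majority points near (0, 0), 10 minority near (5, 5).
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(90)]
X += [[data_rng.gauss(5, 1), data_rng.gauss(5, 1)] for _ in range(10)]
y = [0] * 90 + [1] * 10

models = fit_ensemble(X, y, minority_label=1)
print(len(models), predict_vote(models, [5.0, 5.0]), predict_vote(models, [0.0, 0.0]))
```

Under this assumed scheme every majority example is used exactly once and no synthetic examples are created, which matches the abstract's stated aim of avoiding both the information loss of undersampling and the overfitting risk of oversampling. Shuffling and dealing the indices takes time linear in the dataset size.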

Cite

Hassanat, A. B., Tarawneh, A. S., Abed, S. S., Altarawneh, G. A., Alrashidi, M., & Alghamdi, M. (2022). RDPVR: Random Data Partitioning with Voting Rule for Machine Learning from Class-Imbalanced Datasets. Electronics (Switzerland), 11(2). https://doi.org/10.3390/electronics11020228