A memetic approach for training set selection in imbalanced data sets

Abstract

Imbalanced data classification is a challenging problem in machine learning. It arises when samples are unevenly distributed among the classes, so that classical classifiers are poorly suited to such datasets. To address this, this paper selects the best training samples with the goal of improving classifier performance on imbalanced data. To do so, several heuristic methods are presented that use local information to indicate whether each training sample should be removed or retained. These methods are then treated as local search algorithms and combined with a global search algorithm in a single framework to form memetic algorithms. The global search used here is the binary quantum-inspired gravitational search algorithm (BQIGSA), a recent metaheuristic for optimizing binary-encoded problems. BQIGSA is employed because the problem calls for a highly stochastic search algorithm. Six local search algorithms are proposed: three are application-oriented and designed specifically for this problem, and the other three are general-purpose. The best local search is then determined. Experiments are performed on 45 standard datasets, with G-mean and AUC as evaluation criteria. The same datasets are then used to compare the best memetic approaches with several popular state-of-the-art algorithms and a recently proposed memetic algorithm, and the results show the superiority of the proposed approaches. Finally, the proposed algorithm is evaluated with four different classifiers, and the best classifier for use with this method is determined.
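As a rough illustration of how a binary-encoded training set selection could be scored with G-mean, the sketch below evaluates one candidate mask (1 = retain the sample, 0 = remove it). It is not the authors' implementation. The 1-nearest-neighbour classifier, the held-out validation split, and the names `g_mean`, `predict_1nn`, and `mask_fitness` are all assumptions made for this example; the abstract does not specify them. The only standard part is the G-mean definition itself: the square root of the product of the true positive rate and the true negative rate.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of true positive rate and true negative rate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(tpr * tnr)


def predict_1nn(train_X, train_y, x):
    """Label of the closest training point (squared Euclidean distance).

    1-NN is an assumed stand-in classifier, not one named in the abstract.
    """
    nearest = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return train_y[nearest]


def mask_fitness(mask, train_X, train_y, val_X, val_y, positive=1):
    """Score a binary mask over the training set by validation G-mean.

    mask[i] == 1 retains training sample i; mask[i] == 0 removes it.
    (Hypothetical fitness function for illustration only.)
    """
    kept_X = [x for x, keep in zip(train_X, mask) if keep]
    kept_y = [y for y, keep in zip(train_y, mask) if keep]
    if not kept_X:
        return 0.0  # an empty training set cannot classify anything
    preds = [predict_1nn(kept_X, kept_y, x) for x in val_X]
    return g_mean(val_y, preds, positive)


if __name__ == "__main__":
    # Toy 1-D data: class 0 is the majority, class 1 the minority.
    # The majority sample at 0.9 sits inside the minority region.
    train_X = [(0.0,), (0.1,), (0.2,), (0.9,), (1.0,)]
    train_y = [0, 0, 0, 0, 1]
    val_X = [(0.05,), (0.15,), (0.92,), (1.05,)]
    val_y = [0, 0, 1, 1]

    print(mask_fitness([1, 1, 1, 1, 1], train_X, train_y, val_X, val_y))
    print(mask_fitness([1, 1, 1, 0, 1], train_X, train_y, val_X, val_y))
```

In a memetic scheme of the kind the abstract describes, a global search would propose masks like these and a local search would refine them; here the second mask, which drops the majority sample lying in the minority region, scores higher than keeping every sample.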

Nikpour, B., & Nezamabadi-pour, H. (2019). A memetic approach for training set selection in imbalanced data sets. International Journal of Machine Learning and Cybernetics, 10(11), 3043–3070. https://doi.org/10.1007/s13042-019-01000-w
