The paper presents a novel approach to resampling imbalanced datasets, aimed at improving classifier performance. The method uses two self-organizing maps to determine the clusters of majority and minority data. Cluster centroids are used to select the samples whose under-sampling or over-sampling is more convenient, while the optimal resampling rates are determined through a genetic algorithm that maximizes classifier performance. The algorithm is tested on several datasets from both the UCI repository and real industrial applications, and is compared to other widely used resampling methods.
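The centroid-guided selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The helper names `train_som` and `undersample`, the 1-D map topology, the decay schedules, and the rule "drop the majority samples closest to a centroid first" are all assumptions made for illustration. The resampling rate is fixed by hand here, whereas the paper tunes it with a genetic algorithm, which this sketch omits.

```python
import numpy as np

def train_som(X, n_units=4, epochs=20, lr=0.5, sigma=1.0, seed=0):
    """Train a tiny 1-D self-organizing map; returns an (n_units, n_features) codebook."""
    rng = np.random.default_rng(seed)
    # Initialise the codebook from randomly chosen samples (requires n_units <= len(X)).
    W = X[rng.choice(len(X), size=n_units, replace=False)].astype(float)
    grid = np.arange(n_units)
    for epoch in range(epochs):
        # Linearly decay the learning rate and the neighbourhood width.
        frac = 1.0 - epoch / epochs
        for x in X[rng.permutation(len(X))]:
            # Best-matching unit: the codebook vector closest to the sample.
            bmu = np.argmin(np.linalg.norm(W - x, axis=1))
            # Gaussian neighbourhood on the 1-D grid around the BMU.
            h = np.exp(-((grid - bmu) ** 2) / (2 * (sigma * frac + 1e-3) ** 2))
            W += (lr * frac) * h[:, None] * (x - W)
    return W

def undersample(X, W, rate):
    """Drop a fraction `rate` of samples, starting with those nearest a centroid."""
    # Distance from each sample to its nearest codebook vector (cluster centroid).
    d = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2).min(axis=1)
    n_drop = int(round(rate * len(X)))
    # Assumption: samples close to a centroid are the most redundant ones.
    keep = np.argsort(d)[n_drop:]
    return X[np.sort(keep)]

# Synthetic imbalanced data: 200 majority samples, 20 minority samples.
rng = np.random.default_rng(42)
X_major = rng.normal(0.0, 1.0, size=(200, 2))
X_minor = rng.normal(3.0, 0.5, size=(20, 2))

# One map per class, as in the abstract.
W_major = train_som(X_major)
W_minor = train_som(X_minor)

# Hand-picked rate; a genetic algorithm would search over this value instead.
X_major_reduced = undersample(X_major, W_major, rate=0.5)
print(X_major_reduced.shape)  # (100, 2)
```

In a fuller version, the minority map `W_minor` would guide over-sampling in the same way, and a search procedure would score candidate under-sampling and over-sampling rates by the performance of a classifier trained on the resampled data.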
CITATION
Vannucci, M., & Colla, V. (2019). Imbalanced datasets resampling through self organizing maps and genetic algorithms. In Communications in Computer and Information Science (Vol. 1000, pp. 399–411). Springer Verlag. https://doi.org/10.1007/978-3-030-20257-6_34