Knowledge Discovery in Databases

  • Beekmann F

Abstract

Many real-world data mining applications involve learning from imbalanced data sets. Learning from data sets that contain very few instances of the minority (or interesting) class usually produces biased classifiers that have higher predictive accuracy on the majority class(es) but poorer predictive accuracy on the minority class. SMOTE (Synthetic Minority Over-sampling TEchnique) is specifically designed for learning from imbalanced data sets. This paper presents a novel approach for learning from imbalanced data sets, based on a combination of the SMOTE algorithm and the boosting procedure. Unlike standard boosting, where all misclassified examples are given equal weights, SMOTEBoost creates synthetic examples from the rare or minority class, thus indirectly changing the updating weights and compensating for skewed distributions. Applied to several highly and moderately imbalanced data sets, SMOTEBoost improves prediction performance on the minority class and improves overall F-values.
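
The synthetic-example step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly described SMOTE idea of interpolating between a minority sample and one of its nearest minority-class neighbours. The function name `smote_sketch` and its parameters are hypothetical and not taken from the paper, and the boosting loop that SMOTEBoost wraps around this step is not shown.

```python
import numpy as np

def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Illustrative SMOTE-style oversampling (hypothetical helper, not the paper's code).

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbours.
    Requires at least two minority samples.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(minority, dtype=float)
    n = len(X)
    k = min(k, n - 1)  # cannot have more neighbours than other samples

    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]

    # Pick a base sample and one of its neighbours for every synthetic point.
    base = rng.integers(0, n, size=n_synthetic)
    picked = neighbours[base, rng.integers(0, k, size=n_synthetic)]

    # Interpolate with a random gap in [0, 1) along each feature vector.
    gap = rng.random((n_synthetic, 1))
    return X[base] + gap * (X[picked] - X[base])

# Usage: oversample a tiny two-dimensional minority class.
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_sketch(X_min, n_synthetic=10, k=2)
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region the minority class already occupies. In a boosting setting, such points would be added to the training set of each round, which is how the minority class gains influence without its weights being set by hand.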

Cite

Beekmann, F. (2003). Knowledge Discovery in Databases. In Stichprobenbasierte Assoziationsanalyse im Rahmen des Knowledge Discovery in Databases (pp. 5–50). Deutscher Universitätsverlag. https://doi.org/10.1007/978-3-322-81227-8_2
