An Approach to Imbalanced Data Sets Based on Changing Rule Strength

  • Grzymala-Busse J
  • Goodwin L
  • Grzymala-Busse W
  • Zheng X

Abstract

This paper describes experiments with a challenging data set describing preterm births. The data set, collected at the Duke University Medical Center, was large and, at the same time, many attribute values were missing. However, the main problem was that only 20.7% of the total number of cases represented the important preterm birth class. Thus the data set was imbalanced. For comparison, we include results of experiments on another imbalanced data set, the well-known breast cancer data set. Our approach to dealing with this imbalanced data set was to induce a rule set using our standard procedure, the LEM2 algorithm of the LERS rule induction system, and then to increase the rule strength for all rules describing preterm births by multiplying all such rule strengths by the same number, called a strength multiplier. The rule strength for any rule describing the majority class, fullterm birth, remained unchanged. The optimal strength multiplier was determined experimentally using our optimality criterion: the maximum of the sum of sensitivity and specificity.
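The strength-multiplier idea in the abstract can be illustrated with a minimal Python sketch. It is not the LERS implementation: the `Rule` class, the voting scheme (each class is supported by the summed strengths of its matching rules), the fallback `default` label, and all function names are assumptions made for illustration. Only the two ideas taken from the abstract are modeled: strengths of minority-class rules are multiplied by a common factor, and the factor is chosen to maximize sensitivity plus specificity.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Case = Dict[str, str]  # attribute name -> attribute value


@dataclass
class Rule:
    """Hypothetical rule: a conjunction of attribute-value conditions."""
    conditions: Dict[str, str]
    label: str
    strength: float

    def matches(self, case: Case) -> bool:
        return all(case.get(a) == v for a, v in self.conditions.items())


def classify(rules: Sequence[Rule], case: Case, minority: str,
             multiplier: float, default: str) -> str:
    """Pick the class with the largest summed strength of matching rules.

    Assumption: a simplified voting scheme; only minority-class rule
    strengths are scaled by the multiplier, majority rules are unchanged.
    """
    support: Dict[str, float] = {}
    for rule in rules:
        if rule.matches(case):
            factor = multiplier if rule.label == minority else 1.0
            support[rule.label] = support.get(rule.label, 0.0) + rule.strength * factor
    if not support:
        return default  # no rule matched; fall back to an assumed default label
    return max(support, key=lambda label: support[label])


def sensitivity_specificity(rules: Sequence[Rule], cases: Sequence[Case],
                            labels: Sequence[str], minority: str,
                            multiplier: float, default: str) -> Tuple[float, float]:
    """Sensitivity and specificity, treating the minority class as positive."""
    tp = fn = tn = fp = 0
    for case, actual in zip(cases, labels):
        predicted = classify(rules, case, minority, multiplier, default)
        if actual == minority:
            if predicted == minority:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == minority:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def best_multiplier(rules: Sequence[Rule], cases: Sequence[Case],
                    labels: Sequence[str], minority: str, default: str,
                    candidates: List[float]) -> float:
    """Return the first candidate maximizing sensitivity + specificity."""
    return max(
        candidates,
        key=lambda m: sum(
            sensitivity_specificity(rules, cases, labels, minority, m, default)
        ),
    )
```

In this sketch the search is a plain scan over a caller-supplied list of candidate multipliers, evaluated on whatever labeled cases are passed in; how the experiments in the paper split data for this evaluation is not stated in the abstract.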

Cite

Grzymala-Busse, J. W., Goodwin, L. K., Grzymala-Busse, W. J., & Zheng, X. (2004). An Approach to Imbalanced Data Sets Based on Changing Rule Strength (pp. 543–553). https://doi.org/10.1007/978-3-642-18859-6_21
