An Approach to Imbalanced Data Sets Based on Changing Rule Strength

Jerzy W. Grzymala-Busse; Linda K. Goodwin; Witold J. Grzymala-Busse; Xinqun Zheng

Book Chapter

DOI: 10.1007/978-3-642-18859-6_21

Abstract

This paper describes experiments with a challenging data set describing preterm births. The data set, collected at the Duke University Medical Center, was large and, at the same time, had many missing attribute values. However, the main problem was that only 20.7% of the total number of cases represented the important preterm birth class. Thus the data set was imbalanced. For comparison, we include results of experiments on another imbalanced data set, the well-known breast cancer data set. Our approach to dealing with this imbalanced data set was to induce a rule set using our standard procedure, the LEM2 algorithm of the LERS rule induction system, and then to increase the rule strength for all rules describing preterm births by multiplying all such rule strengths by the same number, called a strength multiplier. The rule strength for any rule describing the majority class, fullterm birth, remained unchanged. The optimal strength multiplier was determined experimentally using our optimality criterion: the maximum of the sum of sensitivity and specificity.
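The procedure in the abstract can be illustrated with a minimal Python sketch. It is not the LERS implementation: the `Rule` class, the strength-sum voting in `classify`, the tie-breaking toward the majority class, and the toy attributes (`bleeding`, `age`) are all assumptions made for illustration. Only the two ideas taken from the abstract are kept: strengths of minority-class rules are multiplied by a common strength multiplier, and the multiplier is chosen to maximize sensitivity plus specificity.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Case = Dict[str, str]


@dataclass
class Rule:
    """Hypothetical rule: attribute-value conditions, a concept, a strength."""
    conditions: Dict[str, str]
    concept: str
    strength: float

    def matches(self, case: Case) -> bool:
        return all(case.get(a) == v for a, v in self.conditions.items())


def classify(rules: Sequence[Rule], case: Case, multiplier: float,
             minority: str = "preterm", majority: str = "fullterm") -> str:
    """Sum strengths of matching rules per concept (simplified voting).

    Only minority-class rule strengths are scaled by the multiplier.
    Assumption: ties and unmatched cases go to the majority class.
    """
    support = {minority: 0.0, majority: 0.0}
    for rule in rules:
        if rule.matches(case):
            scale = multiplier if rule.concept == minority else 1.0
            support[rule.concept] += rule.strength * scale
    return minority if support[minority] > support[majority] else majority


def sensitivity_specificity(rules: Sequence[Rule], cases: Sequence[Case],
                            labels: Sequence[str], multiplier: float,
                            minority: str = "preterm") -> Tuple[float, float]:
    """Return (sensitivity, specificity), treating the minority class as positive."""
    tp = fn = tn = fp = 0
    for case, label in zip(cases, labels):
        predicted = classify(rules, case, multiplier, minority=minority)
        if label == minority:
            tp += predicted == minority
            fn += predicted != minority
        else:
            fp += predicted == minority
            tn += predicted != minority
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


def best_multiplier(rules: Sequence[Rule], cases: Sequence[Case],
                    labels: Sequence[str], candidates: Sequence[float]) -> float:
    """Pick the candidate maximizing sensitivity + specificity (first one on ties)."""
    return max(candidates,
               key=lambda m: sum(sensitivity_specificity(rules, cases, labels, m)))


# Toy, made-up data for illustration only; not from the Duke data set.
TOY_RULES: List[Rule] = [
    Rule({"bleeding": "yes"}, "preterm", 2.0),
    Rule({"age": "adult"}, "fullterm", 5.0),
]
TOY_CASES: List[Case] = [
    {"bleeding": "yes", "age": "adult"},
    {"bleeding": "no", "age": "adult"},
    {"bleeding": "yes", "age": "teen"},
    {"bleeding": "no", "age": "adult"},
]
TOY_LABELS: List[str] = ["preterm", "fullterm", "preterm", "fullterm"]
```

With the toy data, calling `best_multiplier(TOY_RULES, TOY_CASES, TOY_LABELS, [1.0, 2.0, 3.0, 4.0])` searches a small grid of candidate multipliers; in practice the candidate grid and the evaluation data would come from the experimental setup, which the abstract does not specify.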

Grzymala-Busse, J. W., Goodwin, L. K., Grzymala-Busse, W. J., & Zheng, X. (2004). An Approach to Imbalanced Data Sets Based on Changing Rule Strength (pp. 543–553). https://doi.org/10.1007/978-3-642-18859-6_21
