Mining rare events data by sampling and boosting: A case study

Tom Au; Meei Ling Ivy Chin; Guangqin Ma

Journal Article

Communications in Computer and Information Science (2010) 54 373-379

DOI: 10.1007/978-3-642-12035-0_38

8 Citations

7 Readers

Abstract

In data mining, popular model ensemble techniques such as boosting are often used to improve the performance of predictive models. When mining data with rare events (event rates far below 5%), boosting may improve a model's overall predictive power, but the accuracy and efficiency of model estimation are negatively impacted when simple random sampling is employed. In this study we investigate the performance of applying boosting to an imbalanced sampling procedure called case-based sampling. We demonstrate the performance of the combined procedure in predicting customer attrition with actual telecommunications data. Our results show that the combination of boosting and case-based sampling is very effective at alleviating the problem of rare events. © 2010 Springer-Verlag Berlin Heidelberg.
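
The abstract does not spell out the sampling procedure, so the following is only a rough sketch of what an imbalanced, case-based sample might look like. It assumes that case-based sampling means keeping every rare event and retaining only a random fraction of the non-events, as in a case-control design; that reading is an assumption, not a detail taken from the paper. The function names, the `nonevent_fraction` parameter, and the synthetic customer records are all hypothetical. The boosting step is omitted; a boosted model would be fit on the returned sample.

```python
import random


def case_based_sample(rows, is_event, nonevent_fraction, seed=0):
    """Keep every event row; keep each non-event row with probability
    nonevent_fraction. (Assumed reading of case-based sampling.)"""
    if not 0.0 < nonevent_fraction <= 1.0:
        raise ValueError("nonevent_fraction must be in (0, 1]")
    rng = random.Random(seed)  # seeded for reproducibility
    return [
        row for row in rows
        if is_event(row) or rng.random() < nonevent_fraction
    ]


def event_rate(rows, is_event):
    """Share of rows that are events."""
    return sum(1 for row in rows if is_event(row)) / len(rows)


def churned(row):
    """Hypothetical event indicator: did the customer leave?"""
    return row["churned"]


# Synthetic illustration only: 10,000 customers, 2% of whom churn.
customers = [{"id": i, "churned": i % 50 == 0} for i in range(10_000)]

sample = case_based_sample(customers, churned, nonevent_fraction=0.1)

print(f"population event rate: {event_rate(customers, churned):.3f}")
print(f"sample event rate:     {event_rate(sample, churned):.3f}")
```

Because all events are retained while most non-events are dropped, the event rate in the sample is much higher than in the population, which is the imbalance a model fit on the sample has to account for.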

Cite

Au, T., Chin, M. L. I., & Ma, G. (2010). Mining rare events data by sampling and boosting: A case study. Communications in Computer and Information Science, 54, 373–379. https://doi.org/10.1007/978-3-642-12035-0_38
