In data mining, popular model ensemble techniques such as boosting are often used to improve the performance of predictive models. When mining data with rare events (an event rate far below 5%), boosting may improve a model's overall predictive power, but the accuracy and efficiency of model estimation suffer when simple random sampling is employed. In this study we investigate the performance of applying boosting to an imbalanced sampling procedure called case-based sampling. We demonstrate the performance of the combined procedure in predicting customer attrition with actual telecommunications data. Our results show that the combination of boosting and case-based sampling is very effective at alleviating the problem of rare events. © 2010 Springer-Verlag Berlin Heidelberg.
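The abstract does not spell out the sampling step, so the following is only a minimal sketch of one common reading of case-based sampling: keep every rare event and draw a random subset of non-events. The function names, the `controls_per_case` ratio, the synthetic `churn` records, and the log-odds prior correction are all assumptions made for illustration and are not taken from the paper. The boosting step itself is not shown; any boosted classifier could be fit on the returned sample.

```python
import math
import random


def case_based_sample(rows, is_event, controls_per_case=1.0, seed=0):
    """Keep every event (case) and a random subset of non-events (controls).

    Hypothetical helper: the ratio and the seed are illustrative choices.
    """
    rng = random.Random(seed)
    cases = [r for r in rows if is_event(r)]
    controls = [r for r in rows if not is_event(r)]
    n_controls = min(len(controls), round(len(cases) * controls_per_case))
    return cases + rng.sample(controls, n_controls)


def correct_probability(p_sample, sample_rate, population_rate):
    """Shift a probability estimated on the rebalanced sample back to the
    population scale by adjusting its log-odds (prior correction).

    Assumption: this is exact for logistic models and only approximate
    for scores produced by a boosted ensemble.
    """
    offset = math.log(
        ((1 - population_rate) / population_rate)
        * (sample_rate / (1 - sample_rate))
    )
    logit = math.log(p_sample / (1 - p_sample)) - offset
    return 1 / (1 + math.exp(-logit))


# Synthetic data with a 2% event rate (not the telecommunications data).
rows = [{"id": i, "churn": i % 50 == 0} for i in range(10_000)]
sample = case_based_sample(rows, lambda r: r["churn"])
n_events = sum(r["churn"] for r in sample)
sample_rate = n_events / len(sample)
adjusted = correct_probability(0.5, sample_rate, population_rate=0.02)
```

With these synthetic records, the sample holds all 200 events and an equal number of non-events, so a model trained on it sees a 50% event rate instead of 2%. Scores from that model should be mapped back with something like `correct_probability` before being read as population-level attrition probabilities.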
Au, T., Chin, M. L. I., & Ma, G. (2010). Mining rare events data by sampling and boosting: A case study. Communications in Computer and Information Science, 54, 373–379. https://doi.org/10.1007/978-3-642-12035-0_38