Using class imbalance learning for cross-company defect prediction

Abstract

Cross-company defect prediction (CCDP) is a practical approach that trains a prediction model on one or more projects of a source company and then applies the model to a target company. Unfortunately, the performance of such CCDP models is susceptible to the highly imbalanced distribution between the defect-prone and non-defect-prone classes of CC data. Class imbalance learning is applied to alleviate this issue. Because many class imbalance learning methods have been proposed, there is an imperative need to analyze and compare the performance of these methods for CCDP. Prior empirical studies have reported that the AdaBoost.NC algorithm achieves the best performance for defect prediction. This observation leads us to conduct a careful empirical study of whether and how class imbalance learning methods can benefit cross-company defect prediction. We investigate the effect of different types of class imbalance learning methods, including under-sampling, over-sampling, and over-sampling followed by under-sampling, on cross-company defect prediction performance over 15 publicly available datasets. Experimental results show that under-sampling achieves the best overall performance in terms of g-measure among the methods we studied.
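As an illustration of the two ideas the abstract relies on, the following is a minimal sketch of random under-sampling and of the g-measure. It is not the authors' implementation. The function names `random_under_sample` and `g_measure` are hypothetical, and the g-measure is assumed here to be the harmonic mean of the probability of detection (pd, i.e. recall) and 1 - pf (pf being the false-alarm rate), a definition commonly used in the defect prediction literature but not stated in the abstract itself.

```python
import random
from collections import Counter


def random_under_sample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until every class
    has as many rows as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    n_min = min(len(members) for members in by_class.values())
    sampled = []
    for label, members in by_class.items():
        # Sample without replacement; the minority class is kept whole.
        for row in rng.sample(members, n_min):
            sampled.append((row, label))
    rng.shuffle(sampled)
    return [r for r, _ in sampled], [l for _, l in sampled]


def g_measure(y_true, y_pred, positive=1):
    """Harmonic mean of pd (recall) and 1 - pf (assumed definition)."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    pd = tp / (tp + fn) if (tp + fn) else 0.0
    pf = fp / (fp + tn) if (fp + tn) else 0.0
    specificity = 1.0 - pf
    denom = pd + specificity
    return 2.0 * pd * specificity / denom if denom else 0.0


# Toy source-company data: 2 defect-prone modules, 8 non-defect-prone.
rows = [[i] for i in range(10)]
labels = [1, 1] + [0] * 8
balanced_rows, balanced_labels = random_under_sample(rows, labels)
print(Counter(balanced_labels))
```

In a CCDP setting, the resampling would be applied only to the source-company training data, and the g-measure would be computed on predictions for the target company, whose class distribution is left untouched.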

Cite

Yu, X., Zhou, M., Chen, X., Deng, L., & Wang, L. (2017). Using class imbalance learning for cross-company defect prediction. In Proceedings of the International Conference on Software Engineering and Knowledge Engineering, SEKE (pp. 117–122). Knowledge Systems Institute Graduate School. https://doi.org/10.18293/SEKE2017-035
