Abstract
Cross-company defect prediction (CCDP) is a practical approach that trains a prediction model on one or more projects of a source company and then applies the model to a target company. Unfortunately, the performance of such CCDP models is susceptible to the highly imbalanced distribution of defect-prone and non-defect-prone classes in cross-company (CC) data. Class imbalance learning is applied to alleviate this issue. Because many class imbalance learning methods have been proposed, there is a pressing need to analyze and compare the performance of these methods for CCDP. Prior empirical studies have shown that the AdaBoost.NC algorithm achieves the best performance for defect prediction. This observation leads us to conduct a careful empirical study of whether and how class imbalance learning methods can benefit cross-company defect prediction. We investigate the effect of different types of class imbalance learning methods, including under-sampling, over-sampling, and over-sampling followed by under-sampling, on cross-company defect prediction performance over 15 publicly available datasets. Experimental results show that under-sampling achieves the best overall performance in terms of g-measure among the methods we studied.
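To make the two central terms concrete, the following is a minimal sketch of random under-sampling and of the g-measure. It is not the authors' implementation. The function names are hypothetical, the class balance target (equal class sizes) is an assumption, and the g-measure is assumed to follow the definition common in the defect prediction literature: the harmonic mean of the probability of detection (pd) and one minus the probability of false alarm (pf). The abstract itself does not state these details.

```python
import random


def random_undersample(features, labels, seed=0):
    """Randomly drop majority-class rows until both classes are the same size.

    Assumption: labels are 1 (defect-prone) or 0 (non-defect-prone).
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority row plus an equally sized random majority subset.
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [features[i] for i in kept], [labels[i] for i in kept]


def g_measure(y_true, y_pred):
    """Harmonic mean of pd (recall) and 1 - pf (specificity); assumed definition."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    pd = tp / (tp + fn) if (tp + fn) else 0.0
    pf = fp / (fp + tn) if (fp + tn) else 0.0
    specificity = 1.0 - pf
    denom = pd + specificity
    return 2.0 * pd * specificity / denom if denom else 0.0
```

In a cross-company setting, the under-sampling step would be applied only to the source-company training data, and the g-measure would be computed on predictions for the target company. Because it combines pd and pf, the g-measure does not reward a model that simply predicts the majority class.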
Yu, X., Zhou, M., Chen, X., Deng, L., & Wang, L. (2017). Using class imbalance learning for cross-company defect prediction. In Proceedings of the International Conference on Software Engineering and Knowledge Engineering, SEKE (pp. 117–122). Knowledge Systems Institute Graduate School. https://doi.org/10.18293/SEKE2017-035