Heterogeneous defect prediction via exploiting correlation subspace

Abstract

Software defect prediction models are generally built from intra-project data. A lack of training data in the early stages of software testing limits the efficiency of prediction in practice. Researchers have therefore proposed cross-project defect prediction, which uses data from other projects. Most previous efforts assume that the cross-project defect data share the same metric set, meaning that both the metrics used and the size of the metric set are identical across projects. In real scenarios, however, this assumption may not hold. In addition, software defect datasets suffer from class imbalance, which makes it harder for a learner to predict defects. In this paper, we advance canonical correlation analysis to derive a joint feature space that associates cross-project data, and we propose a novel support vector machine algorithm that incorporates the correlation transfer information into classifier design for cross-project prediction. Moreover, we take different misclassification costs into consideration so that the classifier is inclined to label a module as defective, alleviating the impact of imbalanced data. Experiments on public heterogeneous datasets from different projects show that our method is more effective than state-of-the-art methods.
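The abstract does not give the objective function, so the following is only a minimal sketch of the general idea of unequal misclassification costs, using a generic cost-weighted hinge loss for a linear classifier. The function name `cost_sensitive_hinge`, the parameters `cost_defective` and `cost_clean`, and their default values are assumptions made for illustration. The sketch omits the correlation transfer term and should not be read as the authors' algorithm.

```python
def cost_sensitive_hinge(w, b, X, y, cost_defective=5.0, cost_clean=1.0):
    """Generic cost-weighted soft-margin objective (illustrative assumption).

    w, b : weights and bias of a linear classifier
    X    : list of feature vectors (e.g. projected into a shared subspace)
    y    : labels, +1 for defective modules and -1 for clean modules
    """
    # Standard L2 regularization term of a linear SVM.
    objective = 0.5 * sum(wj * wj for wj in w)
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        slack = max(0.0, 1.0 - yi * score)  # hinge loss for this sample
        # Errors on defective modules are penalized more heavily, which
        # pushes the decision boundary toward predicting "defective".
        cost = cost_defective if yi == 1 else cost_clean
        objective += cost * slack
    return objective


if __name__ == "__main__":
    w, b = [1.0], 0.0
    # A defective module scored as clean vs. a clean module scored as defective.
    missed_defect = cost_sensitive_hinge(w, b, [[-1.0]], [1])
    false_alarm = cost_sensitive_hinge(w, b, [[1.0]], [-1])
    print(missed_defect, false_alarm)
```

With these assumed costs, a missed defect contributes five times the penalty of a false alarm with the same margin violation, which is one common way to counter class imbalance.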

Cite

Cheng, M., Wu, G., Jiang, M., Wan, H., You, G., & Yuan, M. (2016). Heterogeneous defect prediction via exploiting correlation subspace. In Proceedings of the International Conference on Software Engineering and Knowledge Engineering, SEKE (Vol. 2016-January, pp. 171–176). Knowledge Systems Institute Graduate School. https://doi.org/10.18293/SEKE2016-090
