Using active learning selection approach for cross-project software defect prediction

Abstract

Cross-project defect prediction (CPDP) helps ensure software quality, which plays an important role in software engineering. When a newly developed project has insufficient training data, CPDP can build defect predictors from the data of other projects. However, conventional CPDP does not take into account prior knowledge of the target project or the class imbalance in the source project data. In this paper, we design an active learning selection algorithm for cross-project defect prediction to alleviate these problems. First, we use clustering and active learning algorithms to filter and label a set of representative instances from the target project, and use these as prior knowledge to guide the selection of source project data. Then, the active learning algorithm is used to filter representative data from the source projects. Finally, a balanced cross-project dataset is constructed using the active learning algorithm, and the defect prediction model is built on it. We evaluate the approach on 10 open-source projects using common defect prediction models, active learning algorithms, and common evaluation metrics. The results show that the proposed algorithm can effectively filter the data, address the class imbalance problem in cross-project data, and improve defect prediction performance.
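
The three stages named in the abstract can be illustrated with a rough sketch. The abstract does not specify the clustering method, the active learning query strategy, or the balancing rule, so everything below is an assumption made for illustration: plain k-means stands in for the clustering step, nearest-neighbour filtering stands in for the source selection, and random undersampling stands in for the balancing step. All function names and parameters (`target_representatives`, `filter_source`, `balance`, `k`, `n_neighbors`) are hypothetical and are not taken from the paper.

```python
import math
import random


def nearest(point, candidates):
    """Index of the candidate closest to `point` (Euclidean distance)."""
    return min(range(len(candidates)), key=lambda i: math.dist(point, candidates[i]))


def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means on a list of feature tuples; returns the centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(X, k)
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for x in X:
            clusters[nearest(x, centroids)].append(x)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
    return centroids


def target_representatives(X_target, k, seed=0):
    """Stage 1 (assumed form): target rows closest to each cluster centroid.

    These are the instances a human oracle would be asked to label.
    """
    centroids = kmeans(X_target, k, seed=seed)
    return sorted({nearest(c, X_target) for c in centroids})


def filter_source(X_source, reps, n_neighbors):
    """Stage 2 (assumed form): source rows among the nearest to any representative."""
    keep = set()
    for r in reps:
        order = sorted(range(len(X_source)), key=lambda i: math.dist(r, X_source[i]))
        keep.update(order[:n_neighbors])
    return sorted(keep)


def balance(indices, y_source, seed=0):
    """Stage 3 (assumed form): undersample the majority class to equal class sizes."""
    rng = random.Random(seed)
    defective = [i for i in indices if y_source[i] == 1]
    clean = [i for i in indices if y_source[i] == 0]
    n = min(len(defective), len(clean))
    return sorted(rng.sample(defective, n) + rng.sample(clean, n))
```

A typical call sequence would pass the target feature rows to `target_representatives`, feed the selected rows to `filter_source` together with the pooled source rows, and then pass the surviving indices and the source labels (1 for defective, 0 for clean) to `balance` before training any classifier on the result.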

Cite

Mi, W., Li, Y., Wen, M., & Chen, Y. (2022). Using active learning selection approach for cross-project software defect prediction. Connection Science, 34(1), 1482–1499. https://doi.org/10.1080/09540091.2022.2077913
