Correlated quality metrics extracted from a source code repository can be utilized to design a model to automatically predict defects in a software system. It is obvious that the extracted metrics will result in a highly unbalanced data, since the number of defects in a good quality software system should be far less than the number of normal instances. It is also a fact that the selection of the best discriminating features significantly improves the robustness and accuracy of a prediction model. Therefore, the contribution of this paper is twofold, first it selects the best discriminating features that help in accurately predicting a defect in a software component. Secondly, a cost-sensitive logistic regression and decision tree ensemble-based prediction models are applied to the best discriminating features for precisely predicting a defect in a software component. The proposed models are compared with the most recent schemes in the literature in terms of accuracy, area under the curve, and recall. The models are evaluated using 11 datasets and it is evident from the results and analysis that the performance of the proposed prediction models outperforms the schemes in the literature.
CITATION STYLE
Ali, A., Khan, N., Abu-Tair, M., Noppen, J., McClean, S., & McChesney, I. (2021). Discriminating features-based cost-sensitive approach for software defect prediction. Automated Software Engineering, 28(2). https://doi.org/10.1007/s10515-021-00289-8
Mendeley helps you to discover research relevant for your work.