Software change-proneness prediction, which predicts whether class files in a project will change in the next release, can help software developers allocate resources more effectively and reduce software maintenance costs. Previous studies found that change-proneness prediction does not work well with limited training data, especially for new projects. To address this issue, cross-project change-proneness prediction has been proposed: it builds a prediction model from the sufficient data available in other projects, i.e. the source projects, and predicts the change-prone files in a target project. However, cross-project prediction is unstable because metrics differ widely between source projects, which makes it hard to classify change-prone files. To improve cross-project prediction, we propose a Deep Metric Learning (DML) model that reduces these feature differences before file classification. Specifically, DML maps files from the source projects into a space where files of the same category, e.g. change-prone files, are pulled closer together while files of different categories are pushed further apart. We also use an over-sampling approach to handle the highly imbalanced dataset during model training. We evaluate our model on 20 change-proneness datasets and compare it with five cross-project change-proneness models. Results indicate that the proposed model can substantially improve the performance of change-proneness prediction.
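The two ideas in the abstract, pulling same-category files together in an embedding space and over-sampling the minority class, can be sketched as follows. This is only an illustration under assumptions: the abstract does not state which loss function, network, or over-sampling method the authors use, so a pairwise contrastive loss and random over-sampling are swapped in as common stand-ins. The names `contrastive_loss`, `random_oversample`, and `margin` are hypothetical.

```python
import numpy as np

def contrastive_loss(emb, labels, margin=1.0):
    """Mean pairwise contrastive loss (an assumed stand-in for the DML objective)."""
    emb = np.asarray(emb, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between embedded files.
    diff = emb[:, None, :] - emb[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    same = labels[:, None] == labels[None, :]
    # Same-category pairs are penalized by their squared distance;
    # different-category pairs are penalized only while closer than `margin`.
    pair_loss = np.where(same, dist ** 2, np.maximum(0.0, margin - dist) ** 2)
    iu = np.triu_indices(len(labels), k=1)  # count each unordered pair once
    return pair_loss[iu].mean()

def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows at random until all classes are the same size."""
    X, y = np.asarray(X), np.asarray(y)
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    idx = []
    for c, n in zip(classes, counts):
        members = np.flatnonzero(y == c)
        idx.extend(members)
        # Draw the missing rows with replacement from the same class.
        idx.extend(rng.choice(members, size=target - n, replace=True))
    idx = np.array(idx)
    return X[idx], y[idx]
```

In a full pipeline, a loss of this kind would be minimized over the parameters of a neural network that produces `emb` from file metrics, and a classifier would then be trained on the balanced, embedded source data.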
CITATION STYLE
Ge, Y., Chen, M., Liu, C., Chen, F., Huang, S., & Wang, H. (2018). Deep metric learning for software change-proneness prediction. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11266 LNCS, pp. 287–300). Springer Verlag. https://doi.org/10.1007/978-3-030-02698-1_25