Impact of imbalanced data on the performance of software defect prediction classifiers

Abstract

Software defect prediction plays an important role in analysing software quality and balancing software cost. However, project managers and software engineers have little guidance on which classifier to select. First, a method for constructing data sets with imbalanced class distributions is proposed. Then, the Matthews correlation coefficient (MCC) is used to measure the performance of different classifiers, and the coefficient of variation is used to evaluate how stable each classifier is on imbalanced data. Finally, an experiment is conducted on 8 common classifiers and 12 publicly available, widely used data sets. The results show that NaiveBayes remains stable when the imbalance rate of the data sets changes significantly. The experimental results provide a basis for project managers and software engineers to select classifiers.
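
The two measures named in the abstract can be illustrated with a minimal Python sketch. It uses the standard textbook definitions of the Matthews correlation coefficient and the coefficient of variation; the abstract does not state the exact formulas the authors applied, so the choice of population standard deviation, the zero-denominator convention, the function names, and the confusion-matrix counts below are all assumptions made for illustration, not results from the paper.

```python
import math
import statistics


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention (an assumption here): report 0 when a row or
        # column of the confusion matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def coefficient_of_variation(values):
    """Standard deviation divided by the absolute mean.

    Uses the population standard deviation; the paper may use the
    sample version instead.
    """
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.pstdev(values) / abs(mean)


if __name__ == "__main__":
    # Hypothetical (tp, tn, fp, fn) counts for one classifier evaluated at
    # three imbalance rates. These numbers are invented for illustration.
    confusion_by_rate = [
        (40, 45, 5, 10),
        (30, 120, 10, 20),
        (15, 250, 20, 15),
    ]
    scores = [mcc(*counts) for counts in confusion_by_rate]
    print("MCC per imbalance rate:", scores)
    print("Coefficient of variation:", coefficient_of_variation(scores))
```

Under these definitions, a lower coefficient of variation across imbalance rates means the classifier's MCC changes less as the class distribution shifts, which is the sense in which the abstract describes a classifier as stable.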

Cite

Wang, L., Wang, W., Liu, B., & Geng, S. (2019). Impact of imbalanced data on the performance of software defect prediction classifiers. In Journal of Physics: Conference Series (Vol. 1345). Institute of Physics Publishing. https://doi.org/10.1088/1742-6596/1345/2/022026
