Improving software-quality predictions with data sampling and boosting

94Citations
Citations of this article
142Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Software-quality data sets tend to fall victim to the class-imbalance problem that plagues so many other application domains. The majority of faults in a software system, particularly high-assurance systems, usually lie in a very small percentage of the software modules. This imbalance between the number of fault-prone (fp) and non-fp (nfp) modules can have a severely negative impact on a data-mining technique's ability to differentiate between the two. This paper addresses the classimbalance problem as it pertains to the domain of software-quality prediction. We present a comprehensive empirical study examining two different methodologies, data sampling and boosting, for improving the performance of decision-tree models designed to identify fp software modules. This paper applies five datasampling techniques and boosting to 15 software-quality data sets of different sizes and levels of imbalance. Nearly 50 000 models were built for the experiments contained in this paper. Our results show that while data-sampling techniques are very effective in improving the performance of such models, boosting almost always outperforms even the best data-sampling techniques. This significant result, which, to our knowledge, has not been previously reported, has important consequences for practitioners developing software-quality classification models. © 2009 IEEE.

Cite

CITATION STYLE

APA

Seiffert, C., Khoshgoftaar, T. M., & Van Hulse, J. (2009). Improving software-quality predictions with data sampling and boosting. IEEE Transactions on Systems, Man, and Cybernetics Part A:Systems and Humans, 39(6), 1283–1294. https://doi.org/10.1109/TSMCA.2009.2027131

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free