Empirical Study on the Distribution of Bugs in Software Systems


Abstract

Many earlier studies have shown that the distribution of bugs in software systems follows the Pareto principle, and some have proposed the Pareto distribution (PD) to model it. Several other probability distributions, including the Weibull, Bounded Generalized Pareto, Double Pareto (DP), Log Normal and Yule-Simon distributions, have also been proposed, and different studies have evaluated how well each fits bug data. We investigate this problem further using information-theoretic (criterion-based) approaches to model selection, which handle issues such as overfitting that are prevalent in previous work. Strengthening the model selection procedure and studying a large collection of fault data makes the results more accurate and stable. We conduct experiments on fault data from 74 releases of various open source and proprietary software systems and observe that the DP distribution outperforms all others with statistical significance for proprietary projects. For open source software systems, the three best-performing distributions are the DP, Bounded Generalized Pareto and Weibull distributions; they are significantly better than all others, though there is no significant difference among the three.
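To make the idea of criterion-based model selection concrete, the following is a minimal sketch, not the authors' procedure. It fits two of the candidate distributions named above (Pareto and Log Normal) by maximum likelihood and ranks them with the Akaike information criterion (AIC), one common information-theoretic criterion; the abstract does not say which criteria the study uses. The function names, the synthetic data, and the choice to treat per-module fault counts as continuous positive values are all assumptions made for illustration.

```python
import math
import random


def fit_pareto(xs):
    """Closed-form MLE for a Pareto distribution.

    Returns (x_min, alpha, log_likelihood). x_min is estimated as the
    sample minimum, so the model has two fitted parameters.
    """
    n = len(xs)
    x_min = min(xs)
    alpha = n / sum(math.log(x / x_min) for x in xs)
    loglik = (n * math.log(alpha) + n * alpha * math.log(x_min)
              - (alpha + 1) * sum(math.log(x) for x in xs))
    return x_min, alpha, loglik


def fit_lognormal(xs):
    """Closed-form MLE for a log-normal distribution.

    Returns (mu, sigma, log_likelihood), where mu and sigma describe ln(x).
    """
    logs = [math.log(x) for x in xs]
    n = len(logs)
    mu = sum(logs) / n
    sigma2 = sum((v - mu) ** 2 for v in logs) / n
    loglik = -sum(logs) - 0.5 * n * math.log(2 * math.pi * sigma2) - 0.5 * n
    return mu, math.sqrt(sigma2), loglik


def aic(loglik, k):
    """AIC = 2k - 2 ln L; the 2k term penalizes extra parameters."""
    return 2 * k - 2 * loglik


def rank_models(xs):
    """Return (name, AIC) pairs sorted from best (lowest AIC) to worst."""
    scores = {
        "pareto": aic(fit_pareto(xs)[2], k=2),
        "lognormal": aic(fit_lognormal(xs)[2], k=2),
    }
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical stand-in for fault data: synthetic Pareto samples,
    # not measurements from any real software release.
    rng = random.Random(0)
    faults = [rng.paretovariate(1.5) for _ in range(500)]
    for name, score in rank_models(faults):
        print(f"{name}: AIC = {score:.1f}")
```

Because the penalty term grows with the number of fitted parameters, a more flexible distribution has to improve the likelihood enough to justify its extra parameters, which is how a criterion of this kind guards against overfitting. Candidates without closed-form estimators, such as the Weibull or Double Pareto, would need a numerical optimizer in place of the formulas used here.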

Shriram, C. K., Muthukumaran, K., & Bhanu Murthy, N. L. (2018). Empirical Study on the Distribution of Bugs in Software Systems. International Journal of Software Engineering and Knowledge Engineering, 28(1), 97–122. https://doi.org/10.1142/S0218194018500055
