Most researchers conduct defect detection under the assumption that the training data and future test data share the same feature space and the same distribution. In practical applications, however, data sets come from different domains and follow different distributions. Sometimes local data in the target project are limited, and the data are often affected by noise. In these cases, the performance of a software defect detection model is uncertain. First, we introduce the concept of data complexity from the data mining field into software engineering. Second, we investigate data complexity measures on public software data sets to determine which complexity metrics are appropriate for defect detection. Finally, we analyze the relationship between complexity metrics and model performance to gain insight into the effects of data complexity on defect detection. We are optimistic that our method can provide decision-making support for the management and design of detection models.
Ma, Y., Li, Y., Lu, J., Sun, P., Sun, Y., & Zhu, X. (2018). Data complexity analysis for software defect detection. International Journal of Performability Engineering, 14(8), 1695–1704. https://doi.org/10.23940/ijpe.18.08.p5.16951704