Defect prediction from static code features: Current results, limitations, new approaches

Tim Menzies; Zach Milton; Burak Turhan; Bojan Cukic; Yue Jiang; Ayşe Bener

Journal Article

Automated Software Engineering (2010) 17(4) 375-407

DOI: 10.1007/s10515-010-0069-5

374 Citations

289 Readers

Abstract

Building quality software is expensive, and software quality assurance (QA) budgets are limited. Data miners can learn defect predictors from static code features, and these predictors can be used to control QA resources, e.g. to focus on the parts of the code predicted to be more defective. Recent results show that better data mining technology is not leading to better defect predictors. We hypothesize that we have reached the limits of the standard learning goal of maximizing "AUC(pd, pf)", i.e. the area under the curve of probability of false alarm versus probability of detection. Accordingly, we explore changing the standard goal. Learners that maximize "AUC(effort, pd)" find the smallest set of modules that contain the most errors. WHICH is a meta-learner framework that can be quickly customized to different goals. When customized to AUC(effort, pd), WHICH outperforms all the data mining methods studied here. More importantly, measured in terms of this new goal, certain widely used learners perform much worse than simple manual methods. Hence, we advise against the indiscriminate use of learners. Learners must be chosen and customized to the goal at hand. With the right architecture (e.g. WHICH), tuning a learner to specific local business goals can be a simple task. © Springer Science+Business Media, LLC 2010.
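
To make the AUC(effort, pd) goal concrete, the sketch below ranks modules by a predicted risk score, inspects them in that order, and integrates probability of detection against cumulative inspection effort. It is an illustration only, not the WHICH implementation from the paper. The function name `auc_effort_pd`, the use of lines of code as the effort measure, and the trapezoidal integration are all assumptions made for this example.

```python
from typing import List, Tuple


def auc_effort_pd(modules: List[Tuple[int, bool]], scores: List[float]) -> float:
    """Area under the (effort, pd) curve for a given ranking of modules.

    modules: (lines_of_code, is_defective) per module. Using LOC as the
             effort proxy is an assumption of this sketch.
    scores:  predicted risk per module; higher scores are inspected first.
    """
    if len(modules) != len(scores):
        raise ValueError("modules and scores must have the same length")

    total_loc = sum(loc for loc, _ in modules)
    total_defective = sum(1 for _, defective in modules if defective)
    if total_loc == 0 or total_defective == 0:
        raise ValueError("need non-zero total LOC and at least one defective module")

    # Inspect modules from highest to lowest predicted risk.
    order = sorted(range(len(modules)), key=lambda i: scores[i], reverse=True)

    area = 0.0
    effort = 0.0  # fraction of all LOC inspected so far (x axis)
    pd = 0.0      # fraction of defective modules found so far (y axis)
    for i in order:
        loc, defective = modules[i]
        next_effort = effort + loc / total_loc
        next_pd = pd + (1.0 if defective else 0.0) / total_defective
        # Trapezoid between the previous and current point on the curve.
        area += (next_effort - effort) * (pd + next_pd) / 2.0
        effort, pd = next_effort, next_pd
    return area


if __name__ == "__main__":
    # Hypothetical data: one small defective module, one large clean module.
    modules = [(10, True), (90, False)]
    print(auc_effort_pd(modules, scores=[1.0, 0.0]))  # defective module ranked first
    print(auc_effort_pd(modules, scores=[0.0, 1.0]))  # defective module ranked last
```

A ranking that puts small, defective modules first reaches a high pd after little effort and so scores close to 1. A ranking that spends most of the effort on large, clean modules scores close to 0, even when its classification accuracy is identical.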

Citation (APA)

Menzies, T., Milton, Z., Turhan, B., Cukic, B., Jiang, Y., & Bener, A. (2010). Defect prediction from static code features: Current results, limitations, new approaches. Automated Software Engineering, 17(4), 375–407. https://doi.org/10.1007/s10515-010-0069-5
