An empirical validation of learning schemes using an automated genetic defect prediction framework

Juan Murillo-Morera; Carlos Castro-Herrera; Javier Arroyo; Rubén Fuentes-Fernández

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2016) 10022 LNAI 222-234

DOI: 10.1007/978-3-319-47955-2_19

Abstract

Today, it is common for software projects to collect measurement data throughout their development processes. With these data, defect prediction software can estimate the defect proneness of a software module, with the objective of assisting and guiding software practitioners. With timely and accurate defect predictions, practitioners can focus their limited testing resources on higher-risk areas. This paper reports a benchmarking study in which a genetic algorithm automatically generates and compares different learning schemes (preprocessing + attribute selection + learning algorithm). The performance of the software defect prediction models, measured by AUC (area under the ROC curve), was validated using NASA-MDP and PROMISE data sets. Twelve data sets from NASA-MDP (8) and PROMISE (4) projects were analyzed by running an M × N-fold cross-validation. We used a genetic algorithm to select the components of the learning schemes automatically, and to evaluate and report those with the best performance. In all, 864 learning schemes were studied. The most common learning schemes combined the data preprocessors Log and CoxBox; the attribute selectors Backward Elimination, Best First, and Linear Forward Selection; and the learning algorithms Naïve Bayes, Naïve Bayes Simple, Simple Logistic, Multilayer Perceptron, Logistic, Logit Boost, Bayes Net, and One R. According to statistical analysis, the genetic algorithm showed steady performance and runtime across data sets.
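As an illustration of the approach the abstract describes, the following is a minimal sketch, not the authors' framework. It assumes a learning scheme is encoded as a three-gene chromosome (preprocessor, attribute selector, learner), and it shows a pairwise AUC computation, an M × N-fold split generator, and a small elitist genetic algorithm. The component lists are a subset of the names in the abstract; the function names, the GA operators, and their parameters are hypothetical. In a real setting the `fitness` callable would train each scheme on the training folds and return its mean AUC on the test folds.

```python
import random

# Component labels taken from the abstract; the chromosome layout is an assumption.
PREPROCESSORS = ["Log", "CoxBox"]
SELECTORS = ["BackwardElimination", "BestFirst", "LinearForwardSelection"]
LEARNERS = ["NaiveBayes", "SimpleLogistic", "Logistic", "OneR"]
GENES = [PREPROCESSORS, SELECTORS, LEARNERS]


def auc(labels, scores):
    """AUC as the fraction of (defective, clean) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mxn_folds(n_rows, m, n, rng):
    """Yield (train_idx, test_idx) pairs for M reshuffled repetitions of N-fold CV."""
    for _ in range(m):
        idx = list(range(n_rows))
        rng.shuffle(idx)
        for k in range(n):
            test = idx[k::n]
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            yield train, test


def evolve(fitness, rng, pop_size=6, generations=10, mutation_rate=0.2):
    """Hypothetical GA: keep the better half, refill by crossover plus mutation."""
    pop = [tuple(rng.choice(g) for g in GENES) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(GENES))  # one-point crossover
            child = list(a[:cut] + b[cut:])
            for i, options in enumerate(GENES):
                if rng.random() < mutation_rate:
                    child[i] = rng.choice(options)
            children.append(tuple(child))
        pop = parents + children
    return max(pop, key=fitness)
```

For example, `evolve(my_fitness, random.Random(0))` returns one `(preprocessor, selector, learner)` tuple, where `my_fitness` is any callable that maps such a tuple to a score such as mean AUC over the folds produced by `mxn_folds`.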

Citation (APA)

Murillo-Morera, J., Castro-Herrera, C., Arroyo, J., & Fuentes-Fernández, R. (2016). An empirical validation of learning schemes using an automated genetic defect prediction framework. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10022 LNAI, pp. 222–234). Springer Verlag. https://doi.org/10.1007/978-3-319-47955-2_19
