Heterogeneous ensemble imputation for software development effort estimation

Abstract

Choosing an appropriate missing data (MD) imputation technique for a given software development effort estimation (SDEE) technique is not a trivial task. The impact of MD imputation on the estimation output depends on both the dataset and the SDEE technique used, and no imputation technique is best in all contexts. An attractive solution is therefore to use more than one imputation technique and combine their results into a final imputation outcome. This concept is called ensemble imputation and can help to significantly improve estimation accuracy. This paper develops and evaluates a heterogeneous ensemble imputation whose members are four single imputation techniques: K-Nearest Neighbors (KNN), Expectation Maximization (EM), Support Vector Regression (SVR), and Decision Trees (DT). The impact of the ensemble imputation was evaluated and compared with that of the four single imputation techniques on the accuracy, measured in terms of the standardized accuracy criterion, of four SDEE techniques: Case-Based Reasoning (CBR), Multilayer Perceptron (MLP), Support Vector Regression (SVR), and Reduced Error Pruning Tree (REPTree). The Wilcoxon statistical test was also performed to assess whether the results are significant. All empirical evaluations were carried out over six datasets: ISBSG, China, COCOMO81, Desharnais, Kemerer, and Miyazaki. Results show that using heterogeneous ensemble-based imputation instead of single imputation significantly improved the accuracy of the four SDEE techniques. Indeed, the ensemble imputation technique was ranked either first or second in all contexts.
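
The ensemble idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not say how the members' outputs are combined, so a plain average is assumed here. Only the KNN member is implemented, and mean and median imputation are simple stand-ins for the EM, SVR, and DT members. All function names are hypothetical, and the sketch assumes a single column with missing values (marked `None`) while the other columns are complete.

```python
from statistics import mean, median


def impute_mean(rows, col):
    """Stand-in member: propose the observed column mean for each missing cell."""
    fill = mean(r[col] for r in rows if r[col] is not None)
    return {i: fill for i, r in enumerate(rows) if r[col] is None}


def impute_median(rows, col):
    """Stand-in member: propose the observed column median for each missing cell."""
    fill = median(r[col] for r in rows if r[col] is not None)
    return {i: fill for i, r in enumerate(rows) if r[col] is None}


def impute_knn(rows, col, k=2):
    """KNN member: average `col` over the k nearest rows that have it observed.

    Distance is squared Euclidean over all other columns, which are
    assumed to be complete.
    """
    donors = [r for r in rows if r[col] is not None]
    proposals = {}
    for i, r in enumerate(rows):
        if r[col] is not None:
            continue

        def dist(d, r=r):
            return sum((d[j] - r[j]) ** 2 for j in range(len(r)) if j != col)

        nearest = sorted(donors, key=dist)[:k]
        proposals[i] = mean(d[col] for d in nearest)
    return proposals


def ensemble_impute(rows, col, imputers):
    """Average the members' proposals per missing cell (assumed combination rule)."""
    proposals = [imputer(rows, col) for imputer in imputers]
    filled = [list(r) for r in rows]  # copy so the input is left untouched
    for i in proposals[0]:
        filled[i][col] = mean(p[i] for p in proposals)
    return filled


# Toy data: column 0 is a size-like feature, column 1 is effort with one gap.
rows = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, None],
    [4.0, 40.0],
    [10.0, 100.0],
]
filled = ensemble_impute(rows, 1, [impute_mean, impute_median, impute_knn])
print(filled[2])
```

In this toy case the outlying last row pulls the mean proposal upward, while the median and KNN proposals stay closer to the neighbouring rows. Averaging the members dampens the effect of any single technique, which is the motivation the abstract gives for combining them.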

Cite

Abnane, I., Idri, A., Hosni, M., & Abran, A. (2021). Heterogeneous ensemble imputation for software development effort estimation. In PROMISE 2021 - Proceedings of the 17th International Conference on Predictive Models and Data Analytics in Software Engineering, co-located with ESEC/FSE 2021 (pp. 1–10). Association for Computing Machinery, Inc. https://doi.org/10.1145/3475960.3475984
