Investigating tree family machine learning techniques for a predictive system to unveil software defects

Abstract

Predicting software defects early in the software development life cycle remains a critical task. Defect prediction and correction contribute to assuring the quality of software systems and have been an integral research topic in recent years. Quickly forecasting defective modules helps the development team use existing resources efficiently and effectively to deliver high-quality software products within a short timeline. To date, several researchers have developed defect prediction models using statistical and machine learning techniques, which are effective approaches for pinpointing defective modules. Tree family machine learning techniques are considered among the best and most commonly used supervised learning methods. In this study, different tree family machine learning techniques are employed for software defect prediction using ten benchmark datasets. These techniques include Credal Decision Tree (CDT), Cost-Sensitive Decision Forest (CS-Forest), Decision Stump (DS), Forest by Penalizing Attributes (Forest-PA), Hoeffding Tree (HT), Decision Tree (J48), Logistic Model Tree (LMT), Random Forest (RF), Random Tree (RT), and REP-Tree (REP-T). The performance of each technique is evaluated using several measures: mean absolute error (MAE), relative absolute error (RAE), root mean squared error (RMSE), root relative squared error (RRSE), specificity, precision, recall, F-measure (FM), G-measure (GM), Matthews correlation coefficient (MCC), and accuracy. Overall, the results favor the RF technique, which produced the best results in terms of both reducing error rates and increasing accuracy on five datasets: AR3, PC1, PC2, PC3, and PC4. The average accuracy achieved by RF is 90.2238%. The comprehensive results of this study can serve as a reference point for other researchers: any claimed improvement in prediction from a new model, technique, or framework can be benchmarked and verified against them.
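The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code, and several details are assumptions: RAE and RRSE are taken relative to a predictor that always outputs the mean of the actual values, G-measure is taken as the harmonic mean of recall and specificity (other definitions exist, and the abstract does not state which one is used), and the function names and sample numbers are hypothetical.

```python
import math


def error_measures(actual, predicted):
    """MAE, RMSE, RAE, RRSE for 0/1 labels vs. predicted scores.

    Assumption: RAE and RRSE are normalized by the error of a baseline
    that always predicts the mean of `actual`. No guard is included for
    a constant `actual` (which would divide by zero).
    """
    n = len(actual)
    mean_actual = sum(actual) / n
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    base_abs = sum(abs(a - mean_actual) for a in actual)
    base_sq = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "RAE": abs_err / base_abs,
        "RRSE": math.sqrt(sq_err / base_sq),
    }


def classification_measures(tp, fp, tn, fn):
    """Confusion-matrix measures, treating 'defective' as the positive class.

    Assumption: G-measure is the harmonic mean of recall and specificity.
    No guard is included for empty rows or columns of the matrix.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    g_measure = 2 * recall * specificity / (recall + specificity)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "specificity": specificity,
        "precision": precision,
        "recall": recall,
        "FM": f_measure,
        "GM": g_measure,
        "MCC": mcc,
        "accuracy": accuracy,
    }


# Hypothetical numbers for illustration only; they are not from the paper.
print(error_measures([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9]))
print(classification_measures(tp=8, fp=2, tn=85, fn=5))
```

Because defect datasets are typically imbalanced, accuracy alone can look high even when few defective modules are found, which is one reason to read it alongside recall, G-measure, and MCC.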

Cite

Naseem, R., Khan, B., Ahmad, A., Almogren, A., Jabeen, S., Hayat, B., & Shah, M. A. (2020). Investigating tree family machine learning techniques for a predictive system to unveil software defects. Complexity, 2020. https://doi.org/10.1155/2020/6688075
