The Influence of Input Data Standardization Methods on the Prediction Accuracy of Genetic Programming Generated Classifiers

Amaal R. Al Shorman; Hossam Faris; Pedro A. Castillo; J. J. Merelo

Conference Proceedings (Open Access)

International Joint Conference on Computational Intelligence (2018) 1 79-85

DOI: 10.5220/0006959000790085


Abstract

Genetic programming (GP) is a powerful classification technique. It is interpretable, and it can dynamically build very complex expressions that maximize or minimize a fitness function. It can model very complex problems in Machine Learning, Data Mining, and Pattern Recognition. Nevertheless, GP has a high computational cost. Data standardization, on the other hand, is one of the most important pre-processing steps in machine learning. Its purpose is to bring all input features to a common scale so that they contribute equally to the model. The objective of this paper is to investigate how input data standardization methods influence GP, and in particular its prediction accuracy. Six different input data standardization methods were compared to determine which one achieves the most accurate result at the lowest computational cost. The simulations were run on ten benchmark datasets under three different scenarios (varying the population size and the number of generations). The results showed that the computational efficiency of GP improves considerably when it is coupled with certain standardization methods, specifically the Min-Max method in scenario I and the Vector method in scenarios II and III. In contrast, the Manhattan and Z-Score methods produced the worst results in all three scenarios.
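To make the idea of per-feature standardization concrete, the following is a minimal sketch of the four methods named in the abstract (Min-Max, Z-Score, Vector, Manhattan). The abstract does not give formulas, so the definitions below are the commonly used textbook ones and are an assumption, not necessarily the exact variants used in the paper; the function names and the zero-spread handling are likewise illustrative. The remaining two of the six methods are not named in the abstract and are therefore omitted.

```python
import math
from statistics import mean, pstdev


def min_max(column):
    """Rescale values to [0, 1]: (x - min) / (max - min). Assumed definition."""
    lo, hi = min(column), max(column)
    span = hi - lo
    if span == 0:
        # Constant feature: no spread to rescale, map everything to 0.
        return [0.0 for _ in column]
    return [(x - lo) / span for x in column]


def z_score(column):
    """Center and scale: (x - mean) / population std. Assumed definition."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


def vector_norm(column):
    """Divide by the Euclidean (L2) norm of the feature column. Assumed definition."""
    norm = math.sqrt(sum(x * x for x in column))
    if norm == 0:
        return [0.0 for _ in column]
    return [x / norm for x in column]


def manhattan_norm(column):
    """Divide by the Manhattan (L1) norm of the feature column. Assumed definition."""
    total = sum(abs(x) for x in column)
    if total == 0:
        return [0.0 for _ in column]
    return [x / total for x in column]


def standardize(rows, method):
    """Apply one method independently to every feature (column) of a dataset."""
    columns = [list(col) for col in zip(*rows)]
    scaled = [method(col) for col in columns]
    return [list(row) for row in zip(*scaled)]


if __name__ == "__main__":
    # Two features on very different scales.
    data = [[1, 10], [2, 20], [3, 30]]
    for method in (min_max, z_score, vector_norm, manhattan_norm):
        print(method.__name__, standardize(data, method))
```

Each method is applied column by column, so a feature measured in large units cannot dominate one measured in small units before the data reaches the classifier.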

Cite

Al Shorman, A. R., Faris, H., Castillo, P. A., & Merelo, J. J. (2018). The Influence of Input Data Standardization Methods on the Prediction Accuracy of Genetic Programming Generated Classifiers. In International Joint Conference on Computational Intelligence (Vol. 1, pp. 79–85). Science and Technology Publications, Lda. https://doi.org/10.5220/0006959000790085
