The Pearson Correlation Coefficient (PCC) and Principal Component Analysis (PCA) are commonly used for linear variable selection. PCC has been used extensively because it is simple and quantifies the degree of correlation between input and output variables. PCA, meanwhile, has been used to identify high-variance variables that influence the output variable. However, the use of linear variable selection methods in non-linear modelling, such as artificial neural networks (ANN), is questionable. This work analyses the acceptability of PCC and PCA for variable selection in ANN modelling of the coagulation process in water treatment. ANN models were developed to predict coagulant dosage, treated water (TW) turbidity, TW pH and residual aluminium. To assess the validity of the inputs selected via PCC and PCA, an exhaustive search over variable subsets was carried out for comparison. The results showed that the variables selected using PCA did not improve ANN model development. Variables selected by PCC, in contrast, were used successfully for all ANNs developed except the one predicting TW pH. The results also demonstrated that PCC and PCA are incapable of capturing the collective effects of variables on the output parameter.
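The limitation regarding collective effects can be illustrated with a minimal sketch on synthetic data. Nothing here comes from the paper's dataset or code: the helper names `pearson_scores` and `pca_explained_variance_ratio`, the three inputs, and the target formula are all hypothetical, chosen only so that one input acts linearly while the other two act solely through their product.

```python
import numpy as np

def pearson_scores(X, y):
    """Absolute Pearson correlation between each column of X and y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    num = Xc.T @ yc
    den = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(num / den)

def pca_explained_variance_ratio(X):
    """Fraction of total variance carried by each principal component
    of the standardized inputs (computed from singular values)."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    s = np.linalg.svd(Xs, compute_uv=False)
    var = s ** 2
    return var / var.sum()

# Synthetic data (assumption, not the paper's measurements).
rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 3))

# x0 affects y linearly; x1 and x2 affect y only jointly, via their product.
y = 2.0 * X[:, 0] + 3.0 * X[:, 1] * X[:, 2] + 0.1 * rng.normal(size=n)

pcc = pearson_scores(X, y)
evr = pca_explained_variance_ratio(X)

print("abs PCC per input:", np.round(pcc, 3))
print("PCA explained variance ratio:", np.round(evr, 3))
```

In this toy setting the PCC of x1 and x2 with y is close to zero even though their product accounts for most of the variance of y, so a PCC threshold would discard both. The PCA ratios are computed from the inputs alone and carry no information about y, which is one plausible reason a purely variance-based selection may not help a non-linear model.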
Citation
Jayaweera, C. D., & Aziz, N. (2018). Reliability of Principal Component Analysis and Pearson Correlation Coefficient, for Application in Artificial Neural Network Model Development, for Water Treatment Plants. In IOP Conference Series: Materials Science and Engineering (Vol. 458). Institute of Physics Publishing. https://doi.org/10.1088/1757-899X/458/1/012076