A comparative study of several EOF based imputation methods for long gap missing values in a single-site temporal time dependent (SSTTD) air quality (PM10) dataset


Abstract

Missing values are often a major problem in many fields of environmental research, leading to inaccurate predictions and biased analysis results. This study compares the existing Empirical Orthogonal Function (EOF) based imputation method, the EOF mean-centred approach (EOF-mean), with several proposed EOF based methods: EOF-median, EOF-trimmean and the newly applied Regularised Expectation-Maximisation Principal Component Analysis based method (R-EMPCA). The methods are compared on how well they estimate long gap sequences of missing values in a Single Site Temporal Time-Dependent (SSTTD) multivariate air quality (PM10) data set. The study used real PM10 data from the Klang air quality monitoring station. The methods were evaluated through a simulation plan that combined four percentages of missing values (5, 10, 20 and 30) with four long gap sequence lengths (12, 24, 168 and 720 missing hourly points). Based on several performance indicators, including RMSE, MAE, R-Square and AI, R-EMPCA outperformed the other methods. The proposed EOF-median and EOF-trimmean also performed better than the existing EOF-mean method, with EOF-trimmean the best of the three. The methodology and findings of this study offer a solution to the problem of long gap sequences of missing values in SSTTD data sets.
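
The kind of experiment the abstract describes can be illustrated with a minimal sketch. This is not the authors' implementation and it does not cover R-EMPCA or EOF-trimmean. It assumes a generic iterative truncated-SVD (DINEOF-style) EOF imputation, an hourly series reshaped into a days × 24-hours matrix, and a synthetic series in place of the Klang PM10 data. The function names (`long_gap_mask`, `eof_impute`), the number of retained modes and the stopping rule are all hypothetical choices made for illustration.

```python
import numpy as np

def long_gap_mask(n, frac_missing, gap_len, rng):
    """Boolean mask (True = missing) built from contiguous gaps of gap_len hours.
    Gaps may overlap, and the final share can slightly exceed frac_missing."""
    mask = np.zeros(n, dtype=bool)
    target = int(round(frac_missing * n))
    while mask.sum() < target:
        start = rng.integers(0, n - gap_len + 1)
        mask[start:start + gap_len] = True
    return mask

def eof_impute(X, mask, centre=np.nanmean, n_modes=2, n_iter=100, tol=1e-6):
    """Assumed EOF-style imputation: centre each column using observed values,
    then repeatedly refill the missing cells from a rank-n_modes SVD reconstruction."""
    X = X.astype(float).copy()
    X[mask] = np.nan
    centre_vals = centre(X, axis=0)      # column-wise mean or median of observed data
    A = X - centre_vals                  # anomalies
    A[mask] = 0.0                        # initial guess: zero anomaly
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(A, full_matrices=False)
        recon = (U[:, :n_modes] * s[:n_modes]) @ Vt[:n_modes]
        change = np.sqrt(np.mean((recon[mask] - A[mask]) ** 2))
        A[mask] = recon[mask]            # only missing cells are updated
        if change < tol:
            break
    return A + centre_vals

def rmse(y, yhat):
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def mae(y, yhat):
    return float(np.mean(np.abs(y - yhat)))

# Synthetic hourly series with a daily cycle (a stand-in, not real PM10 data).
rng = np.random.default_rng(0)
days, hours = 60, 24
t = np.arange(days * hours)
series = 50 + 15 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 3, t.size)
truth = series.reshape(days, hours)

# One cell of the simulation plan: 10% missing, gaps of 24 hours.
mask = long_gap_mask(t.size, 0.10, 24, rng).reshape(days, hours)

scores, filled_by_method = {}, {}
for name, centre in [("EOF-mean", np.nanmean), ("EOF-median", np.nanmedian)]:
    filled = eof_impute(truth, mask, centre=centre)
    filled_by_method[name] = filled
    scores[name] = (rmse(truth[mask], filled[mask]), mae(truth[mask], filled[mask]))
    print(name, "RMSE=%.3f MAE=%.3f" % scores[name])
```

Repeating the last loop over each missing percentage and gap length would give a grid of scores comparable in shape to the simulation plan. In this sketch, a gap that removes a whole day leaves that row with no observed values, so its cells fall back to the column centre.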

Cite

Ghazali, S. M., Shaadan, N., & Idrus, Z. (2021). A comparative study of several EOF based imputation methods for long gap missing values in a single-site temporal time dependent (SSTTD) air quality (PM10) dataset. Pertanika Journal of Science and Technology, 29(4), 2625–2643. https://doi.org/10.47836/PJST.29.4.21
