Data pre-processing is a critical task in the knowledge discovery process because it ensures the quality of the data to be analyzed. One widely studied problem in data pre-processing is the handling of missing values, with the aim of recovering their original values. Numerous studies on missing values have shown that different methods are needed for different types of missing data. In this work, we propose a new method to deal with missing values in data sets where cluster properties exist among the data records. By integrating clustering and regression techniques, the proposed method can predict missing values with higher accuracy. To the best of our knowledge, this is the first work combining regression and clustering analysis to deal with the missing values problem. In an empirical evaluation, the proposed method performed better than other methods on different types of data sets. © 2003 Taylor and Francis Group, LLC.
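The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of combining clustering with regression for imputation, not the authors' method. It assumes k-means for the clustering step, ordinary least squares for the regression step, and a single target column containing the missing values (marked as NaN) while all other columns are fully observed. The names `kmeans`, `fit_linear`, `predict_linear`, and `impute_cluster_regression` are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=100):
    """Plain Lloyd's k-means. Initial centers are evenly spaced rows (a simplification)."""
    centers = X[np.linspace(0, len(X) - 1, k).astype(int)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every row to every center.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            if np.any(labels == j):
                new_centers[j] = X[labels == j].mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def fit_linear(X, y):
    """Least-squares fit of y ~ X with an intercept term."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef

def impute_cluster_regression(data, target_col, k=2):
    """Fill NaNs in target_col using a regression fitted inside each cluster.

    Assumption: only target_col has missing values; other columns are complete.
    """
    data = np.array(data, dtype=float)
    features = np.delete(data, target_col, axis=1)
    y = data[:, target_col]
    missing = np.isnan(y)
    # Cluster all records on the fully observed attributes.
    labels = kmeans(features, k)
    # Fallback model for clusters with too few complete records.
    global_coef = fit_linear(features[~missing], y[~missing])
    for j in range(k):
        train = (labels == j) & ~missing
        fill = (labels == j) & missing
        if not fill.any():
            continue
        enough = train.sum() > features.shape[1]
        coef = fit_linear(features[train], y[train]) if enough else global_coef
        data[fill, target_col] = predict_linear(coef, features[fill])
    return data

# Toy data: two well-separated groups, each with its own linear relationship.
rows = [
    [0.0, 1.0], [0.5, 2.0], [1.0, np.nan], [1.5, 4.0],
    [10.0, -25.0], [10.5, np.nan], [11.0, -28.0], [11.5, -29.5],
]
filled = impute_cluster_regression(rows, target_col=1, k=2)
print(filled)
```

In this toy data, a single regression over all records would blend two different trends, whereas fitting within each cluster lets each group keep its own relationship. That is the intuition behind applying such an approach to data sets with cluster structure.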
Tseng, S. M., Wang, K. H., & Lee, C. I. (2003). A pre-processing method to deal with missing values by integrating clustering and regression techniques. Applied Artificial Intelligence, 17(5–6), 535–544. https://doi.org/10.1080/713827170