The conventional K nearest neighbor (KNN) classifier struggles with data that are high-dimensional and that mix continuous and categorical variables. To address this issue, we propose an improved KNN classifier based on principal component analysis of mixed data (PCAmix). First, PCAmix is used to preprocess the high-dimensional mixed data. Then, a KNN classifier is applied to the preprocessed data to classify objects described by a high-dimensional mixture of continuous and categorical variables. We evaluated the method on five mixed datasets from the UCI Machine Learning Repository and compared the results with baseline approaches. The experimental results showed that the proposed method achieved better classification performance than the baselines on high-dimensional mixed data.
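The two-stage pipeline described above (a mixed-data projection followed by KNN) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: PCAmix is approximated in NumPy by standardizing the continuous columns, weighting the centered category indicators by the inverse square root of each level's proportion, and taking an SVD of the combined matrix. The function names (`fit_mixed_pca`, `transform`, `knn_predict`), the number of components, the value of k, and the toy data are all hypothetical.

```python
import numpy as np
from collections import Counter


def _indicators(X_cat, levels):
    """One 0/1 column per category level seen during fitting."""
    cols = [[[float(v == lv) for lv in levels[j]] for v in X_cat[:, j]]
            for j in range(X_cat.shape[1])]
    return np.hstack([np.array(c) for c in cols])


def _scaled_matrix(model, X_num, X_cat):
    """Standardize continuous columns; center and weight the indicators."""
    Z_num = (X_num - model["mu"]) / model["sd"]
    G = _indicators(X_cat, model["levels"])
    # Rare levels get larger weights, as in factor analysis of mixed data.
    Z_cat = (G - model["prop"]) / np.sqrt(model["prop"])
    return np.hstack([Z_num, Z_cat])


def fit_mixed_pca(X_num, X_cat, n_components):
    """Fit the PCAmix-style projection (an approximation, see lead-in)."""
    sd = X_num.std(axis=0)
    levels = [np.unique(X_cat[:, j]) for j in range(X_cat.shape[1])]
    model = {
        "mu": X_num.mean(axis=0),
        "sd": np.where(sd == 0, 1.0, sd),  # guard against constant columns
        "levels": levels,
        "prop": _indicators(X_cat, levels).mean(axis=0),
    }
    Z = _scaled_matrix(model, X_num, X_cat)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    model["V"] = Vt[:n_components].T  # leading principal directions
    return model


def transform(model, X_num, X_cat):
    """Project mixed data onto the fitted components."""
    return _scaled_matrix(model, X_num, X_cat) @ model["V"]


def knn_predict(train_scores, y_train, test_scores, k):
    """Plain KNN with Euclidean distance and majority vote."""
    preds = []
    for q in test_scores:
        dist = np.linalg.norm(train_scores - q, axis=1)
        nearest = np.argsort(dist)[:k]
        votes = Counter(int(label) for label in y_train[nearest])
        preds.append(votes.most_common(1)[0][0])
    return preds


# Synthetic toy data (hypothetical): two continuous and two categorical columns.
rng = np.random.default_rng(0)
X_num = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(6, 1, (20, 2))])
X_cat = np.array(
    [["red", "s" if i % 2 == 0 else "t"] for i in range(20)]
    + [["blue", "s" if i % 2 == 0 else "t"] for i in range(20)]
)
y = np.array([0] * 20 + [1] * 20)

model = fit_mixed_pca(X_num, X_cat, n_components=2)
train_scores = transform(model, X_num, X_cat)

Q_num = np.array([[0.0, 0.0], [6.0, 6.0]])
Q_cat = np.array([["red", "s"], ["blue", "t"]])
preds = knn_predict(train_scores, y, transform(model, Q_num, Q_cat), k=3)
print(preds)
```

The projection is fitted on the training data only and then reused for the query points, so the neighbor search happens in the same low-dimensional space for both. A category level that was not seen during fitting simply maps to an all-zero indicator row in this sketch; a real application would need to decide how to handle such cases.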
CITATION STYLE
Tuerhong, G., Wushouer, M., & Zhang, D. (2021). An Improved K Nearest Neighbor Classifier for High-Dimensional and Mixture Data. In Journal of Physics: Conference Series (Vol. 1813). IOP Publishing Ltd. https://doi.org/10.1088/1742-6596/1813/1/012026