The k-nearest neighbor (kNN) classification algorithm is a prediction model widely used in real-life applications such as healthcare, finance, computer vision, personalized recommendation, and precision marketing. The era of data explosion has brought a significant increase in feature dimensionality, which in turn heightens privacy concerns over the available samples and unlabeled data in machine learning applications. In this paper, we present a secure kNN classification protocol with low communication overhead that handles high-dimensional features given as real numbers. First, to handle real-valued features, we develop a data conversion algorithm for use with the chosen fully homomorphic encryption scheme. This conversion algorithm is generic and applies to other algorithms that need to process real numbers under the same fully homomorphic scheme. Second, we present a privacy-preserving Euclidean distance protocol (PPEDP), which computes the Euclidean distance between two points given as real numbers in a high-dimensional space. Then, based on the novel PPEDP and oblivious transfer, we propose a new classification approach, the efficient secure kNN classification protocol (ESkNN), which has low communication overhead and suits sample sets with high-dimensional, real-valued features. Moreover, we implement ESkNN in C++. Experimental results show that ESkNN is several orders of magnitude faster than existing work and scales up to 18 000 feature dimensions in a memory-limited environment.
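To make the idea of converting real numbers for integer-only arithmetic concrete, the following is a minimal plaintext sketch. It is not the paper's conversion algorithm or PPEDP, whose details the abstract does not give. It assumes a common fixed-point technique: multiply each real value by a scale factor and round, then compute the squared Euclidean distance using only additions, subtractions, and multiplications, which are the operations a fully homomorphic scheme typically supports. The names `SCALE`, `encode`, `squared_distance_encoded`, and `decode_squared` are hypothetical, and no encryption is performed here.

```python
# Plaintext illustration only: fixed-point encoding is an assumed technique,
# not the conversion algorithm described in the paper.

SCALE = 10**4  # hypothetical precision: four decimal digits


def encode(x: float, scale: int = SCALE) -> int:
    """Map a real number to an integer by fixed-point scaling."""
    return round(x * scale)


def squared_distance_encoded(a: list[int], b: list[int]) -> int:
    """Squared Euclidean distance using only integer add, subtract, multiply."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def decode_squared(d: int, scale: int = SCALE) -> float:
    """Undo the scaling; a squared distance carries the scale factor twice."""
    return d / (scale * scale)


p = [0.5, 1.25, -3.0]
q = [1.5, 0.25, 1.0]

enc_p = [encode(v) for v in p]
enc_q = [encode(v) for v in q]

d2 = squared_distance_encoded(enc_p, enc_q)
print(decode_squared(d2))  # 18.0
```

Because ranking neighbors by squared distance gives the same order as ranking by distance, a sketch like this can skip the square root, which is expensive to evaluate on encrypted data.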
CITATION STYLE
Sun, M., & Yang, R. (2020). An efficient secure k nearest neighbor classification protocol with high-dimensional features. International Journal of Intelligent Systems, 35(11), 1791–1813. https://doi.org/10.1002/int.22272