The out-of-core KNN awakens: The light side of computation force on large datasets

Nitin Chiluka; Anne Marie Kermarrec; Javier Olivares

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2016), vol. 9944 LNCS, pp. 295–310

DOI: 10.1007/978-3-319-46140-3_24

Abstract

K-Nearest Neighbors (KNN) is a crucial tool for many applications, e.g., recommender systems, image classification, and web-related applications. However, KNN is a resource-greedy operation, particularly for large datasets. We focus on the challenge of KNN computation over large datasets on a single commodity PC with limited memory. We propose a novel approach to compute KNN on large datasets by leveraging both disk and main memory efficiently. The main rationale of our approach is to minimize random accesses to disk, maximize sequential accesses to data, and use only the available memory efficiently. We evaluate our approach on large datasets in terms of performance and memory consumption. The evaluation shows that our approach requires only 7% of the time needed by an in-memory baseline to compute a KNN graph.
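
The abstract does not describe the algorithm itself, so the following is only a minimal sketch of the general out-of-core idea it names: keep the data on disk, read it in sequential blocks, and bound what is held in memory. It is a plain brute-force scan, not the authors' method. The file layout (row-major float64), the function names `write_vectors`, `read_blocks`, and `knn_graph`, and the `block_rows` parameter are all assumptions made for illustration.

```python
import heapq
from array import array


def write_vectors(path, vectors):
    """Store vectors row-major as float64 so rows can be read sequentially."""
    with open(path, "wb") as f:
        for v in vectors:
            array("d", v).tofile(f)


def read_blocks(path, dim, block_rows):
    """Yield (first_row_id, rows) by scanning the file front to back."""
    row_bytes = dim * array("d").itemsize
    with open(path, "rb") as f:
        first = 0
        while True:
            chunk = f.read(row_bytes * block_rows)
            if not chunk:
                break
            flat = array("d")
            flat.frombytes(chunk)
            rows = [flat[i:i + dim] for i in range(0, len(flat), dim)]
            yield first, rows
            first += len(rows)


def knn_graph(path, dim, k, block_rows):
    """Brute-force KNN graph that holds at most two blocks in memory."""
    graph = {}
    for q_first, q_rows in read_blocks(path, dim, block_rows):
        # One bounded heap per query row; distances are negated so the
        # heap root is always the current worst (farthest) neighbor.
        heaps = [[] for _ in q_rows]
        for c_first, c_rows in read_blocks(path, dim, block_rows):
            for qi, q in enumerate(q_rows):
                for ci, c in enumerate(c_rows):
                    if q_first + qi == c_first + ci:
                        continue  # a point is not its own neighbor
                    d = sum((a - b) ** 2 for a, b in zip(q, c))
                    item = (-d, c_first + ci)
                    if len(heaps[qi]) < k:
                        heapq.heappush(heaps[qi], item)
                    elif item > heaps[qi][0]:
                        heapq.heapreplace(heaps[qi], item)
        for qi, h in enumerate(heaps):
            # Nearest neighbor first.
            graph[q_first + qi] = [i for _, i in sorted(h, reverse=True)]
    return graph
```

In this sketch the file is only ever read front to back, so disk access stays sequential, and memory use is governed by `block_rows` and `k` rather than by the dataset size. The cost is one full pass over the file per query block, which a real system would need to reduce.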

Cite

Chiluka, N., Kermarrec, A. M., & Olivares, J. (2016). The out-of-core KNN awakens: The light side of computation force on large datasets. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9944 LNCS, pp. 295–310). Springer Verlag. https://doi.org/10.1007/978-3-319-46140-3_24
