Efficient bulk loading of large high-dimensional indexes

N/ACitations
Citations of this article
6Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Efficient index construction in multidimensional data spaces is important for many knowledge discovery algorithms, because construction times typically must be amortized by performance gains in query processing. In this paper, we propose a generic bulk loading method which allows the application of user-defined split strategies in the index construction. This approach allows the adaptation of the index properties to the requirements of a specific knowledge discovery algorithm. As our algorithm takes into account that large data sets do not fit in main memory, our algorithm is based on external sorting. Decisions of the split strategy can be made according to a sample of the data set which is selected automatically. The sort algorithm is a variant of the well-known Quicksort algorithm, enhanced to work on secondary storage. The index construction has a runtime complexity of O(n log n). We show both analytically and experimentally that the algorithm outperforms traditional index construction methods by large factors.

Cite

CITATION STYLE

APA

Böhm, C., & Kriegel, H. P. (1999). Efficient bulk loading of large high-dimensional indexes. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1676, pp. 251–260). Springer Verlag. https://doi.org/10.1007/3-540-48298-9_27

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free