A wide range of applications require efficient management of sorted data on external storage. Recently, trie-based data structures have attracted considerable attention in academia as a competitive alternative to the ubiquitous B-tree. In this paper, we present a novel approach for bulk loading disk-based trie structures (also known as B-tries). Our algorithm first sorts the raw data and then builds the B-trie directly from the sorted data. Data in the output structure are compacted and physically ordered, which enables efficient sequential access. We test the proposed algorithm with both real-world and synthetic datasets. Experimental results show that our algorithm dramatically outperforms the baseline insertion method when the dataset is large enough and is almost always superior to the basic sort-and-insert algorithm. © 2014 Springer International Publishing Switzerland.
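The sort-then-build idea can be illustrated with a minimal in-memory sketch. This is not the paper's algorithm: the abstract gives no details of the on-disk layout, so the names `TrieNode`, `bulk_load`, and `iterate` are hypothetical, and the trie is a plain character trie held in memory. What the sketch tries to show is why sorted input helps: each new key shares a prefix with the previous one, so only the rightmost branch is ever modified, and nodes that fall off that branch are final.

```python
from typing import Dict, Iterable, Iterator, List


class TrieNode:
    """Hypothetical in-memory trie node (one character per edge)."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.terminal = False  # True if a key ends at this node


def common_prefix_len(a: str, b: str) -> int:
    """Length of the longest common prefix of a and b."""
    n = min(len(a), len(b))
    i = 0
    while i < n and a[i] == b[i]:
        i += 1
    return i


def bulk_load(keys: Iterable[str]) -> TrieNode:
    """Build a trie from keys by sorting first, then appending in order."""
    root = TrieNode()
    # path[d] is the node at depth d on the current rightmost branch.
    path: List[TrieNode] = [root]
    prev = ""
    for key in sorted(set(keys)):
        keep = common_prefix_len(prev, key)
        # Nodes deeper than the shared prefix are never touched again.
        # A disk-based builder could write them out sequentially here
        # (assumption: this sketch simply drops them from the path).
        del path[keep + 1:]
        for ch in key[keep:]:
            node = TrieNode()
            path[-1].children[ch] = node
            path.append(node)
        path[-1].terminal = True
        prev = key
    return root


def iterate(node: TrieNode, prefix: str = "") -> Iterator[str]:
    """Yield stored keys; children were inserted in sorted order."""
    if node.terminal:
        yield prefix
    for ch, child in node.children.items():
        yield from iterate(child, prefix + ch)
```

Because children are appended in ascending order, a traversal such as `list(iterate(bulk_load(["dog", "cat", "car"])))` visits keys in sorted order without any per-node sorting, which loosely mirrors the physically ordered output described above.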
Ma, D., & Feng, J. (2014). A Generic approach for bulk loading trie-based index structures on external storage. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8485 LNCS, pp. 55–66). Springer Verlag. https://doi.org/10.1007/978-3-319-08010-9_8