A heuristically optimized partitioning strategy on Elias-Fano index

Abstract

The inverted index is the preferred data structure for query processing in large information systems, and its compression techniques have long been studied to mitigate the trade-off between space occupancy and decompression time. During compression, partitioning a posting list into blocks that align with its clustered distribution can effectively minimize the compressed size while keeping partitions separately accessible. Traditional partitioning strategies that use fixed-size blocks tend to be easy to implement, but their compression effectiveness is vulnerable to outliers. Recently, researchers have begun to apply dynamic programming to determine optimal partitions with variable-size blocks. However, these partitioning strategies sacrifice too much compression time. In this paper, we first compare the performance of existing encoders on the space-time trade-off curve. We then present a faster algorithm that heuristically computes optimal partitions for the state-of-the-art Partitioned Elias-Fano index, taking compression time into account while maintaining the same approximation guarantees. Experimental results on the TREC GOV2 document collection show that our method achieves a significant improvement over the original version.
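
To make the contrast between fixed-size and variable-size blocks concrete, here is a minimal Python sketch. It is not the algorithm from the paper, nor the approximation algorithm used by the Partitioned Elias-Fano index. It assumes a simplified cost model: the standard Elias-Fano upper bound of n * (ceil(log2(u / n)) + 2) bits for n values in a universe of size u, plus a hypothetical fixed per-block overhead (BLOCK_OVERHEAD_BITS). The optimal partition is found with a plain O(n^2) dynamic program, which also shows why exact optimization costs so much compression time. All function names are illustrative.

```python
import math

# Hypothetical per-block metadata cost (assumption, not from the paper).
BLOCK_OVERHEAD_BITS = 64


def ef_bits(n, u):
    """Upper bound on the Elias-Fano size, in bits, of n strictly
    increasing integers drawn from a universe of size u (u >= n)."""
    low_bits = max(0, math.ceil(math.log2(u / n)))
    return n * (low_bits + 2)


def block_bits(postings, i, j):
    """Cost of encoding postings[i:j] as one block, with values taken
    relative to the element just before the block (simplified model)."""
    base = postings[i - 1] + 1 if i > 0 else 0
    universe = postings[j - 1] - base + 1
    return ef_bits(j - i, universe) + BLOCK_OVERHEAD_BITS


def fixed_partition_bits(postings, block_size):
    """Total cost when the list is cut into fixed-size blocks."""
    n = len(postings)
    return sum(block_bits(postings, i, min(i + block_size, n))
               for i in range(0, n, block_size))


def optimal_partition(postings):
    """Exact O(n^2) dynamic program over all block end positions.
    Returns (total bits, list of block end indices)."""
    n = len(postings)
    best = [0] + [math.inf] * n   # best[j]: min cost of postings[:j]
    prev = [0] * (n + 1)          # prev[j]: start of the last block
    for j in range(1, n + 1):
        for i in range(j):
            cost = best[i] + block_bits(postings, i, j)
            if cost < best[j]:
                best[j], prev[j] = cost, i
    cuts = []
    j = n
    while j > 0:
        cuts.append(j)
        j = prev[j]
    return best[n], cuts[::-1]
```

Under this model, any fixed-size partition is one of the candidates the dynamic program considers, so the optimal cost can never exceed the fixed-size cost. On a clustered list such as two dense runs separated by a large gap, the dynamic program is free to place a cut at the gap, whereas a fixed block that straddles the gap pays for the outlier in every element it contains.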

Cite

Song, X., Jiang, K., & Yang, Y. (2017). A heuristically optimized partitioning strategy on Elias-Fano index. In Lecture Notes on Data Engineering and Communications Technologies (Vol. 1, pp. 93–104). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-319-49109-7_9
