Inverted indexes are the fundamental index structure for information retrieval systems. Because terms are correlated, the inverted lists in an index may overlap substantially and therefore contain redundancy. In this paper, we propose a new approach that reduces the size of inverted lists while retaining time efficiency. Our solution merges inverted lists that overlap heavily with each other and manages their content in the resulting condensed index. An efficient algorithm is designed to discover heavily overlapping inverted lists and construct the condensed index for a given dataset. We demonstrate that our algorithm delivers considerable space savings while incurring little query performance overhead. © 2012 Springer-Verlag.
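The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes Jaccard similarity as the overlap measure, a greedy pairwise merge, and a per-document bitmask that records which of the merged terms a document contains. The names `condense`, `postings`, and `boolean_and`, and the `threshold` parameter, are all hypothetical.

```python
from itertools import combinations


def build_inverted_index(docs):
    """Map each term to the list of ids of the documents that contain it."""
    index = {}
    for doc_id, text in enumerate(docs):
        for term in set(text.split()):
            index.setdefault(term, []).append(doc_id)
    return index


def jaccard(a, b):
    """Overlap measure assumed for this sketch: |a & b| / |a | b|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


def condense(index, threshold=0.5):
    """Greedily merge pairs of lists whose overlap is at least `threshold`.

    Returns (groups, term_to_group). Each group maps doc_id -> bitmask,
    where bit i is set if the document contains the group's i-th term.
    """
    pairs = sorted(
        ((jaccard(index[s], index[t]), s, t)
         for s, t in combinations(sorted(index), 2)),
        reverse=True,
    )
    groups, term_to_group = [], {}
    for score, s, t in pairs:
        if score < threshold:
            break  # remaining pairs overlap even less
        if s in term_to_group or t in term_to_group:
            continue  # each term joins at most one merged list
        merged = {}
        for bit, term in enumerate((s, t)):
            for doc_id in index[term]:
                merged[doc_id] = merged.get(doc_id, 0) | (1 << bit)
            term_to_group[term] = (len(groups), bit)
        groups.append(merged)
    # Terms that were not merged keep a list of their own.
    for term in index:
        if term not in term_to_group:
            term_to_group[term] = (len(groups), 0)
            groups.append({doc_id: 1 for doc_id in index[term]})
    return groups, term_to_group


def postings(term, groups, term_to_group):
    """Recover the original posting set of `term` from the condensed index."""
    gid, bit = term_to_group[term]
    return {d for d, mask in groups[gid].items() if mask & (1 << bit)}


def boolean_and(terms, groups, term_to_group):
    """Answer a conjunctive Boolean query over the condensed index."""
    result = None
    for term in terms:
        p = postings(term, groups, term_to_group)
        result = p if result is None else result & p
    return sorted(result) if result is not None else []
```

In this sketch a document id shared by two merged terms is stored once instead of twice, at the cost of a small bitmask per entry and a mask test at query time. Whether that trade-off pays off in practice depends on the posting encoding and the overlap threshold, neither of which is given in the abstract.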
CITATION STYLE
Qin, J., Xiao, C., Wang, W., & Lin, X. (2012). A space-efficient indexing algorithm for Boolean query processing. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7651 LNCS, pp. 638–644). https://doi.org/10.1007/978-3-642-35063-4_47