We describe a data structure that uses O(n)-word space and reports the k most relevant documents that contain a query pattern P in optimal O(|P| + k) time. Our construction supports an ample set of important relevance measures, such as the frequency of P in a document and the minimal distance between two occurrences of P in a document. We show how to reduce the space of the data structure from O(n log n) to O(n(log σ + log D + log log n)) bits, where σ is the alphabet size and D is the total number of documents. Copyright © SIAM.
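To make the query semantics concrete, the following is a brute-force sketch of top-k retrieval with term frequency as the relevance measure. It is not the data structure described above: it scans every document on each query, so its running time depends on the total collection size rather than on |P| + k. The function names `count_occurrences` and `top_k_by_frequency` and the tie-breaking rule (smaller document index first) are assumptions made for illustration only.

```python
import heapq


def count_occurrences(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, including overlapping ones."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        # Advance by one position so overlapping matches are counted.
        start = text.find(pattern, start + 1)
    return count


def top_k_by_frequency(documents, pattern, k):
    """Return up to k (frequency, doc_index) pairs, most frequent first.

    Documents that do not contain the pattern are never reported.
    Ties are broken in favor of the smaller document index
    (an illustrative choice, not taken from the paper).
    """
    scored = []
    for doc_index, text in enumerate(documents):
        freq = count_occurrences(text, pattern)
        if freq > 0:
            scored.append((freq, doc_index))
    # Rank by higher frequency, then by smaller document index.
    return heapq.nlargest(k, scored, key=lambda pair: (pair[0], -pair[1]))


if __name__ == "__main__":
    docs = ["abracadabra", "banana", "cabana", "xyz"]
    print(top_k_by_frequency(docs, "a", 2))
```

Swapping the scoring function for another measure, such as the minimal distance between two occurrences of the pattern, changes only how `scored` is built; the reporting step stays the same.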
Navarro, G., & Nekrich, Y. (2012). Top-k document retrieval in optimal time and linear space. In Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms (pp. 1066–1077). Association for Computing Machinery. https://doi.org/10.1137/1.9781611973099.84