Document retrieval with one wildcard

Moshe Lewenstein; J. Ian Munro; Yakov Nekrich; Sharma V. Thankachan

Journal Article, Open Access

Theoretical Computer Science, 635 (2016), 94–101

DOI: 10.1016/j.tcs.2016.05.024

Abstract

In this paper we extend several well-known document listing problems to the case when documents contain a substring that approximately matches the query pattern. We study the scenario when the query string can contain a wildcard symbol that matches any alphabet symbol; all documents that match a query pattern with one wildcard must be enumerated. We describe a linear space data structure that reports all documents containing a substring P in O(|P| + σ log log log n + docc) time, where σ is the alphabet size and docc is the number of listed documents. We also describe a succinct solution for this problem, as well as a solution for an extension of this problem. Furthermore, our approach enables us to obtain an O(nσ)-space data structure that enumerates all documents containing both a pattern P1 and a pattern P2 in the special case when P1 and P2 differ in one symbol.
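
To make the query semantics concrete, here is a brute-force sketch of the problem the abstract describes: given a collection of documents and a pattern with at most one wildcard, list every document that contains a matching substring. This is not the paper's data structure and comes nowhere near its time bound; it is only a reference definition of what a correct answer looks like. The wildcard symbol `?`, the function names, and the sample documents are all assumptions made for illustration.

```python
from typing import List

WILDCARD = "?"  # assumed wildcard symbol; the abstract does not name one


def matches_at(text: str, pattern: str, start: int) -> bool:
    """Check whether pattern matches text at offset start, treating
    WILDCARD as matching any single symbol."""
    return all(
        p == WILDCARD or p == text[start + k]
        for k, p in enumerate(pattern)
    )


def list_documents(documents: List[str], pattern: str) -> List[int]:
    """Return the indices of all documents that contain a substring
    matching pattern. Brute force: every offset of every document."""
    if pattern.count(WILDCARD) > 1:
        raise ValueError("pattern may contain at most one wildcard")
    m = len(pattern)
    result = []
    for doc_id, text in enumerate(documents):
        # Each document is reported once, however many matches it has.
        if any(matches_at(text, pattern, i) for i in range(len(text) - m + 1)):
            result.append(doc_id)
    return result


docs = ["banana", "bandana", "cabana", "band"]
print(list_documents(docs, "b?na"))  # prints [0, 2]
```

The brute-force scan takes time proportional to the total length of the collection times |P| per query, whereas the bound stated in the abstract depends only on |P|, σ, n, and the number of reported documents.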

Cite

Lewenstein, M., Munro, J. I., Nekrich, Y., & Thankachan, S. V. (2016). Document retrieval with one wildcard. Theoretical Computer Science, 635, 94–101. https://doi.org/10.1016/j.tcs.2016.05.024
