COCA filters: Co-occurrence aware Bloom filters

Abstract

We propose an indexing data structure based on a novel variation of Bloom filters. Signature files have been proposed in the past as a method for indexing large text databases, but they suffer from high false positive error. In this paper we introduce COCA filters, a new type of Bloom filter that exploits the co-occurrence probability of words in documents to reduce the false positive error. We show experimentally that this technique can reduce the false positive error by a factor of up to 21.6 for the same index size. Furthermore, COCA filters can replace Bloom filters wherever the co-occurrence of any two members of the universe is identifiable. © 2011 Springer-Verlag.
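For orientation, the baseline that the abstract improves on can be sketched as a plain Bloom filter used as a per-document signature. This is a minimal illustration of a standard Bloom filter only, not the COCA construction, whose details the abstract does not give. The class name, the salted SHA-256 hashing scheme, and the parameters `num_bits` and `num_hashes` are assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Standard Bloom filter with m bits and k hash functions (not the COCA variant)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, word: str):
        # Assumption: derive k bit positions by salting SHA-256 with the index i.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{word}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, word: str) -> None:
        # Set every bit position that the word hashes to.
        for pos in self._positions(word):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, word: str) -> bool:
        # True if all k bits are set: never a false negative,
        # but possibly a false positive when other words set the same bits.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(word)
        )


def build_signature(words, num_bits: int = 64, num_hashes: int = 3) -> BloomFilter:
    """Build one signature (a Bloom filter) over the words of a document."""
    signature = BloomFilter(num_bits, num_hashes)
    for word in words:
        signature.add(word)
    return signature


# Hypothetical document, used only to exercise the sketch.
doc = ["bloom", "filter", "index", "text", "database"]
signature = build_signature(doc)
```

Every word of `doc` tests as present in `signature`. A word that was never added may also test as present if its bit positions happen to be set already, which is the false positive error the abstract refers to.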

Tirdad, K., Ghodsnia, P., Munro, J. I., & López-Ortiz, A. (2011). COCA filters: Co-occurrence aware Bloom filters. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7024 LNCS, pp. 313–325). https://doi.org/10.1007/978-3-642-24583-1_31
