Non-empty bins with simple tabulation hashing

Anders Aamand; Mikkel Thorup

Conference Proceedings, Open Access

Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms (2019) 2498-2512

DOI: 10.1137/1.9781611975482.153

Abstract

We consider the hashing of a set X ⊆ U with |X| = m using a simple tabulation hash function h : U → [n] = {0, …, n − 1} and analyse the number of non-empty bins, that is, the size of h(X). We show that the expected size of h(X) matches that with fully random hashing to within low-order terms. We also provide concentration bounds. The number of non-empty bins is a fundamental measure in the balls and bins paradigm, and it is critical in applications such as Bloom filters and Filter hashing. For example, normally Bloom filters are proportioned for a desired low false-positive probability assuming fully random hashing. Our results imply that if we implement the hashing with simple tabulation, we obtain the same low false-positive probability for any possible input.
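The quantity studied in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper. It assumes keys are split into c = 4 characters of 8 bits each, that n is a power of two so each table entry is a uniform log2(n)-bit value, and that tables are filled from a seeded pseudo-random generator. The function names (`make_simple_tabulation`, `nonempty_bins`, `expected_nonempty_fully_random`) are hypothetical. The comparison value n(1 − (1 − 1/n)^m) is the standard expected number of non-empty bins when m keys are hashed fully at random into n bins.

```python
import random


def make_simple_tabulation(n, c=4, seed=0):
    """Build a simple tabulation hash h: [0, 256**c) -> [0, n).

    Assumptions (illustrative, not from the paper): 8-bit characters,
    n a power of two with n >= 2, tables drawn from a seeded PRNG.
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    bits = n.bit_length() - 1  # log2(n)
    rng = random.Random(seed)
    # One lookup table per character position, 256 random entries each.
    tables = [[rng.getrandbits(bits) for _ in range(256)] for _ in range(c)]

    def h(x):
        acc = 0
        for i in range(c):
            char = (x >> (8 * i)) & 0xFF  # i-th 8-bit character of the key
            acc ^= tables[i][char]        # XOR the per-character lookups
        return acc

    return h


def nonempty_bins(keys, h):
    """Number of non-empty bins, i.e. the size of h(X)."""
    return len({h(x) for x in keys})


def expected_nonempty_fully_random(m, n):
    """Expected number of non-empty bins for m keys under fully random hashing."""
    return n * (1.0 - (1.0 - 1.0 / n) ** m)
```

Calling `nonempty_bins(range(1000), make_simple_tabulation(1024))` and comparing the result with `expected_nonempty_fully_random(1000, 1024)` gives a rough empirical check of the claim that the two agree up to low-order terms. A single run with one seed is only an illustration, not evidence for the concentration bounds.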

Cite

Aamand, A., & Thorup, M. (2019). Non-empty bins with simple tabulation hashing. In Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms (pp. 2498–2512). Association for Computing Machinery. https://doi.org/10.1137/1.9781611975482.153
