Loglog counting of large cardinalities (extended abstract)

Abstract

Using an auxiliary memory smaller than the size of this abstract, the LOGLOG algorithm makes it possible to estimate, in a single pass and to within a few percent, the number of different words in the whole of Shakespeare's works. In general, the LOGLOG algorithm uses m "small bytes" of auxiliary memory to estimate in a single pass the number of distinct elements (the "cardinality") in a file, with an accuracy of the order of 1/√m. The "small bytes" needed to count cardinalities up to Nmax comprise about log log Nmax bits each, so that cardinalities well into the billions can be determined using only one or two kilobytes of memory. The basic version of the LOGLOG algorithm is validated by a complete analysis. An optimized version, super-LOGLOG, is also engineered and tested on real-life data. The algorithm parallelizes optimally. © Springer-Verlag 2003.

Citation (APA)

Durand, M., & Flajolet, P. (2003). Loglog counting of large cardinalities (extended abstract). Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2832, 605–617. https://doi.org/10.1007/978-3-540-39658-1_55
