Finding frequent items in data streams

Abstract

We present a 1-pass algorithm for estimating the most frequent items in a data stream using very limited storage space. Our method relies on a novel data structure called a count sketch, which allows us to estimate the frequencies of all the items in the stream. Our algorithm achieves better space bounds than the previous best known algorithms for this problem for many natural distributions on the item frequencies. In addition, our algorithm leads directly to a 2-pass algorithm for the problem of estimating the items with the largest (absolute) change in frequency between two data streams. To our knowledge, this problem has not been previously studied in the literature. © 2002 Springer-Verlag Berlin Heidelberg.
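To make the idea of a count sketch concrete, here is a minimal Python sketch of one common way such a structure is built: a small table of signed counters, updated through per-row hash functions and queried by taking a median across rows. The abstract does not give these details, so everything here is an assumption for illustration: the class name `CountSketch`, the parameters `depth`, `width` and `seed`, and the use of salted BLAKE2b as a stand-in for the hash families a formal analysis would require. The `difference` method illustrates one plausible route to the change-detection problem mentioned above, relying on the fact that this table is a linear function of the item counts.

```python
import hashlib
import statistics


class CountSketch:
    """Illustrative count sketch: `depth` rows of `width` signed counters."""

    def __init__(self, depth=5, width=256, seed=0):
        self.depth = depth
        self.width = width
        self.seed = seed
        self.table = [[0] * width for _ in range(depth)]

    def _bucket_and_sign(self, row, item):
        # Salted BLAKE2b stands in for the per-row hash functions
        # (an assumption made for this illustration).
        salt = self.seed.to_bytes(8, "little") + row.to_bytes(8, "little")
        digest = hashlib.blake2b(
            repr(item).encode("utf-8"), digest_size=8, salt=salt
        ).digest()
        h = int.from_bytes(digest, "little")
        sign = 1 if h & 1 else -1          # lowest bit picks +1 or -1
        bucket = (h >> 1) % self.width     # remaining bits pick the counter
        return bucket, sign

    def update(self, item, count=1):
        """Process `count` occurrences of `item` from the stream."""
        for row in range(self.depth):
            bucket, sign = self._bucket_and_sign(row, item)
            self.table[row][bucket] += sign * count

    def estimate(self, item):
        """Estimate the frequency of `item` as the median of the row estimates."""
        values = []
        for row in range(self.depth):
            bucket, sign = self._bucket_and_sign(row, item)
            values.append(sign * self.table[row][bucket])
        return statistics.median(values)

    def difference(self, other):
        """Return a sketch of (self - other); both must share all parameters."""
        if (self.depth, self.width, self.seed) != (
            other.depth, other.width, other.seed
        ):
            raise ValueError("sketches must use identical parameters")
        result = CountSketch(self.depth, self.width, self.seed)
        for r in range(self.depth):
            result.table[r] = [
                a - b for a, b in zip(self.table[r], other.table[r])
            ]
        return result


if __name__ == "__main__":
    # Hypothetical streams, invented for this example.
    before, after = CountSketch(), CountSketch()
    for word in ["a"] * 1000 + ["b"] * 500:
        before.update(word)
    for word in ["a"] * 400 + ["b"] * 520:
        after.update(word)
    change = after.difference(before)
    print(change.estimate("a"), change.estimate("b"))
```

The memory used is `depth * width` counters regardless of how many distinct items appear. Colliding items can push an individual row's estimate up or down, and taking the median across rows limits the influence of a few badly affected rows. Finding the most frequent items would additionally require tracking candidate items alongside the sketch, which this illustration leaves out.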

Cite

Charikar, M., Chen, K., & Farach-Colton, M. (2002). Finding frequent items in data streams. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2380 LNCS, pp. 693–703). Springer Verlag. https://doi.org/10.1007/3-540-45465-9_59
