We describe a simple algorithm for approximating the empirical entropy of a stream of m values up to a multiplicative factor of (1 + ε) using a single pass, O(ε⁻² log(δ⁻¹) log m) words of space, and O(log ε⁻¹ + log log δ⁻¹ + log log m) processing time per item in the stream. Our algorithm is based upon a novel extension of a method introduced by Alon et al. [1999]. This improves over previous work on this problem. We show a space lower bound of Ω(ε⁻²/log²(ε⁻¹)), demonstrating that our algorithm is near-optimal in terms of its dependency on ε. We show that generalizing to multiplicative approximation of the κth-order entropy requires close to linear space for κ ≥ 1. In contrast, we show that additive approximation is possible in a single pass using only poly-logarithmic space. Lastly, we show how to compute a multiplicative approximation to the entropy of a random walk on an undirected graph. © 2010 ACM.
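For orientation, the basic sampling estimator in the style of Alon et al. can be sketched as follows. This is a minimal illustration only, not the extended algorithm of the paper, whose details the abstract does not give. It assumes the empirical entropy is H = Σᵢ (mᵢ/m) log₂(m/mᵢ), where mᵢ is the number of occurrences of value i, and all function names and parameters (`ams_entropy_sample`, `estimate_entropy`, `samples`, `seed`) are hypothetical. Plain averaging of independent samples is used here in place of the space-efficient machinery that yields the stated bounds.

```python
import math
import random
from collections import Counter


def empirical_entropy(stream):
    """Exact empirical entropy in bits, for reference (stores all counts)."""
    m = len(stream)
    counts = Counter(stream)
    return sum((c / m) * math.log2(m / c) for c in counts.values())


def ams_entropy_sample(stream, rng):
    """One unbiased sample of H, computed in a single pass.

    Reservoir sampling picks a uniformly random position; r counts how
    often the sampled value occurs from that position to the end.
    The sample is g(r) - g(r - 1) with g(x) = x * log2(m / x), g(0) = 0,
    which telescopes to H in expectation.
    """
    sample, r, m = None, 0, 0
    for x in stream:
        m += 1
        if rng.random() < 1.0 / m:   # always true for the first item
            sample, r = x, 1         # restart the count at the new position
        elif x == sample:
            r += 1
    if m == 0:
        return 0.0

    def g(x):
        return 0.0 if x == 0 else x * math.log2(m / x)

    return g(r) - g(r - 1)


def estimate_entropy(stream, samples=2000, seed=0):
    """Average independent samples (hypothetical helper, one pass each)."""
    rng = random.Random(seed)
    total = sum(ams_entropy_sample(stream, rng) for _ in range(samples))
    return total / samples
```

For example, `estimate_entropy(["a"] * 50 + ["b"] * 30 + ["c"] * 20)` should land close to the value returned by `empirical_entropy` on the same list. A practical implementation would run all samplers in parallel over one pass rather than re-reading the stream for each sample.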
CITATION STYLE
Chakrabarti, A., Cormode, G., & McGregor, A. (2010). A near-optimal algorithm for estimating the entropy of a stream. ACM Transactions on Algorithms, 6(3). https://doi.org/10.1145/1798596.1798604