Finding a duplicate and a missing item in a stream

Abstract

We consider the following problem in a stream model: given a sequence a = (a_1, a_2, ..., a_m) with each a_i ∈ [n] = {1, ..., n} and m > n, find a duplicate in the sequence, i.e., find some d = a_i = a_l with i ≠ l, using s bits of memory and r passes over the input sequence. In one pass, an algorithm reads the input sequence a in the order a_1, a_2, ..., a_m. Since m > n, a duplicate exists by the pigeonhole principle. Muthukrishnan [Mu05a], [Mu05b] posed the following question for the case m = n + 1: for s = O(log n), is there a solution with a constant number of passes? The problem described above generalizes Muthukrishnan's question by taking the sequence length m as a parameter. We give a negative answer to the original question by showing the following: assume that m = n + 1; then a streaming algorithm with O(log n) space requires Ω(log n / log log n) passes, and a k-pass streaming algorithm requires Ω(n^{1/(2k-1)}) space. We also consider the following problem of finding a missing item: assuming that n < m, find x ∈ [m] such that x ≠ a_j for 1 ≤ j ≤ n. The same lower bound applies to the missing-item finding problem. The proof is a simple reduction to the communication complexity of a relation. We also consider one-pass algorithms and exactly determine the minimum space required. Interesting open questions such as the following remain. For the number of passes of algorithms using O(log n) space, show an ω(1) lower bound (or an O(1) upper bound) for: (1) duplicate finding for m = 2n, (2) missing-item finding for m = 2n, and (3) the case where we allow Las Vegas-type randomization for m = n + 1. © Springer-Verlag Berlin Heidelberg 2007.
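To make the two problems concrete, here is a minimal Python sketch of the folklore multi-pass approach: binary search over the value range, counting matching items in each pass. It is not taken from the paper and is not claimed to be the author's construction. It uses O(log n) passes and a constant number of counters, so it shows only that logarithmically many passes suffice at O(log n) space, which is the natural counterpart to the lower bounds stated above. The names find_duplicate, find_missing, and make_stream are hypothetical; the stream is modeled, as an assumption, by a zero-argument function that returns a fresh iterator for each pass.

```python
from typing import Callable, Iterator

# Hypothetical stream model: calling the function starts one new pass.
Stream = Callable[[], Iterator[int]]


def find_duplicate(make_stream: Stream, n: int) -> int:
    """Return a value occurring at least twice; items lie in [1, n], length m > n.

    Invariant: the range [lo, hi] holds more items than it has distinct
    values, so by the pigeonhole principle it contains a duplicate.
    """
    lo, hi = 1, n
    while lo < hi:
        mid = (lo + hi) // 2
        # One pass: count the items whose value falls in the left half.
        left_count = sum(1 for x in make_stream() if lo <= x <= mid)
        if left_count > mid - lo + 1:
            hi = mid       # left half is overfull
        else:
            lo = mid + 1   # otherwise the right half must be overfull
    return lo


def find_missing(make_stream: Stream, m: int) -> int:
    """Return a value in [1, m] that never occurs; the stream length is n < m.

    Invariant: the range [lo, hi] holds fewer items than it has distinct
    values, so at least one value in it is absent.
    """
    lo, hi = 1, m
    while lo < hi:
        mid = (lo + hi) // 2
        # One pass: count the items whose value falls in the left half.
        left_count = sum(1 for x in make_stream() if lo <= x <= mid)
        if left_count < mid - lo + 1:
            hi = mid       # left half is underfull
        else:
            lo = mid + 1   # otherwise the right half must be underfull
    return lo


if __name__ == "__main__":
    dup_input = [3, 1, 4, 2, 3]   # m = 5 items over [4]
    print(find_duplicate(lambda: iter(dup_input), 4))   # 3

    miss_input = [2, 4, 1]        # n = 3 items over [4]
    print(find_missing(lambda: iter(miss_input), 4))    # 3
```

Each loop iteration halves the candidate range, so both functions make at most ⌈log2 n⌉ (respectively ⌈log2 m⌉) passes while storing only the range endpoints and one counter.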

Cite

Tarui, J. (2007). Finding a duplicate and a missing item in a stream. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4484 LNCS, pp. 128–135). Springer Verlag. https://doi.org/10.1007/978-3-540-72504-6_11
