Filtering duplicate items over distributed data streams

Abstract

In recent years, many real-time applications have needed to process data streams. We consider distributed environments in which remote data sources continuously collect data from the real world or from other data sources and push it to a central stream processor. In such environments, transmitting rapid, high-volume, time-varying data streams induces significant communication cost and also imposes computing overhead on the central processor. In this paper, we develop a novel filtering approach, called DTFilter, for evaluating windowed distinct queries in such a distributed system. DTFilter is based on a search algorithm over a data structure consisting of two height-balanced trees. It avoids transmitting duplicate items in the data streams, which saves substantial network resources. We also provide a theoretical analysis of the time spent performing the search and of the amount of memory needed. Extensive experiments show that DTFilter achieves high performance. © Springer-Verlag Berlin Heidelberg 2005.
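
The abstract does not spell out the DTFilter algorithm, so the following is only an illustrative sketch of the general idea of source-side duplicate suppression under a sliding window, not the authors' method. It assumes a time-based window, non-decreasing timestamps, and a simplified rule: an item is suppressed if an equal item was already transmitted within the window. The class name `WindowedDuplicateFilter` and the method `should_send` are hypothetical, and an `OrderedDict` stands in for the paper's two height-balanced trees.

```python
from collections import OrderedDict


class WindowedDuplicateFilter:
    """Source-side filter that suppresses items already sent within the window.

    Hypothetical sketch: an OrderedDict replaces the two height-balanced
    trees mentioned in the abstract. It gives lookup by value and, because
    entries are only ever appended, iteration in timestamp order.
    """

    def __init__(self, window):
        self.window = window
        # value -> timestamp at which the value was last transmitted
        self._last_sent = OrderedDict()

    def _expire(self, now):
        # Drop entries whose transmission time has fallen out of the window.
        # Assumes timestamps arrive in non-decreasing order.
        while self._last_sent:
            _, ts = next(iter(self._last_sent.items()))
            if now - ts < self.window:
                break
            self._last_sent.popitem(last=False)

    def should_send(self, value, timestamp):
        """Return True if the item must be pushed to the central processor."""
        self._expire(timestamp)
        if value in self._last_sent:
            return False  # duplicate within the window: do not transmit
        self._last_sent[value] = timestamp
        return True


if __name__ == "__main__":
    f = WindowedDuplicateFilter(window=5)
    stream = [("a", 0), ("b", 1), ("a", 2), ("a", 5), ("b", 5)]
    for value, ts in stream:
        print(value, ts, f.should_send(value, ts))
```

In this sketch the hash-based dictionary gives expected constant-time lookups. A balanced-tree implementation, as the abstract describes, would instead bound each search by the logarithm of the number of stored items.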

Cite

Xia, T., Jin, C., Zhou, X., & Zhou, A. (2005). Filtering duplicate items over distributed data streams. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3739 LNCS, pp. 779–784). Springer Verlag. https://doi.org/10.1007/11563952_80