Filtering duplicate items over distributed data streams

Tian Xia; Cheqing Jin; Xiaofang Zhou; Aoying Zhou

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2005) 3739 LNCS 779-784

DOI: 10.1007/11563952_80


Abstract

In recent years, many real-time applications have needed to handle data streams. We consider distributed environments in which remote data sources keep collecting data from the real world or from other data sources and continuously push it to a central stream processor. In such environments, transmitting rapid, high-volume, time-varying data streams induces significant communication cost, and the central processor incurs computing overhead as well. In this paper, we develop a novel filtering approach, called DTFilter, for evaluating windowed distinct queries in such a distributed system. DTFilter is based on a search algorithm over a data structure of two height-balanced trees; it avoids transmitting duplicate items in the data streams, which saves network resources. In addition, we provide a theoretical analysis of the time spent performing the search and of the amount of memory needed. Extensive experiments also show that DTFilter achieves high performance. © Springer-Verlag Berlin Heidelberg 2005.
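To make the idea of source-side duplicate filtering concrete, here is a minimal Python sketch. It is not the DTFilter algorithm: the abstract says DTFilter uses two height-balanced trees, whereas this sketch swaps in a hash map plus a queue for brevity. The class name `WindowedDuplicateFilter`, the time-based window, and the `offer` method are all hypothetical choices made for illustration. The sketch also covers only the suppression decision at the data source; keeping the central processor's view of the window correct after duplicates have been suppressed is not addressed.

```python
from collections import deque


class WindowedDuplicateFilter:
    """Hypothetical source-side filter: transmit an item only if it is
    not already present in the current sliding window."""

    def __init__(self, window_size):
        self.window_size = window_size  # window length in time units (assumed time-based)
        self.last_seen = {}             # item -> timestamp of its most recent arrival
        self.arrivals = deque()         # (timestamp, item) pairs in arrival order

    def _expire(self, now):
        # Drop arrivals that fell out of the window (now - window_size, now].
        while self.arrivals and self.arrivals[0][0] <= now - self.window_size:
            ts, item = self.arrivals.popleft()
            # Forget the item only if this was its most recent arrival.
            if self.last_seen.get(item) == ts:
                del self.last_seen[item]

    def offer(self, item, now):
        """Return True if the item should be sent to the central processor."""
        self._expire(now)
        is_new = item not in self.last_seen
        self.last_seen[item] = now
        self.arrivals.append((now, item))
        return is_new


if __name__ == "__main__":
    f = WindowedDuplicateFilter(window_size=10)
    for t, x in [(0, "a"), (5, "a"), (6, "b"), (12, "a"), (30, "a")]:
        print(t, x, "send" if f.offer(x, t) else "suppress")
```

A balanced search tree, as in the paper, would replace the hash map when ordered traversal or worst-case logarithmic bounds matter; the hash map here is only the shortest way to show the suppress-or-send decision.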

Cite

APA

Xia, T., Jin, C., Zhou, X., & Zhou, A. (2005). Filtering duplicate items over distributed data streams. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3739 LNCS, pp. 779–784). Springer Verlag. https://doi.org/10.1007/11563952_80
