DCS: A Policy Framework for the Detection of Correlated Data Streams

Abstract

There is an increasing demand for real-time analysis of large volumes of data streams that are produced at high velocity. The most recent data must be processed within a specified delay target for the analysis to lead to actionable results. To this end, in this paper, we present an effective solution for detecting the correlation of such data streams within a micro-batch of a fixed time interval. Our solution, named DCS, for Detection of Correlated Data Streams, combines (1) incremental sliding-window computation of aggregates, to avoid unnecessary re-computations, (2) intelligent scheduling of computation steps and operations, driven by a utility function within a micro-batch, and (3) an exploration policy that tunes the utility function. Specifically, we propose nine policies that explore correlated pairs of live data streams across consecutive micro-batches. Our experimental evaluation on a real-world dataset shows that some policies are better suited to identifying high numbers of correlated pairs of live data streams that are already known from previous micro-batches, while others are better suited to identifying previously unseen pairs of live data streams across consecutive micro-batches.
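
The abstract does not spell out how the aggregates are maintained, so the following is only a minimal sketch of one common way to compute a correlation incrementally over a sliding window. It assumes the Pearson correlation coefficient as the measure and keeps five running sums that are updated in constant time per sample. The class name `SlidingPearson` and its methods are hypothetical and are not taken from the paper.

```python
from collections import deque
from math import sqrt


class SlidingPearson:
    """Pearson correlation of two streams over the last `window` sample pairs.

    Illustrative assumption: the measure and the running-sum scheme are
    chosen for this sketch, not taken from the DCS paper.
    """

    def __init__(self, window):
        if window < 2:
            raise ValueError("window must be at least 2")
        self.window = window
        self.samples = deque()  # sample pairs currently inside the window
        # Running aggregates: sums of x, y, x*x, y*y and x*y.
        self.sx = self.sy = self.sxx = self.syy = self.sxy = 0.0

    def push(self, x, y):
        """Add one sample pair and evict the oldest if the window is full."""
        self.samples.append((x, y))
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.syy += y * y
        self.sxy += x * y
        if len(self.samples) > self.window:
            ox, oy = self.samples.popleft()
            # Subtract the evicted pair instead of recomputing the sums.
            self.sx -= ox
            self.sy -= oy
            self.sxx -= ox * ox
            self.syy -= oy * oy
            self.sxy -= ox * oy

    def correlation(self):
        """Return the correlation over the window, or None if undefined."""
        n = len(self.samples)
        if n < 2:
            return None
        cov = n * self.sxy - self.sx * self.sy
        var_x = n * self.sxx - self.sx * self.sx
        var_y = n * self.syy - self.sy * self.sy
        if var_x <= 0 or var_y <= 0:
            return None  # a constant stream has no defined correlation
        return cov / sqrt(var_x * var_y)
```

Each call to `push` costs constant time regardless of the window size, which is the point of the incremental approach: the window slides without rescanning the samples it still contains. Running sums of this kind can accumulate floating-point error over very long streams, so a production version would likely need periodic recomputation or a numerically stabler update.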

Cite

Alseghayer, R., Petrov, D., Chrysanthis, P. K., Sharaf, M., & Labrinidis, A. (2019). DCS: A Policy Framework for the Detection of Correlated Data Streams. In Lecture Notes in Business Information Processing (Vol. 337, pp. 191–210). Springer. https://doi.org/10.1007/978-3-030-24124-7_12
