Pre-processing and data validation in IoT data streams

Philsy Baban

Conference Proceedings (Open Access)

DEBS 2020 - Proceedings of the 14th ACM International Conference on Distributed and Event-Based Systems (2020) 226-229

DOI: 10.1145/3401025.3406443

Abstract

In the last few years, distributed stream processing engines have been on the rise because they enable real-time data processing with guaranteed low latency in application domains such as financial markets, surveillance systems, manufacturing, and smart cities. Stream processing engines are run-time libraries that let developers process data streams without dealing with the lower-level streaming mechanics. Apache Storm, Apache Flink, Apache Spark, Kafka Streams, and Hazelcast Jet are some of the popular stream processing engines. Nowadays, critical systems such as energy systems are interconnected and automated. As a result, these systems are vulnerable to cyber-attacks. In real-world applications, the values reported by sensor devices can contain missing values, redundant data, outliers, manipulated data, data failures, and similar defects. Therefore, our system must be resilient to these conditions. In this paper, we present an approach that checks for the above-mentioned conditions by pre-processing data streams with a stream processing engine such as Apache Flink; the approach is to be provided as a library in the future. The pre-processed streams are then forwarded to other stream processing engines such as Apache Kafka for the actual stream processing. As a result, data validation, data consistency, and integrity for a resilient system can be accomplished before the actual stream processing begins.
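The kind of pre-processing the abstract describes can be illustrated with a minimal sketch. This is not the paper's implementation and does not use the Apache Flink API; it is plain Python under stated assumptions. The `Reading` record, the `validate` function, the duplicate rule (same sensor and timestamp), and the rolling z-score outlier rule with its `window`, `min_history`, and `z_max` parameters are all hypothetical choices made for illustration.

```python
import math
from collections import deque
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Deque, Dict, Iterable, Iterator, Optional, Set, Tuple


@dataclass(frozen=True)
class Reading:
    """Hypothetical sensor record; the field names are assumptions."""
    sensor_id: str
    timestamp: int            # e.g. seconds since epoch
    value: Optional[float]    # None models a missing value


def validate(
    readings: Iterable[Reading],
    window: int = 20,         # assumed size of the per-sensor history
    min_history: int = 5,     # assumed minimum history before outlier checks
    z_max: float = 3.0,       # assumed z-score threshold
) -> Iterator[Tuple[Reading, str]]:
    """Tag each reading as 'ok', 'missing', 'duplicate', or 'outlier'."""
    # Note: this set grows without bound; a real streaming job would
    # expire old keys (for example, per time window).
    seen: Set[Tuple[str, int]] = set()
    history: Dict[str, Deque[float]] = {}

    for r in readings:
        # Missing value: absent or NaN.
        if r.value is None or math.isnan(r.value):
            yield r, "missing"
            continue

        # Redundant data: same sensor and timestamp seen before.
        key = (r.sensor_id, r.timestamp)
        if key in seen:
            yield r, "duplicate"
            continue
        seen.add(key)

        # Outlier: far from the mean of recently accepted values.
        recent = history.setdefault(r.sensor_id, deque(maxlen=window))
        if len(recent) >= min_history:
            mu, sigma = mean(recent), pstdev(recent)
            if sigma > 0 and abs(r.value - mu) / sigma > z_max:
                yield r, "outlier"
                continue

        # Only accepted values feed the history used for later checks.
        recent.append(r.value)
        yield r, "ok"
```

In a pipeline like the one the abstract outlines, only the readings tagged `ok` would be forwarded to the downstream system, while the others could be logged or routed to a separate stream. Detecting manipulated data would need additional checks that this sketch does not attempt.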

Cite

APA

Baban, P. (2020). Pre-processing and data validation in IoT data streams. In DEBS 2020 - Proceedings of the 14th ACM International Conference on Distributed and Event-Based Systems (pp. 226–229). Association for Computing Machinery. https://doi.org/10.1145/3401025.3406443
