Generalizing streaming pipeline design for big data

Abstract

Streaming data refers to data sent to a cloud or a processing centre in real time. Although applications that process data streams live and generate useful insights already exist, the field is still in its infancy when it comes to complex stream processing. Current streaming data analytics tools represent the third-/fourth-generation data processing capability in the big data hierarchy, which includes the Hadoop ecosystem, Apache Storm™, Apache Kafka™ and their kind, the Apache Spark™ framework, and now Apache Flink™ with its non-batch, stateful streaming core. None of these can, on its own, handle every aspect of a data processing pipeline. Good synergy between these technologies is essential to meet the management, streaming, processing and fault-tolerance requirements of various data-driven applications. Companies tailor their pipelines exclusively to their own requirements, since building a general framework entails mammoth interfacing and configuration effort that is not cost-effective for them. In this paper, we envision and implement such a generalized minimal stream processing pipeline and measure its performance on several data sets in terms of the delays and latencies of data arrival at pivotal checkpoints in the pipeline. We also virtualize the pipeline in a Docker™ container with little loss in performance.
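
As a rough illustration of the checkpoint-based latency measurement described in the abstract, the sketch below timestamps records at a Kafka producer and computes the arrival delay at a consumer checkpoint. This is not code from the paper: the broker address, topic name and record schema are assumptions, and the kafka-python client stands in for whatever client the authors used.

```python
# Minimal sketch: measure data-arrival delay across a Kafka hop.
# Assumptions (not from the paper): local single broker, hypothetical
# topic "sensor-stream", JSON records carrying a production timestamp.
import json
import time

from kafka import KafkaConsumer, KafkaProducer  # pip install kafka-python

BROKER = "localhost:9092"   # assumed broker address
TOPIC = "sensor-stream"     # hypothetical topic name


def produce_records(n=100):
    """Attach a production timestamp to each record before sending."""
    producer = KafkaProducer(
        bootstrap_servers=BROKER,
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    for i in range(n):
        producer.send(TOPIC, {"id": i, "t_produced": time.time()})
    producer.flush()


def consume_and_measure():
    """Compute the delay between production and this consumer checkpoint."""
    consumer = KafkaConsumer(
        TOPIC,
        bootstrap_servers=BROKER,
        auto_offset_reset="earliest",
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
        consumer_timeout_ms=5000,
    )
    for msg in consumer:
        delay_ms = (time.time() - msg.value["t_produced"]) * 1000
        print(f"record {msg.value['id']}: {delay_ms:.1f} ms")
```

The same pattern extends to a multi-stage pipeline: each stage (e.g. a Spark or Flink job) appends its own timestamp to the record, so per-hop delays can be recovered at the final checkpoint.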

Citation

Rengarajan, K., & Menon, V. K. (2020). Generalizing streaming pipeline design for big data. In Advances in Intelligent Systems and Computing (Vol. 1085, pp. 149–160). Springer. https://doi.org/10.1007/978-981-15-1366-4_12
