SR3: Customizable recovery for stateful stream processing systems

Abstract

Modern stream processing applications need to store and update state as they process data, and must process live data streams from massive, geo-distributed data sets in a timely fashion. Because they run in a dynamic distributed environment and their workloads may change in unexpected ways, multiple stream operators can fail at the same time, causing severe state loss. However, state-of-the-art stream processing systems are designed mainly for low-latency intra-datacenter settings and do not scale well for stream applications with large distributed states: they suffer from a significant centralized bottleneck and high state recovery latency. They offer failure recovery mainly through three approaches: replication recovery, checkpointing recovery, and DStream-based lineage recovery. These approaches are either slow or resource-expensive, or they fail to handle multiple simultaneous failures. We present SR3, a customizable state recovery framework that provides fast and scalable state recovery mechanisms for protecting large distributed states in stream processing systems. SR3 offers three recovery mechanisms (star-structured, line-structured, and tree-structured recovery) to cater to the needs of different stream processing computation models, state sizes, and network settings. Our design adopts a decentralized architecture that partitions and replicates states by using consistent ring overlays that leverage distributed hash tables (DHTs). We show that this approach can significantly improve the scalability and flexibility of state recovery. We realize the SR3 design in a prototype integrated with the widely adopted Apache Storm framework. Large-scale experiments using real-world datasets demonstrate SR3's scalability, fast recovery, and flexibility.
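The abstract's idea of partitioning and replicating state over a consistent ring overlay can be illustrated with a small consistent-hashing sketch. This is not SR3's implementation, and the abstract does not describe its data structures. The class `HashRing`, the helper `ring_hash`, the replica count, and the node and key names are all hypothetical, chosen only to show how a state partition might be placed on several ring successors so that it survives multiple simultaneous node failures.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a point on the ring (a 64-bit integer)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring; an illustration, not SR3's design."""

    def __init__(self, nodes, replicas=2):
        unique_nodes = set(nodes)
        if not 1 <= replicas <= len(unique_nodes):
            raise ValueError("replicas must be between 1 and the node count")
        self.replicas = replicas
        # Each node occupies one point on the ring, sorted by hash value.
        self._points = sorted((ring_hash(n), n) for n in unique_nodes)

    def owners(self, key: str):
        """Return the `replicas` distinct nodes clockwise from the key."""
        # First ring point whose hash is >= the key's hash (wraps around).
        start = bisect.bisect_left(self._points, (ring_hash(key), ""))
        count = len(self._points)
        return [self._points[(start + i) % count][1]
                for i in range(self.replicas)]

    def survivors(self, key: str, failed):
        """Return the replica holders of `key` that are not in `failed`."""
        return [n for n in self.owners(key) if n not in failed]
```

With three replicas on a four-node ring, for example, `ring.survivors(key, failed)` still returns one holder after two of the key's owners fail at once, which is the kind of multi-failure case the abstract says existing approaches handle poorly. How SR3 itself selects replica holders and rebuilds state in its star, line, and tree structures is not specified here.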

Cite

Xu, H., Liu, P., Cruz-Diaz, S., da Silva, D., & Hu, L. (2020). SR3: Customizable recovery for stateful stream processing systems. In Middleware 2020 - Proceedings of the 2020 21st International Middleware Conference (pp. 251–264). Association for Computing Machinery, Inc. https://doi.org/10.1145/3423211.3425681
