Some distributed stream processing systems store their internal states (e.g., partial aggregation results) in non-volatile storage to guarantee fault tolerance, but such checkpointing degrades system performance. To address this problem, an existing method provides an approximate fault-tolerance guarantee by omitting some checkpoints based on user-specified thresholds. However, it is difficult for a user to set appropriate thresholds because it is unclear how the thresholds affect the final output. Hence, we propose a method to support approximate fault tolerance for sensor stream processing. Because our method uses the error bounds and the confidence threshold of recovery as its user-specified thresholds, a user can set them intuitively according to their service level agreement (SLA). Our method models the correlation between sensing data with a multivariate Gaussian distribution, and omits backup data that can be recovered from the partial backup data and the probabilistic model. In this paper, we focus on average, sum, max, and min queries and propose a greedy backup selection algorithm. We evaluate the validity and efficiency of our approach using synthetic data. Our experimental study shows that our approach both reduces backup data and achieves approximate recovery that satisfies the SLA.
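The recovery idea can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes only two sensors with a known bivariate Gaussian model, and it checks whether one sensor's reading can be recovered from the other's backed-up reading within an error bound at a given confidence. All function names and numeric values are hypothetical and chosen for illustration only.

```python
import math


def recover_from_backup(mu_i, mu_j, var_i, var_j, cov_ij, x_j):
    """Conditional mean and variance of sensor i given sensor j's backed-up value x_j.

    Uses the standard conditioning formulas for a bivariate Gaussian.
    """
    gain = cov_ij / var_j
    cond_mean = mu_i + gain * (x_j - mu_j)
    cond_var = var_i - gain * cov_ij
    return cond_mean, cond_var


def within_bound_probability(cond_var, error_bound):
    """P(|X - cond_mean| <= error_bound) for a Gaussian with variance cond_var."""
    if cond_var <= 0.0:
        return 1.0  # degenerate case: the value is determined exactly
    return math.erf(error_bound / math.sqrt(2.0 * cond_var))


def can_skip_backup(cond_var, error_bound, confidence):
    """True if the recovered estimate meets the error bound with the required confidence."""
    return within_bound_probability(cond_var, error_bound) >= confidence


# Hypothetical model: two strongly correlated temperature sensors.
estimate, variance = recover_from_backup(
    mu_i=20.0, mu_j=21.0, var_i=4.0, var_j=4.0, cov_ij=3.6, x_j=23.0
)
skip_loose = can_skip_backup(variance, error_bound=2.0, confidence=0.95)
skip_tight = can_skip_backup(variance, error_bound=1.0, confidence=0.95)
```

In this sketch, the looser error bound lets the backup of sensor i be skipped, while the tighter bound does not. Handling more than one backed-up sensor would require conditioning on a vector of observations with the full covariance matrix, which is omitted here.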
Takao, D., Sugiura, K., & Ishikawa, Y. (2020). Approximate Fault Tolerance for Sensor Stream Processing. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12008 LNCS, pp. 55–67). Springer. https://doi.org/10.1007/978-3-030-39469-1_5