Replay without recording of production bugs for service oriented applications

Abstract

Short time-to-localize and time-to-fix for production bugs is extremely important for any 24x7 service-oriented application (SOA). Debugging buggy behavior in deployed applications is hard, as it requires careful reproduction of a similar environment and workload. Prior approaches for automatically reproducing production failures do not scale to large SOA systems. Our key insight is that for many failures in SOA systems (e.g., many semantic and performance bugs), a failure can automatically be reproduced solely by relaying network packets to replicas of suspect services, an insight that we validated through a manual study of 16 real bugs across five different systems. This paper presents Parikshan, an application monitoring framework that leverages user-space virtualization and network proxy technologies to provide a sandbox “debug” environment. In this “debug” environment, developers are free to attach debuggers and analysis tools without impacting performance or correctness of the production environment. In comparison to existing monitoring solutions that can slow down production applications, Parikshan allows application monitoring at significantly lower overhead.
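The core idea in the abstract, mirroring live traffic to a replica so that debugging never touches the production path, can be illustrated with a small sketch. This is not Parikshan's implementation: the class `DuplicatingProxy`, its `handle` and `drain` methods, and the bounded buffer are hypothetical names and design choices made for illustration, and services are modelled as plain Python callables rather than real network endpoints.

```python
from collections import deque
from typing import Callable, Deque

# A "service" is modelled as a function from request bytes to response bytes.
Handler = Callable[[bytes], bytes]


class DuplicatingProxy:
    """Hypothetical sketch: serve clients from production and mirror
    each request into a buffer for later replay on a debug replica."""

    def __init__(self, production: Handler, debug: Handler,
                 buffer_size: int = 1024) -> None:
        self.production = production
        self.debug = debug
        self.buffer_size = buffer_size
        self.pending: Deque[bytes] = deque()
        self.dropped = 0  # requests not mirrored because the buffer was full

    def handle(self, request: bytes) -> bytes:
        # The client only ever sees the production response.
        response = self.production(request)
        # Mirror the request; drop it rather than stall production
        # when the debug replica has fallen too far behind.
        if len(self.pending) < self.buffer_size:
            self.pending.append(request)
        else:
            self.dropped += 1
        return response

    def drain(self) -> int:
        """Replay buffered requests on the debug replica.

        Responses are discarded and failures are swallowed, so a crash
        or slowdown in the replica cannot propagate to clients.
        """
        replayed = 0
        while self.pending:
            request = self.pending.popleft()
            try:
                self.debug(request)
            except Exception:
                pass  # the replica may fail freely (e.g. under a debugger)
            replayed += 1
        return replayed
```

In this sketch `handle` sits on the request path while `drain` would run separately (for example in a background thread), which is one way to keep a slow, instrumented replica from adding latency to production.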

Cite

Arora, N., Bell, J., Ivan, F., Kaiser, G., & Ray, B. (2018). Replay without recording of production bugs for service oriented applications. In ASE 2018 - Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering (pp. 452–463). Association for Computing Machinery, Inc. https://doi.org/10.1145/3238147.3238186
