Effective concurrency testing for distributed systems

Xinhao Yuan; Junfeng Yang

Conference Proceedings, Open Access

International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS (2020) 1141-1156

DOI: 10.1145/3373376.3378484

27 citations

36 readers

Abstract

Despite their wide deployment, distributed systems remain notoriously hard to reason about. Unexpected interleavings of concurrent operations and failures may lead to undefined behaviors and cause serious consequences. We present Morpheus, the first concurrency testing tool leveraging partial order sampling, a randomized testing method formally analyzed and empirically validated to provide strong probabilistic guarantees of error-detection, for real-world distributed systems. Morpheus introduces conflict analysis to further improve randomized testing by predicting and focusing on operations that affect the testing result. Inspired by the recent shift in building distributed systems using higher-level languages and frameworks, Morpheus targets Erlang. Evaluation on four popular distributed systems in Erlang including RabbitMQ, a message broker service, and Mnesia, a distributed database in the Erlang standard libraries, shows that Morpheus is effective: It found previously unknown errors in every system checked, 11 total, all of which are flaws in their core protocols that may cause deadlocks, unexpected crashes, or inconsistent states.
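
The abstract names the technique but does not spell out the algorithm, so the following is only a minimal Python sketch of priority-based randomized schedule sampling, written under stated assumptions. It is not Morpheus's implementation (Morpheus targets Erlang), and the toy program, the re-sampling rule, and all names (`PROGRAM`, `conflicts`, `run_once`, `find_lost_update`) are hypothetical. The assumed rule is: every pending operation holds a random priority, the scheduler runs the highest-priority one, and any pending operation that conflicts with the operation just executed gets a fresh priority.

```python
import random

# Hypothetical toy program: two processes each do a non-atomic increment of x
# (read x into a local, then write local + 1 back to x).
PROGRAM = {
    "p1": [("read", "x"), ("write", "x")],
    "p2": [("read", "x"), ("write", "x")],
}


def conflicts(a, b):
    """Two operations conflict if they touch the same variable and one writes."""
    return a[1] == b[1] and "write" in (a[0], b[0])


def run_once(rng):
    """Execute one randomly sampled schedule; return (schedule, final x)."""
    pc = {p: 0 for p in PROGRAM}        # next operation index per process
    local = {p: 0 for p in PROGRAM}     # per-process local copy of x
    shared = {"x": 0}
    prio = {p: rng.random() for p in PROGRAM}  # priority of each pending op
    schedule = []
    while True:
        pending = {p: PROGRAM[p][pc[p]]
                   for p in PROGRAM if pc[p] < len(PROGRAM[p])}
        if not pending:
            break
        # Run the pending operation with the highest priority.
        chosen = max(pending, key=lambda p: prio[p])
        kind, var = pending[chosen]
        if kind == "read":
            local[chosen] = shared[var]
        else:
            shared[var] = local[chosen] + 1
        schedule.append((chosen, kind))
        pc[chosen] += 1
        # The chosen process's next operation starts with a fresh priority.
        prio[chosen] = rng.random()
        # Assumed rule: re-sample priorities of pending operations in other
        # processes that conflict with the operation just executed.
        for p, op in pending.items():
            if p != chosen and conflicts(op, (kind, var)):
                prio[p] = rng.random()
    return schedule, shared["x"]


def find_lost_update(trials=200, seed=0):
    """Sample schedules until one loses an update (final x != 2)."""
    rng = random.Random(seed)
    for i in range(trials):
        schedule, x = run_once(rng)
        if x != 2:
            return i, schedule
    return None
```

Calling `find_lost_update()` returns the index of the first failing trial together with the interleaving that triggered it, or `None` if no sampled schedule exposed the bug. In this toy program the bug appears only when both reads run before either write, which is the kind of ordering-dependent error that schedule sampling is meant to surface.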

Citation (APA)

Yuan, X., & Yang, J. (2020). Effective concurrency testing for distributed systems. In International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS (pp. 1141–1156). Association for Computing Machinery. https://doi.org/10.1145/3373376.3378484
