Abstract
Short-lived traffic surges, known as microbursts, can cause periods of unexpectedly high packet delay and loss on a link. Today, preventing microbursts requires deploying switches with larger packet buffers (incurring higher cost) or running the network at low utilization (sacrificing efficiency). Instead, we argue that switches should detect microbursts as they form, and take corrective action before the situation gets worse. This requires an efficient way for switches to identify the particular flows responsible for a microburst, and handle them automatically (e.g., by pacing, marking, or rerouting the packets). However, collecting fine-grained statistics about queue occupancy in real time is challenging, even with emerging programmable data planes. We present Snappy, which identifies the flows responsible for a microburst in real time. Snappy maintains multiple snapshots of the occupants of the queue over time, where each snapshot is a compact data structure that makes efficient use of data-plane memory. As each new packet arrives, Snappy updates one snapshot and also estimates the fraction of the queue occupied by the associated flow. Our simulations with data-center packet traces show that Snappy can target the flows responsible for microbursts at the sub-millisecond level.
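The per-packet step described above (update one snapshot, then estimate the arriving flow's share of the queue) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `SnapshotEstimator`, the parameters `window_bytes`, `num_snapshots`, and `slots`, the single-row hashed counter array, and the rule of rotating snapshots after a fixed number of arriving bytes are all assumptions chosen for illustration.

```python
from collections import deque


class SnapshotEstimator:
    """Illustrative only: per-flow byte counts kept in a ring of snapshots.

    Assumption: a new snapshot is started after every `window_bytes` of
    arriving traffic, and each snapshot is a fixed-size array of counters
    indexed by a hash of the flow ID (collisions can inflate estimates).
    """

    def __init__(self, window_bytes=1000, num_snapshots=4, slots=64):
        self.window_bytes = window_bytes
        self.slots = slots
        # The deque drops the oldest snapshot once num_snapshots is reached.
        self.snapshots = deque([[0] * slots], maxlen=num_snapshots)
        self.filled = 0  # bytes recorded in the newest snapshot

    def on_packet(self, flow_id, size, queue_bytes):
        """Record a packet and return the flow's estimated queue share.

        `queue_bytes` is the queue occupancy including this packet.
        """
        # Start a fresh snapshot when the current window is full.
        if self.filled >= self.window_bytes:
            self.snapshots.append([0] * self.slots)
            self.filled = 0

        # Update exactly one snapshot (the newest) for this packet.
        idx = hash(flow_id) % self.slots
        self.snapshots[-1][idx] += size
        self.filled += size

        if queue_bytes <= 0:
            return 0.0

        # Sum the flow's bytes over the most recent snapshots that
        # together cover roughly the current queue length.
        needed = -(-queue_bytes // self.window_bytes)  # ceiling division
        recent = list(self.snapshots)[-needed:]
        flow_bytes = sum(snapshot[idx] for snapshot in recent)
        return min(1.0, flow_bytes / queue_bytes)
```

In this simplified model, a caller would invoke `on_packet` on every enqueue and compare the returned fraction with a threshold of its choosing to decide whether to pace, mark, or reroute the flow. The estimate is approximate in two ways: the snapshots cover the queue only to within one window, and hash collisions can attribute another flow's bytes to the flow being measured.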
Cite
Chen, X., Feibish, S. L., Koral, Y., Rexford, J., & Rottenstreich, O. (2018). Catching the microburst culprits with Snappy. In SelfDN 2018 - Proceedings of the 2018 Afternoon Workshop on Self-Driving Networks, Part of SIGCOMM 2018 (pp. 22–28). Association for Computing Machinery. https://doi.org/10.1145/3229584.3229586