The cloud gateway is essential in the public cloud as the central hub of cloud traffic. We show that horizontal scaling of software gateways, though sustainable for years, is no longer future-proof given the massive scale and rapid growth of today's cloud. The root cause is the stagnant performance of the CPU core, which is prone to being overloaded by heavy hitters as traffic growth far outpaces Moore's law. To address this, we propose Sailfish, a cloud-scale multi-tenant multi-service gateway accelerated by programmable switches. The new challenge is that the large forwarding tables required by multi-tenancy cannot fit into the limited on-chip memory. To this end, we devise a multi-pronged approach with (1) hardware/software co-design for table sharing, (2) horizontal table splitting among gateway clusters, and (3) pipeline-aware table compression for a single node. Compared with an x86 gateway of similar price, Sailfish reduces latency by 95% (2μs) and improves throughput by more than 20x in bps (3.2Tbps) and 71x in pps (1.8Gpps) with packet length < 256B. Sailfish has been deployed in Alibaba Cloud for more than two years. It is the first P4-based cloud gateway in the industry; a single cluster carries dozens of Tbps of traffic and withstands peak-hour traffic during large online shopping festivals.
Pan, T., Yu, N., Jia, C., Pi, J., Xu, L., Qiao, Y., … Zhu, S. (2021). Sailfish: Accelerating cloud-scale multi-tenant multi-service gateways with programmable switches. In SIGCOMM 2021 - Proceedings of the ACM SIGCOMM 2021 Conference (pp. 194–206). Association for Computing Machinery, Inc. https://doi.org/10.1145/3452296.3472889