While the current literature typically focuses on load-balancing among multiple servers, in this paper, we demonstrate the importance of load-balancing within a single machine (potentially with hundreds of CPU cores). In this context, we propose a new load-balancing technique (RSS++) that dynamically modifies the receive side scaling (RSS) indirection table to spread the load across the CPU cores in a more optimal way. RSS++ incurs up to 14x lower 95th percentile tail latency and orders of magnitude fewer packet drops compared to RSS under high CPU utilization. RSS++ allows higher CPU utilization and dynamic scaling of the number of allocated CPU cores to accommodate the input load while avoiding the typical 25% over-provisioning. RSS++ has been implemented for both (i) DPDK and (ii) the Linux kernel. Additionally, we implement a new state migration technique which facilitates sharding and reduces contention between CPU cores accessing per-flow data. RSS++ keeps the flow-state by groups that can be migrated at once, leading to a 20% higher efficiency than a state of the art shared flow table.
CITATION STYLE
Barbette, T., Katsikas, G. P., Maguire, G. Q., & Kostić, D. (2019). RSS++: Load and state-aware receive side scaling. In CoNEXT 2019 - Proceedings of the 15th International Conference on Emerging Networking Experiments and Technologies (pp. 318–333). Association for Computing Machinery, Inc. https://doi.org/10.1145/3359989.3365412
Mendeley helps you to discover research relevant for your work.