Abstract
Data processing and distributed querying workloads often involve a "scatter-gather" or "partition-aggregate" architectural pattern, whereby one application queries hundreds or even thousands of workers. Network communication is often a bottleneck in this pattern, especially when the compute task at each worker is small, such as for Web queries and interactive analytics. The network bottleneck can result in low throughput and high CPU utilization, and can increase job completion time by orders of magnitude. To overcome these inefficiencies, we explore hardware offload of the scatter-gather primitive, whereby a smart NIC takes on the responsibility of sending out queries and collecting responses. We show that this approach not only virtually eliminates CPU usage but, with suitable scheduling of responses, also speeds up scatter by allowing parallel queries and speeds up gather by preventing throughput collapse due to excessive congestion. Besides response scheduling, we use a careful design at the NIC to limit FPGA resource usage: our approach uses about 25% of on-chip logic and 33% of on-chip memory on a mid-sized FPGA, leaving enough room for implementing other functions on the smart NIC.
Alvarez, C., He, Z., Alonso, G., & Singla, A. (2020). Specializing the network for scatter-gather workloads. In SoCC 2020 - Proceedings of the 2020 ACM Symposium on Cloud Computing (pp. 267–280). Association for Computing Machinery, Inc. https://doi.org/10.1145/3419111.3421301