Specializing the network for scatter-gather workloads

Abstract

Data processing and distributed querying workloads often involve a "scatter-gather" or "partition-aggregate" architectural pattern, whereby one application queries hundreds or even thousands of workers. Network communication is often a bottleneck in this pattern, especially when the compute task at each worker is small, such as for Web queries and interactive analytics. The network bottleneck can result in low throughput and high CPU utilization, and can increase job completion time by orders of magnitude. To overcome these inefficiencies, we explore hardware offload of the scatter-gather primitive, whereby a smart NIC takes on the responsibility of sending out queries and collecting responses. We show that this approach not only virtually eliminates CPU usage but, with suitable scheduling of responses, also speeds up scatter by allowing parallel queries, and gather by preventing throughput collapse due to excessive congestion. Besides response scheduling, we use a careful design at the NIC to limit FPGA resource usage: our approach uses about 25% of on-chip logic and 33% of on-chip memory on a mid-sized FPGA, leaving enough room for implementing other functions on the smart NIC.
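
As a rough software analogy of the pattern described in the abstract (not the authors' NIC or FPGA design), the sketch below scatters one query to many workers in parallel and then admits only a bounded number of responses at a time during the gather phase. All names here (`scatter_gather`, `max_in_flight`, the toy per-worker computation) are hypothetical and chosen for illustration; the actual scheduling policy used in the paper is not specified in the abstract.

```python
import asyncio
from typing import Tuple


async def scatter_gather(query: int, num_workers: int,
                         max_in_flight: int) -> Tuple[int, int]:
    """Send `query` to every worker, then gather responses under a cap.

    Returns (aggregated result, peak number of concurrent responses).
    """
    # Assumption: a semaphore stands in for the response scheduler.
    gate = asyncio.Semaphore(max_in_flight)
    in_flight = 0
    peak = 0

    async def ask(worker_id: int) -> int:
        nonlocal in_flight, peak
        # Scatter: every worker receives the query and computes in parallel.
        result = query * worker_id  # toy stand-in for a small compute task
        # Gather: a worker may only "transmit" once it holds a grant.
        async with gate:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0)  # stand-in for sending the response
            in_flight -= 1
        return result

    results = await asyncio.gather(*(ask(i) for i in range(num_workers)))
    return sum(results), peak


def run(query: int, num_workers: int, max_in_flight: int) -> Tuple[int, int]:
    """Synchronous entry point for the sketch."""
    return asyncio.run(scatter_gather(query, num_workers, max_in_flight))
```

Calling `run(2, 100, 8)` aggregates the responses of 100 simulated workers while never letting more than 8 responses be in transit at once. Capping concurrent responses is one simple way to picture how scheduling can avoid the congestion collapse the abstract mentions.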

Cite

Alvarez, C., He, Z., Alonso, G., & Singla, A. (2020). Specializing the network for scatter-gather workloads. In SoCC 2020 - Proceedings of the 2020 ACM Symposium on Cloud Computing (pp. 267–280). Association for Computing Machinery, Inc. https://doi.org/10.1145/3419111.3421301
