Lightwave Fabrics: At-Scale Optical Circuit Switching for Datacenter and Machine Learning Systems

Abstract

We describe our experience developing what we believe to be the world's first large-scale production deployments of lightwave fabrics used for both datacenter networking and machine-learning (ML) applications. Using optical circuit switches (OCSes) and optical transceivers developed in-house, we employ hardware and software codesign to integrate the fabrics into our network and computing infrastructure. Key to our design is a high degree of multiplexing enabled by new kinds of wavelength-division multiplexing (WDM) and optical circulators that support high-bandwidth bidirectional traffic on a single strand of optical fiber. The development of the requisite OCS and optical transceiver technologies leads to a synchronous lightwave fabric that is reconfigurable, low latency, rate agnostic, and highly available. These fabrics have provided substantial benefits for long-lived traffic patterns in our datacenter networks and predictable traffic patterns in tightly coupled machine-learning clusters. We report results for a large-scale ML superpod with 4096 tensor processing unit (TPU) v4 chips that has more than one exaFLOP of computing power. For this use case, the deployment of a lightwave fabric provides up to 3× better system availability and model-dependent performance improvements of up to 3.3× compared to a static fabric, despite constituting less than 6% of the total system cost.

Citation (APA)

Liu, H., Urata, R., Yasumura, K., Zhou, X., Bannon, R., Berger, J., … Vahdat, A. (2023). Lightwave Fabrics: At-Scale Optical Circuit Switching for Datacenter and Machine Learning Systems. In SIGCOMM 2023 - Proceedings of the ACM SIGCOMM 2023 Conference (pp. 499–515). Association for Computing Machinery, Inc. https://doi.org/10.1145/3603269.3604836
