This paper proposes Capstan: a scalable, parallel-patterns-based, reconfigurable dataflow accelerator (RDA) for sparse and dense tensor applications. Instead of designing for one application, we start with common sparse data formats, each of which supports multiple applications. Using a declarative programming model, Capstan supports application-independent sparse iteration and memory primitives that can be mapped to vectorized, high-performance hardware. We optimize random-access sparse memories with configurable out-of-order execution to increase SRAM random-access throughput from 32% to 80%. For a variety of sparse applications, Capstan with DDR4 memory is 18× faster than a multi-core CPU baseline, while Capstan with HBM2 memory is 16× faster than an Nvidia V100 GPU. For sparse applications that can be mapped to Plasticine, a recent dense RDA, Capstan is 7.6× to 365× faster and only 16% larger.
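To make the "application-independent sparse iteration" idea concrete, the sketch below expresses sparse matrix-vector multiplication (SpMV) over a CSR matrix as nested parallel patterns: a map over rows and a reduce over each row's nonzeros. This is illustrative plain Scala, not Capstan's actual declarative DSL; the type and function names are hypothetical.

```scala
// Illustrative only: SpMV over CSR written as nested parallel patterns
// (map over rows, reduce over nonzeros). Not Capstan's real API.
object CsrSpmvSketch {
  // CSR format: rowPtr has numRows+1 entries; colIdx/values hold the nonzeros.
  final case class CsrMatrix(rowPtr: Array[Int], colIdx: Array[Int], values: Array[Double])

  def spmv(a: CsrMatrix, x: Array[Double]): Array[Double] =
    // Outer pattern: map over rows (parallelizable across vector lanes).
    Array.tabulate(a.rowPtr.length - 1) { row =>
      // Inner pattern: reduce over this row's nonzeros. Each step performs a
      // random access x(colIdx(j)) -- the kind of access Capstan's sparse
      // memory primitives are designed to keep at high throughput.
      (a.rowPtr(row) until a.rowPtr(row + 1))
        .map(j => a.values(j) * x(a.colIdx(j)))
        .sum
    }

  def main(args: Array[String]): Unit = {
    // 2x3 matrix [[1,0,2],[0,3,0]] in CSR form.
    val m = CsrMatrix(Array(0, 2, 3), Array(0, 2, 1), Array(1.0, 2.0, 3.0))
    println(spmv(m, Array(1.0, 1.0, 1.0)).mkString(", ")) // 3.0, 3.0
  }
}
```

Because the iteration is expressed with patterns rather than explicit loops and pointers, the same description can be retargeted to different sparse formats and vectorized hardware backends.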
Rucker, A., Vilim, M., Zhao, T., Zhang, Y., Prabhakar, R., & Olukotun, K. (2021). Capstan: A vector RDA for sparsity. In Proceedings of the Annual International Symposium on Microarchitecture, MICRO (pp. 1022–1035). IEEE Computer Society. https://doi.org/10.1145/3466752.3480047