Cascading Structured Pruning: Enabling High Data Reuse for Sparse DNN Accelerators

Abstract

The performance and efficiency of running modern Deep Neural Networks (DNNs) are heavily bounded by data movement. To mitigate data movement bottlenecks, recent DNN inference accelerator designs widely adopt aggressive compression techniques and sparse-skipping mechanisms. These mechanisms avoid transferring or computing with zero-valued weights or activations to save time and energy. However, such sparse-skipping logic involves large input buffers and irregular data access patterns, thus precluding many energy-efficient data reuse opportunities and dataflows. In this work, we propose Cascading Structured Pruning (CSP), a technique that preserves significantly more data reuse opportunities for higher energy efficiency while maintaining comparable performance relative to recent sparse architectures such as SparTen. CSP includes the following two components. At the algorithm level, CSP-A induces a predictable sparsity pattern that allows for low-overhead compression of weight data and sequential access to both activation and weight data. At the architecture level, CSP-H leverages CSP-A's induced sparsity pattern with a novel dataflow to access unique activation data only once, thus removing the demand for large input buffers. Each CSP-H processing element (PE) employs a novel accumulation buffer design and a counter-based sparse-skipping mechanism to support the dataflow with minimal controller overhead. We verify our approach on several representative models. Our simulated results show that CSP achieves on average a 15× energy efficiency improvement over SparTen, with comparable or superior speedup under most evaluations.
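As a rough illustration of the general sparse-skipping idea the abstract refers to (avoiding computation on zero-valued weights or activations), the following minimal sketch counts how many multiply-accumulate operations remain after zero operands are skipped. It is not CSP, CSP-A, or CSP-H, and it models no hardware; the function name `sparse_dot` and the sample vectors are hypothetical and chosen only for illustration.

```python
def sparse_dot(weights, activations):
    """Dot product that skips any pair containing a zero operand.

    Returns (result, macs_performed). Hypothetical helper for illustration;
    it does not model the CSP dataflow or any accelerator hardware.
    """
    if len(weights) != len(activations):
        raise ValueError("operands must have the same length")
    total = 0
    macs = 0
    for w, a in zip(weights, activations):
        if w == 0 or a == 0:
            continue  # skipped: a zero operand contributes nothing to the sum
        total += w * a
        macs += 1
    return total, macs


# Made-up sample data: 8 weight/activation pairs, most with a zero operand.
weights = [0, 3, 0, 0, 2, 0, 1, 0]
activations = [5, 0, 7, 0, 4, 1, 6, 0]
result, macs = sparse_dot(weights, activations)
print(result, macs)  # prints "14 2": only 2 of the 8 pairs need a multiply
```

The positions of the surviving pairs in this sketch depend entirely on the data, which is the kind of irregular access pattern the abstract identifies as an obstacle to data reuse.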

Cite

Hanson, E., Li, S., Li, H. H., & Chen, Y. (2022). Cascading Structured Pruning: Enabling High Data Reuse for Sparse DNN Accelerators. In Proceedings - International Symposium on Computer Architecture (pp. 522–535). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/3470496.3527419
