Convolutional neural networks (CNNs) require a large number of off-chip DRAM accesses, which account for most of their energy consumption. Compressing feature maps can reduce the energy consumed by DRAM access. However, previous compression methods achieve poor compression ratios when the feature maps are either extremely sparse or dense. To improve the compression ratio efficiently, we exploit the spatial correlation and the distribution of non-zero activations in output feature maps. In this work, we propose grid-based run-length compression (GRLC) and implement it in hardware. Compared with a previous compression method [1], GRLC reduces DRAM access by 11% and energy consumption by 5% on average across VGG-16, ExtractionNet, and ResNet-18.
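The abstract does not spell out the GRLC encoding itself, so the following is only a minimal sketch of the general idea it names: splitting a feature map into grid tiles and run-length encoding the zeros within each tile. The tile size, the `(zero_run, value)` pair format, and all function names are assumptions made for illustration, not the authors' scheme or hardware design.

```python
from typing import List, Tuple

Pair = Tuple[int, int]  # (number of zeros preceding the value, value)


def rle_zero_encode(values: List[int]) -> List[Pair]:
    """Encode a flat list as (zero_run, value) pairs (hypothetical format)."""
    pairs: List[Pair] = []
    run = 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    if run:
        # Trailing zeros: store run-1 zeros followed by an explicit zero value.
        pairs.append((run - 1, 0))
    return pairs


def rle_zero_decode(pairs: List[Pair]) -> List[int]:
    """Invert rle_zero_encode."""
    out: List[int] = []
    for run, v in pairs:
        out.extend([0] * run)
        out.append(v)
    return out


def grid_rle_encode(fmap: List[List[int]], g: int) -> List[List[Pair]]:
    """Split an H x W map into g x g tiles and encode each tile separately."""
    h, w = len(fmap), len(fmap[0])
    assert h % g == 0 and w % g == 0, "sketch assumes dimensions divisible by g"
    encoded: List[List[Pair]] = []
    for r in range(0, h, g):
        for c in range(0, w, g):
            # Flatten the tile in row-major order before encoding.
            flat = [fmap[r + i][c + j] for i in range(g) for j in range(g)]
            encoded.append(rle_zero_encode(flat))
    return encoded


def grid_rle_decode(encoded: List[List[Pair]], h: int, w: int, g: int) -> List[List[int]]:
    """Rebuild the H x W map from per-tile encodings."""
    fmap = [[0] * w for _ in range(h)]
    idx = 0
    for r in range(0, h, g):
        for c in range(0, w, g):
            flat = rle_zero_decode(encoded[idx])
            idx += 1
            for i in range(g):
                fmap[r + i][c:c + g] = flat[i * g:(i + 1) * g]
    return fmap


if __name__ == "__main__":
    fmap = [
        [0, 0, 3, 0],
        [0, 0, 0, 0],
        [5, 0, 0, 0],
        [0, 7, 0, 0],
    ]
    enc = grid_rle_encode(fmap, 2)
    assert grid_rle_decode(enc, 4, 4, 2) == fmap
    print(enc)
```

In this sketch an all-zero tile collapses to a single pair, which hints at why tiling can help when non-zero activations cluster spatially. How GRLC actually handles very sparse versus dense tiles is not described in the abstract and is not modeled here.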
Park, Y., Kang, Y., Kim, S., Kwon, E., & Kang, S. (2020). GRLC: Grid-based run-length compression for energy-efficient CNN accelerator. In ACM International Conference Proceeding Series. Association for Computing Machinery. https://doi.org/10.1145/3370748.3406576