Abstract
Recent work has demonstrated the promise of using resistive random access memory (ReRAM) to perform neural network computations in memory. In particular, ReRAM-based crossbar structures can perform matrix-vector multiplication directly in the analog domain, but the resolutions of ReRAM cells and digital/analog converters limit the precision of the inputs and weights that can be directly supported. Although convolutional neural networks (CNNs) can be trained with low-precision weights and activations, previous quantization approaches are either not amenable to ReRAM-based crossbar implementations or have poor accuracy when applied to deep CNNs on complex datasets. In this article, we propose a new CNN training and implementation approach that implements weights using a trained biased number representation, which can achieve near full-precision model accuracy with weights and activations as narrow as 2 bits each on the CIFAR datasets. The proposed approach is compatible with a ReRAM-based crossbar implementation. We also propose an activation-side coalescing technique that combines the steps of batch normalization, nonlinear activation, and quantization into a single stage that simply performs a clipped-rounding operation. Experiments demonstrate that our approach outperforms previous low-precision number representations for VGG-11, VGG-13, and VGG-19 models on both the CIFAR-10 and CIFAR-100 datasets.
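The coalescing idea in the abstract can be illustrated with a small sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: inference-time batch normalization with parameters `mu`, `sigma`, `gamma`, `beta`, a ReLU clipped to [0, 1], and uniform quantization to `bits` bits. The function names `staged` and `coalesced` are hypothetical. Under those assumptions, the three stages collapse into one affine step followed by a clipped rounding.

```python
def staged(x, mu, sigma, gamma, beta, bits):
    """Three separate stages (assumed forms, not taken from the paper)."""
    levels = (1 << bits) - 1                 # e.g. 3 for 2-bit activations
    y = gamma * (x - mu) / sigma + beta      # batch normalization (inference form)
    y = min(max(y, 0.0), 1.0)                # ReLU clipped to [0, 1] (assumption)
    return round(y * levels)                 # uniform quantization to integer codes


def coalesced(x, mu, sigma, gamma, beta, bits):
    """Single stage: fold the stages into scale/offset, then round and clip."""
    levels = (1 << bits) - 1
    scale = gamma * levels / sigma           # precomputed once per channel
    offset = (beta - gamma * mu / sigma) * levels
    return min(max(round(scale * x + offset), 0), levels)


if __name__ == "__main__":
    for x in (0.0, 0.5, 0.75, 1.0):
        print(x, staged(x, 0.5, 2.0, 4.0, 0.25, 2),
              coalesced(x, 0.5, 2.0, 4.0, 0.25, 2))
```

The two versions agree because rounding is monotonic and leaves integers unchanged, so clipping before or after rounding to an integer range gives the same code. In floating point, inputs that land almost exactly on a rounding boundary could in principle differ between the two orderings of arithmetic.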
Cite
Wang, W., & Lin, B. (2019). Trained biased number representation for ReRAM-based neural network accelerators. ACM Journal on Emerging Technologies in Computing Systems, 15(2). https://doi.org/10.1145/3304107