Abstract
An RRAM-based computing system (RCS) provides an energy-efficient hardware implementation of vector-matrix multiplication for machine-learning hardware. However, it is vulnerable to faults due to the immature RRAM fabrication process. We propose an efficient fault tolerance method for RCS; the proposed method, referred to as extended-ABFT (X-ABFT), is inspired by algorithm-based fault tolerance (ABFT). We utilize row checksums and test-input vectors to extract signatures for fault detection and error correction. We present a solution to alleviate the overflow problem caused by the limited number of voltage levels for the test-input signals. Simulation results show that for a Hopfield classifier with faults in 5% of its RRAM cells, X-ABFT allows us to achieve nearly the same classification accuracy as in the fault-free case.
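As an illustration of the row-checksum idea mentioned in the abstract, the following is a minimal sketch of classic ABFT applied to a vector-matrix product. It is not the paper's X-ABFT method. It assumes ideal integer arithmetic, models a fault as a single corrupted weight, and ignores the analog effects and limited test-voltage levels that X-ABFT addresses. All function names are hypothetical.

```python
# Sketch of textbook checksum-based fault detection (classic ABFT),
# NOT the X-ABFT scheme from the paper. All names are illustrative.

def row_checksums(W):
    """Store the sum of each row of the fault-free weight matrix."""
    return [sum(row) for row in W]

def vecmat(x, W):
    """Compute y = x * W, the operation a crossbar performs in one step."""
    cols = len(W[0])
    return [sum(x[i] * W[i][j] for i in range(len(W))) for j in range(cols)]

def detect_fault(x, W_observed, checksums):
    """Flag a fault if sum(y) disagrees with x . checksums.

    For a fault-free matrix, sum_j y_j = sum_i x_i * (sum_j W_ij).
    """
    y = vecmat(x, W_observed)
    expected = sum(xi * ci for xi, ci in zip(x, checksums))
    return sum(y) != expected

def locate_faulty_rows(W_observed, checksums):
    """Apply one-hot test-input vectors to find rows whose sum has changed."""
    n = len(W_observed)
    faulty = []
    for i in range(n):
        e = [1 if k == i else 0 for k in range(n)]  # test input selecting row i
        if sum(vecmat(e, W_observed)) != checksums[i]:
            faulty.append(i)
    return faulty

W = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
checksums = row_checksums(W)

# Assumed fault model: one cell stuck at 0.
W_faulty = [row[:] for row in W]
W_faulty[1][2] = 0

fault_seen = detect_fault([1, 1, 1], W_faulty, checksums)
bad_rows = locate_faulty_rows(W_faulty, checksums)
```

A single input vector can miss a fault (for example, when the input entry for the faulty row is zero), which is one reason dedicated test-input vectors are useful for extracting signatures.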
Cite
Liu, M., Xia, L., Wang, Y., & Chakrabarty, K. (2020). Algorithmic Fault Detection for RRAM-based Matrix Operations. ACM Transactions on Design Automation of Electronic Systems, 25(3). https://doi.org/10.1145/3386360