Abstract
Processing-in-memory (PIM) architectures show great potential for processing emerging artificial intelligence workloads, including vision and language models. Cross-layer optimizations can bridge the gap between computing density and the available resources by reducing the computation and memory cost of the model and by improving its robustness against non-ideal hardware effects. We first introduce several hardware-aware training methods that improve model robustness to the non-ideal effects of PIM devices, including stuck-at faults, process variation, and thermal noise. We then demonstrate a software/hardware (SW/HW) co-design methodology for efficiently processing state-of-the-art attention-based models on PIM-based architectures, combining sparsity exploration for the attention-based model with circuit-architecture co-design to support sparse processing.
Cite
Yang, X., Li, S., Zheng, Q., & Chen, Y. (2023). Improving the Robustness and Efficiency of PIM-Based Architecture by SW/HW Co-Design. In Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC (pp. 618–623). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/3566097.3568358