NEWROMAP: mapping CNNs to NoC-interconnected self-contained data-flow accelerators for edge-AI


Abstract

Conventional AI accelerators are limited by von Neumann bottlenecks for edge workloads. Domain-specific accelerators (often neuromorphic) address this by combining near/in-memory computing, NoC-interconnected massively multicore architectures, and data-flow computation. This requires an effective mapping of neural networks (i.e., an assignment of network layers to cores) that balances memory resources, computation, and NoC traffic. Here, we introduce a mapping called Snake for the predominant convolutional neural networks (CNNs). It exploits the feed-forward nature of CNNs by folding layers onto spatially adjacent cores. We achieve a total NoC bandwidth improvement of up to 3.8X for MobileNet and ResNet versus random mappings. Furthermore, we propose NEWROMAP, which further optimizes the Snake mapping via a meta-heuristic; it also simulates the NoC traffic and works with TensorFlow models. Simulations show that this further optimizes communication, with up to a 22.52% latency improvement over a pure Snake mapping.
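The core idea of the Snake mapping, as described in the abstract, can be illustrated with a short sketch. The snippet below is a hypothetical simplification (function names and the 2D-mesh assumption are our own, not from the paper): it traverses a mesh of cores in boustrophedon (snake) order and assigns consecutive CNN layers to consecutive cores, so that feed-forward traffic flows mostly between spatially adjacent cores.

```python
def snake_order(rows, cols):
    """Yield (row, col) core coordinates of a 2D mesh in serpentine order:
    even rows left-to-right, odd rows right-to-left."""
    for r in range(rows):
        cols_iter = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        for c in cols_iter:
            yield (r, c)


def map_layers_to_cores(layers, rows, cols):
    """Assign each layer (in feed-forward order) to the next core on the
    snake path, so successive layers land on Manhattan-adjacent cores."""
    coords = list(snake_order(rows, cols))
    if len(layers) > len(coords):
        raise ValueError("more layers than cores")
    return dict(zip(layers, coords))


# Example: four layers on a 2x2 mesh.
mapping = map_layers_to_cores(["conv1", "conv2", "conv3", "fc"], rows=2, cols=2)
# conv1 -> (0, 0), conv2 -> (0, 1), conv3 -> (1, 1), fc -> (1, 0):
# every pair of consecutive layers sits on neighboring cores.
```

In this toy version, the NoC traffic between consecutive layers always travels a single hop, which is the intuition behind the reported bandwidth improvement over random placements; the paper's NEWROMAP then refines such a seed mapping with a meta-heuristic.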

Citation (APA)

Joseph, J. M., Baloglu, M. S., Pan, Y., Leupers, R., & Bamberg, L. (2021). NEWROMAP: mapping CNNs to NoC-interconnected self-contained data-flow accelerators for edge-AI. In Proceedings - 2021 15th IEEE/ACM International Symposium on Networks-on-Chip, NOCS 2021 (pp. 15–20). Association for Computing Machinery, Inc. https://doi.org/10.1145/3479876.3481591
