Conventional AI accelerators are limited by the von Neumann bottleneck in edge workloads. Domain-specific accelerators (often neuromorphic) address this with near-/in-memory computing, massively multicore architectures interconnected by networks-on-chip (NoCs), and data-flow computation. Such architectures require an effective mapping of neural networks (i.e., an assignment of network layers to cores) that balances resources: memory, computation, and NoC traffic. Here, we introduce Snake, a mapping for the predominant convolutional neural networks (CNNs). It exploits the feed-forward nature of CNNs by folding consecutive layers onto spatially adjacent cores. Compared with random mappings, Snake improves the total NoC bandwidth by up to 3.8X for MobileNet and ResNet. Furthermore, we propose NEWROMAP, which further optimizes the Snake mapping through a meta-heuristic; it also simulates the NoC traffic and works directly with TensorFlow models. In simulation, this further optimization of the communication yields up to 22.52% latency improvement over the pure Snake mapping.
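The core idea of the Snake mapping can be sketched in a few lines of Python. The sketch below is an illustrative reconstruction under assumptions, not the paper's implementation: the 4x4 mesh size, the traffic model (one flow per consecutive layer pair with an assumed volume), and the Manhattan hop-count metric are all assumptions made here to show why placing consecutive CNN layers on adjacent cores reduces NoC traffic compared with a random placement.

    # Minimal sketch of a snake-style layer-to-core mapping on a 2D NoC mesh.
    # Assumptions (not from the paper): mesh size, per-layer traffic volumes,
    # and total (volume x Manhattan hops) as the NoC traffic cost.
    import random

    MESH_W, MESH_H = 4, 4  # assumed 4x4 mesh of accelerator cores

    def snake_order(width, height):
        """Serpentine core order: left-to-right, then right-to-left per row."""
        coords = []
        for y in range(height):
            row = [(x, y) for x in range(width)]
            coords.extend(row if y % 2 == 0 else reversed(row))
        return coords

    def manhattan(a, b):
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    def total_hops(layer_traffic, placement):
        """Sum of (traffic volume * hop distance) over consecutive layer pairs."""
        return sum(vol * manhattan(placement[i], placement[i + 1])
                   for i, vol in enumerate(layer_traffic))

    # Assumed traffic volumes between consecutive layers of a 16-layer CNN.
    layer_traffic = [random.randint(1, 100) for _ in range(MESH_W * MESH_H - 1)]

    snake_placement = snake_order(MESH_W, MESH_H)  # layer i -> i-th core in snake order
    random_placement = snake_placement[:]
    random.shuffle(random_placement)               # same cores, random layer assignment

    print("snake  hops:", total_hops(layer_traffic, snake_placement))
    print("random hops:", total_hops(layer_traffic, random_placement))

With the snake placement every consecutive layer pair sits on neighbouring cores (hop distance 1), so the total traffic cost equals the sum of the volumes, whereas the random placement typically multiplies volumes by longer routes; NEWROMAP's meta-heuristic would then refine such an initial placement further.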
Joseph, J. M., Baloglu, M. S., Pan, Y., Leupers, R., & Bamberg, L. (2021). NEWROMAP: mapping CNNs to NoC-interconnected self-contained data-flow accelerators for edge-AI. In Proceedings - 2021 15th IEEE/ACM International Symposium on Networks-on-Chip, NOCS 2021 (pp. 15–20). Association for Computing Machinery, Inc. https://doi.org/10.1145/3479876.3481591