CPU-FPGA heterogeneous architectures feature flexible acceleration of many workloads to advance computational capabilities and energy efficiency in today's datacenters. This advantage, however, is often overshadowed by the poor programmability of FPGAs. Although recent advances in high-level synthesis (HLS) have significantly improved FPGA programmability, they still leave programmers facing the challenge of identifying the optimal design configuration in a tremendous design space. In this paper, we propose the composable, parallel and pipeline (CPP) microarchitecture as an accelerator design template to substantially reduce the design space. Also, by introducing the CPP analytical model to capture the performance-resource trade-offs, we achieve efficient, analytical-model-based design space exploration. Furthermore, we develop the AutoAccel framework to automate the entire accelerator generation process. Our experiments show that the AutoAccel-generated accelerators outperform their corresponding software implementations by an average of 72x for a broad class of computation kernels.
CITATION STYLE
Cong, J., Wei, P., Yu, C. H., & Zhang, P. (2018). Automated accelerator generation and optimization with composable, parallel and pipeline architecture. In Proceedings - Design Automation Conference (Vol. Part F137710). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/3195970.3195999