Automated accelerator generation and optimization with composable, parallel and pipeline architecture

Abstract

CPU-FPGA heterogeneous architectures offer flexible acceleration of many workloads, advancing computational capability and energy efficiency in today's datacenters. This advantage, however, is often overshadowed by the poor programmability of FPGAs. Although recent advances in high-level synthesis (HLS) have significantly improved FPGA programmability, they still leave programmers facing the challenge of identifying the optimal design configuration in a tremendous design space. In this paper we propose the composable, parallel and pipeline (CPP) microarchitecture as an accelerator design template to substantially reduce the design space. In addition, by introducing the CPP analytical model to capture the performance-resource trade-offs, we achieve efficient, analytical-model-based design space exploration. Furthermore, we develop the AutoAccel framework to automate the entire accelerator generation process. Our experiments show that the AutoAccel-generated accelerators outperform their corresponding software implementations by an average of 72x for a broad class of computation kernels.
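
To make the idea of analytical design space exploration concrete, the following is a minimal sketch and not the CPP analytical model or the AutoAccel implementation. It assumes a single pipelined loop, a hypothetical cost formula (pipeline depth plus initiation interval times the remaining iterations), a made-up DSP cost per parallel lane, and illustrative names (`estimate`, `explore`, `pe`, `unroll`). The point is only that a closed-form estimate lets every candidate configuration be scored without running synthesis.

```python
from itertools import product


def estimate(trip_count, pe, unroll, ii=1, depth=10, dsp_per_lane=5):
    """Toy estimate of cycles and DSP use for one pipelined loop.

    All parameters and formulas are illustrative assumptions:
    pe      -- number of parallel processing elements
    unroll  -- unroll factor inside each processing element
    ii      -- pipeline initiation interval in cycles
    depth   -- pipeline depth in cycles
    """
    lanes = pe * unroll
    iterations = -(-trip_count // lanes)  # ceiling division
    cycles = depth + ii * (iterations - 1)
    dsps = lanes * dsp_per_lane
    return cycles, dsps


def explore(trip_count, dsp_budget,
            pe_options=(1, 2, 4, 8), unroll_options=(1, 2, 4, 8, 16)):
    """Exhaustively score every (pe, unroll) pair with the toy model.

    Returns (cycles, dsps, pe, unroll) for the fastest configuration
    that fits the DSP budget, or None if nothing fits.
    """
    best = None
    for pe, unroll in product(pe_options, unroll_options):
        cycles, dsps = estimate(trip_count, pe, unroll)
        if dsps > dsp_budget:
            continue  # configuration exceeds the resource budget
        candidate = (cycles, dsps, pe, unroll)
        if best is None or candidate < best:
            best = candidate
    return best


if __name__ == "__main__":
    print(explore(trip_count=1024, dsp_budget=100))
```

Because each candidate is evaluated with a few arithmetic operations, the whole space can be swept in well under a second; a realistic model would also need to account for memory bandwidth and other resource types.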

Cite

Cong, J., Wei, P., Yu, C. H., & Zhang, P. (2018). Automated accelerator generation and optimization with composable, parallel and pipeline architecture. In Proceedings - Design Automation Conference (Vol. Part F137710). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/3195970.3195999
