A buffering method for parallelized loop with non-uniform dependencies in high-level synthesis

Akihiro Suda; Hideki Takase; Kazuyoshi Takagi; Naofumi Takagi

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8285 LNCS, Part 1, pp. 390-401 (2013)

DOI: 10.1007/978-3-319-03859-9_34

Citations: 1

Readers: 2

Abstract

Polyhedral optimization has recently attracted attention as a parallelization method for nested loop kernels. However, when polyhedral optimization is applied to high-level synthesis, access conflicts on off-chip RAM become the performance bottleneck. In this paper, we propose a method that accelerates synthesized circuits by buffering off-chip RAM accesses. The buffers are built from on-chip RAM blocks placed in each processing element (PE) and can be accessed in fewer cycles than the off-chip RAM. Unlike related work, our method supports non-uniform data dependencies, which cannot be represented by constant vectors. Experimental results on practical kernels show that the buffered circuits with 8 PEs are on average 5.21 times faster than the original ones. © Springer International Publishing Switzerland 2013.
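
To make the term "non-uniform data dependencies" concrete, the following minimal sketch enumerates dependence distances for two toy one-dimensional loops. It is an illustration only and is not taken from the paper: the function name `dependence_distances` and both example loops are hypothetical, and the sketch assumes each array element is written at most once. In the first loop, iteration i writes a[i] and reads a[i-1], so every dependence has the same distance and can be described by one constant vector. In the second, iteration i writes a[2*i] and reads a[i], so the distance grows with the iteration number and no single constant vector describes it.

```python
def dependence_distances(write_index, read_index, n):
    """Collect flow-dependence distances for a 1-D loop over range(n).

    Iteration i writes a[write_index(i)] and reads a[read_index(i)].
    A flow dependence runs from iteration w to a later iteration r
    when r reads the element that w wrote. (Hypothetical helper for
    illustration; assumes each element is written at most once.)
    """
    distances = set()
    for w in range(n):
        for r in range(w + 1, n):
            if write_index(w) == read_index(r):
                distances.add(r - w)
    return distances


# Uniform case: a[i] = f(a[i - 1]); the distance is always 1.
uniform = dependence_distances(lambda i: i, lambda i: i - 1, 8)

# Non-uniform case: a[2 * i] = f(a[i]); the distance depends on i.
non_uniform = dependence_distances(lambda i: 2 * i, lambda i: i, 8)

print(sorted(uniform))      # [1]
print(sorted(non_uniform))  # [1, 2, 3]
```

With 8 iterations, the second loop yields distances 1, 2, and 3, which is the kind of pattern the abstract describes as not representable by constant vectors.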

Cite

Suda, A., Takase, H., Takagi, K., & Takagi, N. (2013). A buffering method for parallelized loop with non-uniform dependencies in high-level synthesis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8285 LNCS, pp. 390–401). https://doi.org/10.1007/978-3-319-03859-9_34
