DCMI

Mostafa Koraei; Omid Fatemi; Magnus Jahre

Journal Article, Open Access

ACM Transactions on Architecture and Code Optimization (2019) 16(4) 1-24

DOI: 10.1145/3352813

Abstract

Iterative Stencil Loops (ISLs) are the key kernel within a range of compute-intensive applications. To accelerate ISLs with Field Programmable Gate Arrays, it is critical to exploit parallelism (1) among elements within the same iteration and (2) across loop iterations. We propose a novel ISL acceleration scheme called Direct Computation of Multiple Iterations (DCMI) that improves upon prior work by pre-computing the effective stencil coefficients after a number of iterations at design time—resulting in accelerators that use minimal on-chip memory and avoid redundant computation. This enables DCMI to improve throughput by up to 7.7× compared to the state-of-the-art cone-based architecture.
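The idea of pre-computing effective stencil coefficients can be illustrated with a minimal sketch. It assumes a linear, constant-coefficient 1D stencil, which may be narrower than the class of stencils the article handles, and it is not the authors' implementation. The function names, the example coefficients, and the input grid are hypothetical. Under that assumption, applying a stencil T times is equivalent to applying, once, the stencil obtained by convolving the coefficient vector with itself T times.

```python
import numpy as np


def effective_coefficients(coeffs, iterations):
    """Return the single stencil equivalent to `iterations` sweeps of `coeffs`.

    Assumes a linear stencil with constant coefficients (illustrative only).
    """
    effective = np.array([1.0])  # identity stencil: zero iterations
    for _ in range(iterations):
        effective = np.convolve(effective, coeffs)
    return effective


def iterate_stencil(grid, coeffs, iterations):
    """Apply the stencil sweep by sweep, keeping only fully covered points."""
    for _ in range(iterations):
        grid = np.convolve(grid, coeffs, mode="valid")
    return grid


# Hypothetical 3-point smoothing stencil and input grid.
coeffs = np.array([0.25, 0.5, 0.25])
grid = np.arange(16, dtype=float) ** 2
T = 3

effective = effective_coefficients(coeffs, T)

# Reference result: T separate sweeps with intermediate grids.
step_by_step = iterate_stencil(grid, coeffs, T)

# Direct result: one sweep with the pre-computed coefficients.
direct = np.convolve(grid, effective, mode="valid")

print(np.allclose(step_by_step, direct))
```

In this sketch the effective stencil grows from 3 to 7 points after three iterations, so the direct form trades a wider stencil for the removal of intermediate grids. How that trade-off maps onto FPGA resources is a question for the article itself.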

Cite (APA)

Koraei, M., Fatemi, O., & Jahre, M. (2019). DCMI. ACM Transactions on Architecture and Code Optimization, 16(4), 1–24. https://doi.org/10.1145/3352813
