The DiamondTetris algorithm for maximum performance vectorized stencil computation

Abstract

DiamondTetris, an algorithm from the LRnLA family for stencil computation, is constructed. It targets Many Integrated Core (MIC) processors of the Xeon Phi family. The algorithm and its implementation are described for a simulation based on the wave equation. Its strong points are locality, efficient use of the memory hierarchy, and, most importantly, seamless vectorization: only one vector rearrangement operation is needed per cell value update. Performance is estimated with the roofline model. The algorithm is implemented in code and tested on Xeon and Xeon Phi machines.
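For orientation, the two generic ingredients named in the abstract, a wave-equation stencil update and a roofline estimate, can be sketched as follows. This is not the DiamondTetris algorithm: it is a naive, unvectorized 1D leapfrog update plus the textbook roofline bound. The function names `wave_step` and `roofline_gflops`, the 1D setting, and the zero boundary values are assumptions made for this illustration only.

```python
def wave_step(u_prev, u_curr, courant2):
    """One naive leapfrog update of the 1D wave equation.

    u_prev, u_curr: field values at the two previous time levels.
    courant2: (c * dt / dx) ** 2, the squared Courant number.
    Boundary cells are held at zero (an assumption of this sketch).
    """
    n = len(u_curr)
    u_next = [0.0] * n
    for i in range(1, n - 1):
        # Three-point stencil approximating the second spatial derivative.
        lap = u_curr[i - 1] - 2.0 * u_curr[i] + u_curr[i + 1]
        u_next[i] = 2.0 * u_curr[i] - u_prev[i] + courant2 * lap
    return u_next


def roofline_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Roofline bound: min(peak compute, memory bandwidth * operational intensity)."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)


if __name__ == "__main__":
    # A unit pulse at rest spreads to its neighbours after one step.
    pulse = [0.0, 0.0, 1.0, 0.0, 0.0]
    print(wave_step(pulse, pulse, 0.25))

    # Placeholder machine numbers, not measurements of any real processor.
    print(roofline_gflops(peak_gflops=1000.0, bandwidth_gbs=100.0, flops_per_byte=0.5))
```

In roofline terms, a naive sweep like `wave_step` reads and writes the whole field once per time step, which keeps its operational intensity low and typically leaves it bound by memory bandwidth. Algorithms that reuse data across several time steps while it is still in cache aim to raise that intensity.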

Cite

Levchenko, V., & Perepelkina, A. (2017). The DiamondTetris algorithm for maximum performance vectorized stencil computation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10421 LNCS, pp. 124–135). Springer Verlag. https://doi.org/10.1007/978-3-319-62932-2_11
