Stencil computations on HPC-oriented ARMv8 64-Bit multi-core processor

Abstract

The ARMv8 64-bit platform has been considered as an alternative for high-performance computing (HPC). Stencil computations are a class of iterative kernels that update array elements according to a stencil. In this paper, we evaluate the performance and scalability of an ARMv8 64-bit multi-core processor using a 7-point 3D stencil code, and we devise a series of optimizations for that code. The optimizations focus on parallelizing the kernel and exploiting data locality through loop tiling; we also improve the calculation of the tiling block size. The achieved performance varies with the stencil grid size, and the best result is 24.4% of peak double-precision (DP) FLOPS for a grid size of 64³. Compared with an Intel Xeon processor, the ARMv8 64-bit processor achieves about 40% of the performance of Sandy Bridge for the stencil code with a grid size of 512³, but it shows better scalability.
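To make the kernel and the tiling idea concrete, the following is a minimal C sketch of a 7-point 3D stencil sweep in an untiled form and a loop-tiled form. It is not the paper's code: the abstract gives neither the coefficients nor the block-size calculation, so the function names (`stencil7_naive`, `stencil7_tiled`), the coefficients `c0` and `c1`, the tile sizes `tj` and `tk`, and the choice to tile only the two inner dimensions are all assumptions made for illustration. The OpenMP pragma is one plausible way to parallelize over tiles and is simply ignored when OpenMP is not enabled at compile time.

```c
#include <stddef.h>

/* Flatten (i, j, k) into a 1D offset for an n*n*n grid; k varies fastest. */
static size_t idx(size_t n, size_t i, size_t j, size_t k)
{
    return (i * n + j) * n + k;
}

static size_t min_sz(size_t a, size_t b)
{
    return a < b ? a : b;
}

/* New value of one interior point: center plus its six face neighbors.
 * The coefficients c0 and c1 are illustrative, not taken from the paper. */
static double point7(size_t n, double c0, double c1, const double *in,
                     size_t i, size_t j, size_t k)
{
    return c0 * in[idx(n, i, j, k)]
         + c1 * (in[idx(n, i - 1, j, k)] + in[idx(n, i + 1, j, k)]
               + in[idx(n, i, j - 1, k)] + in[idx(n, i, j + 1, k)]
               + in[idx(n, i, j, k - 1)] + in[idx(n, i, j, k + 1)]);
}

/* Untiled reference sweep over the interior; requires n >= 3. */
void stencil7_naive(size_t n, double c0, double c1,
                    const double *in, double *out)
{
    for (size_t i = 1; i < n - 1; ++i)
        for (size_t j = 1; j < n - 1; ++j)
            for (size_t k = 1; k < n - 1; ++k)
                out[idx(n, i, j, k)] = point7(n, c0, c1, in, i, j, k);
}

/* Tiled sweep: the j and k loops are split into tj-by-tk blocks and the
 * i loop streams through each block, so neighboring planes are reused
 * while they are still in cache. Requires n >= 3, tj > 0, tk > 0.
 * Tiles write disjoint regions of out, so they can run in parallel. */
void stencil7_tiled(size_t n, size_t tj, size_t tk, double c0, double c1,
                    const double *in, double *out)
{
    #pragma omp parallel for collapse(2) schedule(static)
    for (size_t jj = 1; jj < n - 1; jj += tj) {
        for (size_t kk = 1; kk < n - 1; kk += tk) {
            size_t jmax = min_sz(jj + tj, n - 1); /* clip partial tiles */
            size_t kmax = min_sz(kk + tk, n - 1);
            for (size_t i = 1; i < n - 1; ++i)
                for (size_t j = jj; j < jmax; ++j)
                    for (size_t k = kk; k < kmax; ++k)
                        out[idx(n, i, j, k)] = point7(n, c0, c1, in, i, j, k);
        }
    }
}
```

Both functions read from `in` and write to a separate `out` array (a Jacobi-style sweep), so the tiled version produces the same values as the untiled one for any positive tile sizes; only the traversal order changes. Suitable values for `tj` and `tk` depend on the cache sizes of the target processor and would need to be tuned or calculated per platform.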

Cite

Li, C., Dong, Y., & Li, K. (2015). Stencil computations on HPC-oriented ARMv8 64-Bit multi-core processor. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9530, pp. 30–43). Springer Verlag. https://doi.org/10.1007/978-3-319-27137-8_3
