Abstract
This article demonstrates an approach for combining general tuning techniques with the POWER8 hardware architecture through optimizing three representative stencil benchmarks. Two typical real-world applications, with kernels similar to those of the winning programs of the Gordon Bell Prize 2016 and 2017, are employed to illustrate algorithm modifications and a combination of hardware-oriented tuning strategies with the application algorithms. This work fills the gap between hardware capability and software performance of the POWER8 processor, and provides useful guidance for optimizing stencil-based scientific applications on POWER systems.
Cite
Xu, J., Fu, H., Shi, W., Gan, L., Li, Y., Luk, W., & Yang, G. (2018). Performance tuning and analysis for stencil-based applications on POWER8 processor. ACM Transactions on Architecture and Code Optimization, 15(4). https://doi.org/10.1145/3264422