Abstract
The increasing complexity of HPC systems has introduced new sources of variability, which can contribute to significant differences in run-to-run performance of applications. With components at various levels of the system contributing variability, application developers and system users are now faced with the difficult task of running and tuning their applications in an environment where run-to-run performance measurements can vary by as much as a factor of two to three. In this study, we classify, quantify, and present ways to mitigate the sources of run-to-run variability on Cray XC systems with Intel Xeon Phi processors and a dragonfly interconnect. We further demonstrate that the code-tuning performance observed in a variability-mitigating environment correlates with the performance observed in production running conditions.
CCS CONCEPTS • General and reference → Performance; • Networks → Network performance analysis; • Hardware → Process, voltage and temperature variations
Cite
Chunduri, S., Harms, K., Parker, S., Morozov, V., Oshin, S., Cherukuri, N., & Kumaran, K. (2017). Run-to-run Variability on Xeon Phi based Cray XC Systems. In International Conference for High Performance Computing, Networking, Storage and Analysis, SC (Vol. 2017-November). IEEE Computer Society. https://doi.org/10.1145/3126908.3126926