Peppa-X: Finding program test inputs to bound silent data corruption vulnerability in hpc applications

18Citations
Citations of this article
16Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Transient hardware faults have become prevalent due to the shrinking size of transistors, leading to silent data corruptions (SDCs). Therefore, HPC applications need to be evaluated (e.g., via fault injections) and protected to meet the reliability target. In the evaluation, the target programs exercise with a set of given inputs which are usually from program benchmark suite. However, these inputs rarely manifest the SDC vulnerabilities, leading to over-optimistic assessment and unexpectedly higher failure rates in production.We propose Peppa-X, which efficiently identifies the test inputs that estimate the bound of program SDC resiliency. Our key insight is that the SDC sensitivity distribution in a program often remains stationary across input space. Thereby, we can guide the search of SDC-bound inputs by a sampled distribution. Our evaluation shows that Peppa-X can identify the SDC-bound input of a program that existing methods cannot find even with 5x more search time.

Cite

CITATION STYLE

APA

Rahman, M. H., Shamji, A., Guo, S., & Li, G. (2021). Peppa-X: Finding program test inputs to bound silent data corruption vulnerability in hpc applications. In International Conference for High Performance Computing, Networking, Storage and Analysis, SC. IEEE Computer Society. https://doi.org/10.1145/3458817.3476147

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free