The hierarchical semi-separable (HSS) matrix factorization has useful characteristics for representing low-rank operators on extreme-scale computing systems. To prepare for the higher error rates anticipated with future architectures, this paper introduces new fault-tolerant algorithms for HSS matrix multiplication that maintain efficient performance in the presence of high error rates. The measured runtime overhead for error checking and data preservation using the Containment Domains library is exceptionally small and encourages the use of frequent, fine-grained error checking when using algorithm-based fault tolerance.
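As a rough illustration of the algorithm-based fault tolerance idea mentioned in the abstract, the sketch below checks a dense matrix product against a column-sum checksum and recomputes when the check fails. It is not the paper's HSS algorithm and does not use the Containment Domains library; the function names, the `fault` hook, and the retry policy are hypothetical and serve only to show the check-then-recover pattern under those assumptions.

```python
def matmul(a, b):
    """Plain triple-loop product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def checksum_ok(a, b, c, tol=1e-9):
    """Check the column-sum identity (1^T A) B == 1^T C.

    The check costs O(n^2) work, versus O(n^3) to recompute C.
    """
    col_sums_a = [sum(row[k] for row in a) for k in range(len(b))]
    expected = [sum(col_sums_a[k] * b[k][j] for k in range(len(b)))
                for j in range(len(b[0]))]
    actual = [sum(row[j] for row in c) for j in range(len(b[0]))]
    return all(abs(e - s) <= tol * max(1.0, abs(e))
               for e, s in zip(expected, actual))


def checked_matmul(a, b, max_retries=3, fault=None):
    """Multiply, verify the checksum, and recompute on a detected error.

    `fault` is a hypothetical hook that corrupts the first result in
    place; it exists only to simulate a soft error in this sketch.
    """
    for attempt in range(max_retries + 1):
        c = matmul(a, b)
        if fault is not None and attempt == 0:
            fault(c)
        if checksum_ok(a, b, c):
            return c, attempt
    raise RuntimeError("checksum still failing after retries")


def flip(c):
    """Simulated soft error: perturb one entry of the result."""
    c[0][1] += 1.0


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c, retries = checked_matmul(a, b, fault=flip)
print(c, retries)  # the corrupted first attempt is rejected, so retries == 1
```

Because the check is much cheaper than the multiplication itself, it can in principle be applied to small sub-blocks rather than once per full product, which is the kind of fine-grained checking the abstract argues for.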
CITATION STYLE
Austin, B., Roman, E., & Li, X. (2015). Resilient matrix multiplication of hierarchical semi-separable matrices. In FTXS 2015 - Proceedings of the 2015 Workshop on Fault Tolerance for HPC at eXtreme Scale, Part of HPDC 2015 (pp. 19–26). Association for Computing Machinery, Inc. https://doi.org/10.1145/2751504.2751507