Reproducible and accurate matrix multiplication

Roman Iakymchuk; David Defour; Sylvain Collange; Stef Graillat

Conference Proceedings (Open Access)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 9553, pp. 126-137 (2016)

DOI: 10.1007/978-3-319-31769-4_11

Abstract

Due to the non-associativity of floating-point operations and dynamic scheduling on parallel architectures, obtaining a bit-wise reproducible floating-point result across multiple executions of the same code on different or even similar parallel architectures is challenging. In this paper, we address the problem of reproducibility in the context of matrix multiplication and propose an algorithm that yields both reproducible and accurate results. The algorithm is composed of two main stages: a filtering stage that uses fast vectorized floating-point expansions in conjunction with error-free transformations, and an accumulation stage based on Kulisch long accumulators in a high-radix carry-save representation. Finally, we provide implementations and performance results in parallel environments such as GPUs.
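The two-stage idea in the abstract can be illustrated with a minimal sketch for plain summation. This is not the authors' implementation: the function names (`two_sum`, `naive_sum`, `reproducible_sum`) and the expansion size are hypothetical, the code is scalar rather than vectorized, and Python's exact `Fraction` type is used only as a stand-in for a Kulisch long accumulator. A matrix product would additionally need an error-free multiplication, which is omitted here.

```python
from fractions import Fraction


def two_sum(a, b):
    """Error-free transformation (Knuth's TwoSum).

    Returns (s, e) where s is the rounded sum fl(a + b) and e is the
    rounding error, so that a + b == s + e holds exactly (barring overflow).
    """
    s = a + b
    bp = s - a
    e = (a - (s - bp)) + (b - bp)
    return s, e


def naive_sum(values):
    """Ordinary left-to-right floating-point sum; the result depends on order."""
    total = 0.0
    for v in values:
        total += v
    return total


def reproducible_sum(values, size=2):
    """Sketch of a filtering stage followed by an exact accumulation stage.

    Filtering: each input is pushed through a small floating-point
    expansion with TwoSum; the rounding error cascades to the next slot.
    Accumulation: any residual that falls off the end of the expansion is
    added to an exact accumulator (Fraction here, an assumption standing
    in for a long accumulator).
    """
    expansion = [0.0] * size
    accumulator = Fraction(0)
    for x in values:
        for i in range(size):
            expansion[i], x = two_sum(expansion[i], x)
            if x == 0.0:
                break
        if x != 0.0:
            accumulator += Fraction(x)
    # Flush the expansion exactly, then round to double once at the end.
    for term in expansion:
        accumulator += Fraction(term)
    return float(accumulator)


if __name__ == "__main__":
    order_a = [1e16, 1.0, -1e16, 1.0]
    order_b = [1e16, -1e16, 1.0, 1.0]
    # The naive sum changes with the order of the same four inputs.
    print(naive_sum(order_a), naive_sum(order_b))
    # The exact-then-round-once sum does not.
    print(reproducible_sum(order_a), reproducible_sum(order_b))
```

Because every rounding error is either kept in the expansion or forwarded to the exact accumulator, the final value is the exact sum rounded once, so it cannot depend on the order in which the inputs arrive.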

Cite

Iakymchuk, R., Defour, D., Collange, S., & Graillat, S. (2016). Reproducible and accurate matrix multiplication. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9553, pp. 126–137). Springer Verlag. https://doi.org/10.1007/978-3-319-31769-4_11