The National Information Standards Organization defines scientific reproducibility as "obtaining consistent results using the same input data, computational steps, methods, and code, and conditions of analysis" [12]. Reproducibility in machine learning (ML) refers to the ability to regenerate an ML model precisely, guaranteeing identical accuracy and transparency. While a model may offer reproducible inference, reproducing the model itself is frequently problematic due to the pseudo-random numbers used during model generation. One way to make models more trustworthy is to manage the random numbers produced during model training. This paper presents examples of the impact of randomness on model generation and offers a preliminary investigation into how random number generation can be controlled to make ML models more reproducible.
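The paper does not prescribe a particular framework, but as a minimal illustrative sketch, the Python snippet below shows one common way to pin down the pseudo-random number generators involved in training, assuming a PyTorch-based setup; the set_global_seeds helper and the seed value 42 are hypothetical names chosen for illustration, not the authors' code.

```python
import random

import numpy as np
import torch


def set_global_seeds(seed: int = 42) -> None:
    """Fix the seeds of the common pseudo-random number generators so that
    repeated training runs start from the same random state."""
    random.seed(seed)                 # Python's built-in PRNG
    np.random.seed(seed)              # NumPy PRNG (data shuffling, weight init)
    torch.manual_seed(seed)           # PyTorch CPU PRNG
    torch.cuda.manual_seed_all(seed)  # PyTorch GPU PRNGs (all devices)
    # Request deterministic cuDNN kernels; this can reduce performance.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


set_global_seeds(42)
```

Even with all seeds fixed, some GPU operations remain nondeterministic, so seeding alone is only a partial step toward bit-identical retraining.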
Ahmed, H., & Lofstead, J. (2022). Managing Randomness to Enable Reproducible Machine Learning. In P-RECS 2022 - Proceedings of the 5th International Workshop on Practical Reproducible Evaluation of Computer Systems, co-located with HPDC 2022 (pp. 15–20). Association for Computing Machinery, Inc. https://doi.org/10.1145/3526062.3536353