Documenting computing environments for reproducible experiments


Abstract

Establishing the reproducibility of an experiment often requires repeating the experiment in its native computing environment. Containerization tools provide declarative interfaces for documenting native computing environments. Declarative documentation, however, may not precisely recreate the native computing environment because of human errors or dependency conflicts. An alternative is to trace the native computing environment during application execution. Tracing, however, does not generate declarative documentation. In this paper, we preserve the native computing environment via tracing and automatically generate declarative documentation from the trace logs. Our method distinguishes between inputs, outputs, and user and system dependencies for a variety of programming languages. It then maps traced dependencies to standard package names and versions by querying standard package repositories. We use the standard package names to generate comprehensive declarative documentation of the container. We verify the efficacy of this approach by preserving the native computing environments of several scientific projects submitted on Zenodo and GitHub and generating their declarative documentation. We measure precision and recall by comparing against author-provided documentation. Our approach highlights over- and under-documentation in scientific experiments.
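The pipeline the abstract describes — classify traced file paths, map user dependencies to standard package names and versions, then emit declarative documentation — can be sketched as follows. This is a minimal illustration, not the authors' tool: the path prefixes, working directory, and the in-memory package index standing in for a real repository query (e.g. to PyPI) are all assumptions.

```python
# Hypothetical sketch: turn trace-log file paths into declarative documentation.
import re

# Assumed prefixes that mark system-level dependencies.
SYSTEM_PREFIXES = ("/usr/lib", "/usr/bin", "/lib")

# Mock stand-in for a standard package repository lookup; a real system
# would query PyPI, apt, conda, etc. for the installed version.
PACKAGE_INDEX = {
    "numpy": "1.21.0",
    "pandas": "1.3.5",
}

def classify(path, workdir="/home/user/project"):
    """Label a traced path as a system dependency, user dependency, or data file."""
    if path.startswith(SYSTEM_PREFIXES):
        return "system"
    if "site-packages" in path or "dist-packages" in path:
        return "user"
    if path.startswith(workdir):
        return "data"
    return "other"

def package_of(path):
    """Extract the top-level package name from a site/dist-packages style path."""
    m = re.search(r"(?:site|dist)-packages/([A-Za-z0-9_]+)/", path)
    return m.group(1) if m else None

def declarative_doc(trace_paths):
    """Map traced user dependencies to pinned, requirements-style lines."""
    lines = set()
    for p in trace_paths:
        if classify(p) != "user":
            continue
        name = package_of(p)
        if name and name in PACKAGE_INDEX:
            lines.add(f"{name}=={PACKAGE_INDEX[name]}")
    return sorted(lines)
```

Given a trace containing `/usr/lib/x86_64-linux-gnu/libc.so.6`, `/home/user/.venv/lib/python3.9/site-packages/numpy/core/_multiarray.so`, and `/home/user/project/input.csv`, only the numpy path is classified as a user dependency and pinned to `numpy==1.21.0`; comparing such generated lines against author-provided documentation yields the precision/recall measurement described above.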

Citation (APA)

Chuah, J., Deeds, M., Malik, T., Choi, Y., & Goodall, J. L. (2020). Documenting computing environments for reproducible experiments. In Advances in Parallel Computing (Vol. 36, pp. 756–765). IOS Press BV. https://doi.org/10.3233/APC200106
