Integrity protection for scientific workflow data: Motivation and initial experiences

Abstract

With the continued rise of scientific computing and the enormous growth in the volume of data being processed, scientists must consider whether the processes for transmitting and storing data sufficiently assure the integrity of scientific data. When integrity is not preserved, computations can fail, increasing computational cost through reruns, or worse, results can be corrupted in a manner not apparent to the scientist, producing invalid scientific results. Technologies such as TCP checksums, encrypted transfers, checksum validation, RAID, and erasure coding provide integrity assurances at different levels, but they may not scale to large data sizes and may not cover a workflow end to end, leaving gaps in which data corruption can occur undetected. In this paper we explore an approach to assuring data integrity, considering both malicious and accidental corruption, for workflow executions orchestrated by the Pegasus Workflow Management System. To validate our approach, we introduce Chaos Jungle, a toolkit that provides an environment for validating integrity verification mechanisms by allowing researchers to introduce a variety of integrity errors during data transfers and storage. In addition to controlled experiments with Chaos Jungle, we provide an analysis of integrity errors that we encountered when running production workflows.
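
The idea of end-to-end checksum validation mentioned in the abstract can be illustrated with a minimal sketch. It is not the Pegasus implementation and does not reproduce how Chaos Jungle injects errors; the function names (`sha256_hex`, `flip_bit`, `verify`) and the sample payload are hypothetical, and SHA-256 is assumed only as a representative strong digest. A digest is recorded when the data is produced, a single bit is flipped to stand in for corruption in transit or at rest, and the digest is rechecked before the data is used.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted.

    This stands in for a transfer or storage error; it is an
    illustrative assumption, not the mechanism used by Chaos Jungle.
    """
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def verify(data: bytes, expected_digest: str) -> bool:
    """Check data against the digest recorded when it was produced."""
    return sha256_hex(data) == expected_digest


# Hypothetical intermediate file produced by one workflow job.
payload = b"workflow intermediate output\n" * 1000

# Digest recorded at creation time, before any transfer or storage.
recorded = sha256_hex(payload)

# The consuming job rechecks the digest before using the data.
intact_copy = payload
corrupted_copy = flip_bit(payload, 12345)

print("intact copy verifies:", verify(intact_copy, recorded))
print("corrupted copy verifies:", verify(corrupted_copy, recorded))
```

Because the digest is computed where the data originates and checked where it is consumed, the check covers every hop in between, which is the gap that per-layer mechanisms such as TCP checksums or RAID leave open.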

Citation (APA)

Rynge, M., Vahi, K., Deelman, E., Mandal, A., Baldin, I., Bhide, O., … Feltus, F. A. (2019). Integrity protection for scientific workflow data: Motivation and initial experiences. In ACM International Conference Proceeding Series. Association for Computing Machinery. https://doi.org/10.1145/3332186.3332222
