Enabling re-executions of parallel scientific workflows using runtime provenance data

Abstract

Capturing provenance data in scientific workflows is a key issue since it allows for reproducibility and evaluation of results. Many of these workflows generate around 100,000 tasks that execute in parallel in High Performance Computing environments, such as large clusters and clouds. SciCumulus is a workflow engine for parallel execution in clouds. Activity failure is almost inevitable in clouds, where virtual machine failures are a reality rather than a possibility. We present SciMultaneous, a service architecture that manages re-executions of failed scientific workflow tasks using runtime provenance. Experimental results on clouds showed that SciMultaneous considerably increases workflow completion and reduces the total execution time of the workflow (considering executions and re-executions) by up to 11.5% when compared to ad-hoc approaches. © 2012 Springer-Verlag.

Cite

Costa, F., De Oliveira, D., Ocaña, K. A. C. S., Ogasawara, E., & Mattoso, M. (2012). Enabling re-executions of parallel scientific workflows using runtime provenance data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7525 LNCS, pp. 229–232). https://doi.org/10.1007/978-3-642-34222-6_22
