Collection-oriented scientific workflows for integrating and analyzing biological data

Timothy McPhillips; Shawn Bowers; Bertram Ludäscher

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2006) 4075 LNBI 248-263

DOI: 10.1007/11799511_23

Abstract

Steps in scientific workflows often generate collections of results, causing the data flowing through workflows to become increasingly nested. Because conventional workflow components (or actors) typically operate on simple or application-specific data types, additional actors are often required to manage these nested data collections. As a result, conventional workflows become increasingly complex as data becomes more nested. This paper describes a new paradigm for developing scientific workflows that transparently manages nested data collections. Collection-oriented workflows have a number of advantages over conventional approaches, including simpler workflow designs (e.g., requiring fewer actors and control-flow constructs) that are invariant under changes in data nesting. Our implementation within the KEPLER scientific workflow system enables the explicit representation of collections and collection schemas, supports concurrent operation over collection contents via multi-level pipeline parallelism, and allows collection-aware actors to be composed readily from conventional actors. © Springer-Verlag Berlin Heidelberg 2006.
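
The idea of composing a collection-aware actor from a conventional one can be illustrated with a minimal Python sketch. This is not the KEPLER implementation or its API: the names `collection_aware`, `handles`, and `gc_content` are hypothetical, nested collections are modeled simply as nested lists, and collection schemas and pipeline parallelism are left out. The sketch only shows why a workflow built this way needs no extra actors when the nesting of its input changes.

```python
from typing import Any, Callable

# Assumption for illustration: a collection is a (possibly nested) list,
# and anything that is not a list is a data item.


def collection_aware(
    actor: Callable[[Any], Any],
    handles: Callable[[Any], bool],
) -> Callable[[Any], Any]:
    """Wrap a conventional actor so it can run over arbitrarily nested data."""

    def run(data: Any) -> Any:
        if isinstance(data, list):
            # Recurse into the collection and preserve its structure.
            return [run(item) for item in data]
        # Apply the actor only to items it handles; pass the rest through.
        return actor(data) if handles(data) else data

    return run


def gc_content(seq: str) -> float:
    """A conventional actor: it knows nothing about collections."""
    return (seq.count("G") + seq.count("C")) / len(seq)


gc_actor = collection_aware(gc_content, lambda item: isinstance(item, str))

# The same wrapped actor works on flat and on deeply nested input.
flat_result = gc_actor(["GGCC", "ATAT"])
nested_result = gc_actor([["GGCC", "ATAT"], [["GCAT"]], 42])

print(flat_result)    # [1.0, 0.0]
print(nested_result)  # [[1.0, 0.0], [[0.5]], 42]
```

In this sketch the wrapper, not the actor, takes care of traversing the nesting, and items the actor does not handle (here the integer `42`) pass through unchanged, so the workflow definition stays the same whether the sequences arrive in a flat list or several levels deep.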

Cite

McPhillips, T., Bowers, S., & Ludäscher, B. (2006). Collection-oriented scientific workflows for integrating and analyzing biological data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4075 LNBI, pp. 248–263). Springer Verlag. https://doi.org/10.1007/11799511_23
