CGAT-core: A python framework for building scalable, reproducible computational biology workflows.

Abstract

In the genomics era, computational biologists regularly need to process, analyse and integrate large and complex biomedical datasets. Analysis inevitably involves multiple dependent steps, resulting in complex pipelines or workflows, often with several branches. Large data volumes mean that processing needs to be quick and efficient, and scientific rigour requires that analysis be consistent and fully reproducible. We have developed CGAT-core, a Python package for the rapid construction of complex computational workflows. CGAT-core seamlessly handles parallelisation across high performance computing clusters, integration of Conda environments, full parameterisation, database integration and logging. To illustrate our workflow framework, we present a pipeline for the analysis of RNAseq data using pseudo-alignment.
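
To make the idea of "multiple dependent steps ... often with several branches" concrete, the following is a minimal sketch using only the Python standard library. It does not use or reproduce the CGAT-core API. The task names (`build_index`, `quantify_sample_a`, and so on) and the helper functions are hypothetical, chosen only to echo a pseudo-alignment style RNAseq workflow. A thread pool stands in for cluster submission.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # requires Python 3.9+

# Hypothetical task graph: each task maps to the set of tasks it depends on.
DEPENDENCIES = {
    "build_index": set(),
    "quantify_sample_a": {"build_index"},
    "quantify_sample_b": {"build_index"},
    "merge_counts": {"quantify_sample_a", "quantify_sample_b"},
}


def run_task(name):
    """Stand-in for real work, e.g. a shell command sent to a cluster."""
    return f"{name}: done"


def run_workflow(dependencies, max_workers=2):
    """Run tasks in dependency order; independent tasks run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    completed = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Every task returned here has all of its dependencies satisfied.
            ready = list(sorter.get_ready())
            for name, _result in zip(ready, pool.map(run_task, ready)):
                sorter.done(name)
                completed.append(name)
    return completed


if __name__ == "__main__":
    print(run_workflow(DEPENDENCIES))
```

In this sketch the two quantification tasks form separate branches that become ready at the same time once the index is built, and the merge step runs only after both have finished. A full framework would add the concerns the abstract lists, such as cluster submission, environment management, parameterisation and logging.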

Cite

Cribbs, A. P., Luna-Valero, S., George, C., Sudbery, I. M., Berlanga-Taylor, A. J., Sansom, S. N., … Heger, A. (2019). CGAT-core: A python framework for building scalable, reproducible computational biology workflows. F1000Research, 8. https://doi.org/10.12688/F1000RESEARCH.18674.1
