Computational research requires versatile data and workflow management tools that can easily adapt to the highly dynamic requirements of scientific investigations. Many existing tools require strict adherence to a particular usage pattern, so researchers often use less robust ad hoc solutions that they find easier to adopt. The resulting data fragmentation and methodological incompatibilities significantly impede research. Our talk showcases signac, an open-source Python framework that offers highly modular and scalable solutions for this problem. Named for the Pointillist painter Paul Signac, the framework’s powerful workflow management tools enable users to construct and automate workflows that transition seamlessly from laptops to HPC clusters. Crucially, the underlying data model is completely independent of the workflow. The flexible, serverless, and schema-free signac database can be introduced into other workflows with essentially no overhead and no recourse to the signac workflow model. Additionally, the data model’s simplicity makes it easy to parse the underlying data without using signac at all. This modularity and simplicity eliminate significant barriers to consistent data management across projects, facilitating improved provenance management and data sharing with minimal overhead.
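To illustrate the claim that the underlying data can be parsed without signac, the following is a minimal sketch using only the Python standard library. It assumes a workspace laid out as one directory per job, each holding a JSON file of that job's parameters; the file name `signac_statepoint.json` reflects our understanding of the signac layout and should be checked against the version in use. The helper names (`read_statepoints`, `find_jobs`, `build_demo_workspace`) and the job ids and parameters in the demo are hypothetical and not part of signac.

```python
import json
import tempfile
from pathlib import Path

# Assumed file name for the per-job parameter file; verify against your signac version.
STATEPOINT_FILE = "signac_statepoint.json"


def read_statepoints(workspace):
    """Map each job directory name to the parameters stored in its JSON file."""
    statepoints = {}
    for job_dir in sorted(Path(workspace).iterdir()):
        sp_file = job_dir / STATEPOINT_FILE
        if sp_file.is_file():
            with sp_file.open() as f:
                statepoints[job_dir.name] = json.load(f)
    return statepoints


def find_jobs(statepoints, **criteria):
    """Return the job ids whose parameters match every key/value in criteria."""
    return [
        job_id
        for job_id, sp in statepoints.items()
        if all(sp.get(key) == value for key, value in criteria.items())
    ]


def build_demo_workspace(root):
    """Write a tiny stand-in workspace (hypothetical job ids and parameters)."""
    demo = {
        "job_a": {"temperature": 1.0, "pressure": 1.0},
        "job_b": {"temperature": 2.0, "pressure": 0.5},
        "job_c": {"temperature": 3.0, "pressure": 1.0},
    }
    for job_id, params in demo.items():
        job_dir = Path(root) / job_id
        job_dir.mkdir(parents=True)
        (job_dir / STATEPOINT_FILE).write_text(json.dumps(params))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_workspace(tmp)
        statepoints = read_statepoints(tmp)
        print(find_jobs(statepoints, pressure=1.0))
```

Because the sketch touches nothing but directories and JSON files, the same approach carries over to any language with a JSON parser, which is the portability the abstract points to.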
Ramasubramani, V., Adorf, C., Dodd, P., Dice, B., & Glotzer, S. (2018). signac: A Python framework for data and workflow management. In Proceedings of the 17th Python in Science Conference (pp. 152–159). SciPy. https://doi.org/10.25080/majora-4af1f417-016