Programming Abstractions for Managing Workflows on Tiered Storage Systems

Devarshi Ghoshal; Lavanya Ramakrishnan

Journal Article, Open Access

ACM Transactions on Storage (2021) 17(4) 1-21

DOI: 10.1145/3457119

Abstract

Scientific workflows in High Performance Computing (HPC) environments process large amounts of data. The storage hierarchy on HPC systems is getting deeper, driven by new technologies (NVRAMs, SSDs, etc.). There is a need for new programming abstractions that allow users to seamlessly manage data at the workflow level on multi-tiered storage systems and that provide optimal workflow performance and use of storage resources. In previous work, we introduced a software architecture, Managing Data on Tiered Storage for Scientific Workflows (MaDaTS), that used a Virtual Data Space (VDS) abstraction to hide the complexities of the underlying storage system while allowing users to control data management strategies. In this article, we detail the data-centric programming abstractions that allow users to manage a workflow around its data on the storage layer. The programming abstractions simplify data management for scientific workflows on multi-tiered storage systems without affecting workflow performance or storage capacity. We measure the overheads introduced by the programming abstractions of MaDaTS and evaluate their effectiveness. Our results show that these abstractions can optimally use the storage capacity in lower-capacity storage tiers and simplify data management without adding any performance overheads.

Cite

Ghoshal, D., & Ramakrishnan, L. (2021). Programming Abstractions for Managing Workflows on Tiered Storage Systems. ACM Transactions on Storage, 17(4), 1–21. https://doi.org/10.1145/3457119
