Combining HPC and big data infrastructures in large-scale post-processing of simulation data: A case study

Abstract

Advances in scientific software and computing infrastructure have enabled researchers across disciplines to simulate and model highly complex systems. At the same time, increases in simulation duration and scale have led to significant growth in the size of output data, which can reach hundreds of gigabytes or more. While solutions exist to assist with most standard post-simulation analytics, researchers must develop their own code to support customized analytical tasks. Given the nature of these output data, most naive in-house sequential codes end up being inefficient and, in most cases, time-consuming. In this paper, we propose a solution to this issue by transparently combining the strengths of a high-performance computing cluster and a big data infrastructure to support an end-to-end scientific workflow. More specifically, we present a case study on the design of a research computing environment at Clemson University in which these two computing systems are integrated and accessible from one another. This environment allows simulation data to be transferred automatically across systems and complex analytical tasks on these data to be developed using the Hadoop/Spark frameworks. Results show that a hybrid workflow for molecular dynamics simulation can provide significant performance improvements over a traditional workflow. Furthermore, the code complexity of the Hadoop/Spark solutions is shown to be lower than that of a traditional solution.

Cite

Li, Y., Zhang, X., Srinath, A., Getman, R. B., & Ngo, L. B. (2018). Combining HPC and big data infrastructures in large-scale post-processing of simulation data: A case study. In ACM International Conference Proceeding Series. Association for Computing Machinery. https://doi.org/10.1145/3219104.3229279
