The computational complexity and the constantly increasing amount of input data for scientific computing models is threatening their scalability. In addition, this is leading towards more data-intensive scientific computing, thus rising the need to combine techniques and infrastructures from the HPC and big data worlds. This paper presents a methodological approach to cloudify generalist iterative scientific workflows, with a focus on improving data locality and preserving performance. To evaluate this methodology, it was applied to an hydrological simulator, EnKF-HGS. The design was implemented using Apache Spark, and assessed in a local cluster and in Amazon Elastic Compute Cloud (EC2) against the original version to evaluate performance and scalability.
CITATION STYLE
Caíno-Lores, S., Lapin, A., Kropf, P., & Carretero, J. (2016). Methodological approach to data-centric cloudification of scientific iterative workflows. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10048 LNCS, pp. 469–482). Springer Verlag. https://doi.org/10.1007/978-3-319-49583-5_36
Mendeley helps you to discover research relevant for your work.