Exploiting the parallel execution of homology workflow alternatives in HPC compute clouds

1Citations
Citations of this article
6Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Homology modeling (HM) plays an important role in drug discovery. HM analysis aims at predicting a 3D model from a biological sequence in order to discover new drugs. There are several problems in executing an HM analysis in large-scale, such as multiple software to be evaluated, the management of the parallel execution, and results analysis, e.g. browsing manually all results to find which structure was derived from which program with good quality. Scientific Workflow Management System (SWfMS) with parallelism and provenance support can aid the large-scale HM executions by addressing the result analysis. However, before submitting the HM workflow for execution, it has to be specified along with its several alternatives (also called variants), as considered in this paper. Managing HM workflow variations is a complex task to be accomplished even with the help of a SWfMS. In this paper, we propose SciSamma (Structural Approach and Molecular Modeling Analyses), an abstract representation of HM workflows inspired in the concept of software product lines (SPL). SciSamma models HM workflow variants to execute with parallel processing in the cloud using SciCumulus SWfMS. We evaluated SciSamma with two common variants using 100 protease enzymes of protozoan genomes. Both variations presented scalability with performance improvements (dropping from 8 h to 27 min using 32 Amazon’s large virtual machines). While evaluating the two workflow variants, through provenance queries, they present the same quality in biological results, but the difference in execution time between them was around 40 %.

Cite

CITATION STYLE

APA

Ocaña, K. A. C. S., de Oliveira, D., Silva, V., Benza, S., & Mattoso, M. (2015). Exploiting the parallel execution of homology workflow alternatives in HPC compute clouds. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8954, pp. 336–350). Springer Verlag. https://doi.org/10.1007/978-3-319-22885-3_29

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free