Using Machine Learning Ensemble Methods to Predict Execution Time of e-Science Workflows in Heterogeneous Distributed Systems

Abstract

Effective planning and optimized execution of e-Science workflows in distributed systems, such as the Grid, require predictions of workflow execution times. However, predicting these execution times in heterogeneous distributed systems is challenging because of the complex structure of workflows, variations caused by input problem sizes, and the heterogeneous and dynamic nature of shared resources. To this end, we propose two novel workflow execution time prediction methods based on machine learning ensemble models. In this paper, we showcase our approach for two different real Grid environments. Our approach can effectively predict the execution time of scientific workflow applications in the Grid for various problem sizes, Grid sites, and runtime environments. We characterized workflow performance in the Grid using attributes that define the structure of the workflow as well as the execution environment. Contrary to common ensembles, our ensemble systems employed three strong learners, whose strengths offset one another's weaknesses when modeling workflow execution times. The proposed methods have been thoroughly evaluated for three real-world e-Science workflow applications. The experimental results demonstrated that our proposed multi-model ensembles can significantly decrease the prediction error (by 50%, on average) compared with methods based on the radial basis function neural network, local learning, and performance templates. The proposed methods can also be applied with similar effectiveness and without any major modification to other heterogeneous distributed environments, such as the Cloud.
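
The idea of combining three different learners into one runtime predictor can be illustrated with a minimal sketch. The abstract does not name the three learners, the features, or the combination rule, so everything below is an assumption made for illustration: the two features (problem size and a relative site speed), the synthetic training data, the three simple stand-in learners, and the weighted average used to combine them. It is not the authors' implementation.

```python
import math

# Hypothetical training rows: (problem_size, relative_site_speed) -> runtime in seconds.
# Synthetic data generated from runtime = 10 * size / speed; not from the paper.
TRAIN = [((s, v), 10.0 * s / v) for s in (1, 2, 3, 4, 5, 6) for v in (1.0, 2.0)]


def dist(a, b):
    """Euclidean distance between two feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(x, k=3):
    """Stand-in learner 1: mean runtime of the k nearest training rows."""
    nearest = sorted(TRAIN, key=lambda row: dist(row[0], x))[:k]
    return sum(y for _, y in nearest) / k


def kernel_predict(x, bandwidth=1.0):
    """Stand-in learner 2: Gaussian-kernel weighted average of all runtimes."""
    weights = [math.exp(-((dist(f, x) / bandwidth) ** 2) / 2) for f, _ in TRAIN]
    return sum(w * y for w, (_, y) in zip(weights, TRAIN)) / sum(weights)


def ratio_predict(x):
    """Stand-in learner 3: least-squares fit of runtime = c * size / speed."""
    num = sum((f[0] / f[1]) * y for f, y in TRAIN)
    den = sum((f[0] / f[1]) ** 2 for f, _ in TRAIN)
    return (num / den) * x[0] / x[1]


LEARNERS = [knn_predict, kernel_predict, ratio_predict]


def ensemble_predict(x, weights=None):
    """Combine the learners with a weighted average (equal weights by default)."""
    weights = weights or [1.0 / len(LEARNERS)] * len(LEARNERS)
    return sum(w * learner(x) for w, learner in zip(weights, LEARNERS))


if __name__ == "__main__":
    query = (3, 1.0)  # hypothetical workflow: problem size 3 on a speed-1.0 site
    for learner in LEARNERS:
        print(learner.__name__, round(learner(query), 2))
    print("ensemble", round(ensemble_predict(query), 2))
```

In this sketch the neighbor-based learners are pulled toward nearby rows from the faster site, while the parametric learner recovers the generating formula exactly; averaging them shows how one member's error can be diluted by the others. In practice the weights would be chosen from held-out validation error rather than set equal.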

Cite

Nadeem, F., Alghazzawi, D., Mashat, A., Faqeeh, K., & Almalaise, A. (2019). Using Machine Learning Ensemble Methods to Predict Execution Time of e-Science Workflows in Heterogeneous Distributed Systems. IEEE Access, 7, 25138–25149. https://doi.org/10.1109/ACCESS.2019.2899985
