Scientific communities are motivated to schedule their large-scale data analysis workflows in heterogeneous cluster environments for privacy and financial reasons. In such environments, which contain considerably diverse resources, efficient resource allocation approaches are essential for achieving high performance. Accordingly, this research addresses the problem of scheduling workflows in bag-of-tasks form so as to minimize total runtime (makespan). To this end, we develop a mixed-integer linear programming (MILP) model. The proposed model contains binary decision variables that determine which tasks are assigned to which nodes. It also contains linear constraints that enforce the tasks' requirements, such as memory and scheduling policy. Comparative results show that our approach outperforms related approaches in most cases. As part of the post-optimality analysis, secondary preferences are imposed on the proposed model to obtain the most preferred optimal solution. We also analyze relaxing the makespan in the hope of significantly reducing the number of nodes consumed.
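The kind of decision the abstract describes (a binary task-to-node assignment, a memory requirement, and a makespan objective) can be illustrated on a toy instance. The following is only a minimal sketch under stated assumptions, not the paper's MILP: it replaces the solver with exhaustive enumeration, assumes tasks on a node run one after another, and treats memory as a per-task fit check. The function name `best_assignment` and all data are hypothetical.

```python
from itertools import product


def best_assignment(runtime, task_mem, node_mem):
    """Brute-force the makespan-minimal assignment of tasks to nodes.

    runtime[i][j] : runtime of task i on node j (hypothetical values)
    task_mem[i]   : memory required by task i
    node_mem[j]   : memory available on node j
    Returns (makespan, assignment), where assignment[i] is the node of task i,
    or None if no feasible assignment exists.
    """
    n_tasks = len(runtime)
    n_nodes = len(node_mem)
    best = None
    # Each tuple plays the role of the binary variables x[i][j]:
    # assign[i] == j means task i is placed on node j.
    for assign in product(range(n_nodes), repeat=n_tasks):
        # Assumed memory constraint: every task must fit on its node.
        if any(task_mem[i] > node_mem[j] for i, j in enumerate(assign)):
            continue
        # Assumed policy: tasks on the same node run sequentially,
        # so a node's finish time is the sum of its task runtimes.
        load = [0.0] * n_nodes
        for i, j in enumerate(assign):
            load[j] += runtime[i][j]
        makespan = max(load)
        if best is None or makespan < best[0]:
            best = (makespan, assign)
    return best


# Toy instance: 3 tasks, 2 heterogeneous nodes.
runtime = [[4, 2], [3, 6], [2, 1]]
task_mem = [2, 8, 1]
node_mem = [8, 4]
result = best_assignment(runtime, task_mem, node_mem)
```

Enumeration grows exponentially with the number of tasks, which is why a MILP solver is the practical choice beyond toy sizes; the sketch only makes the variables, constraint, and objective concrete.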
Mohammadi, S., PourKarimi, L., Droop, F., De Mecquenem, N., Leser, U., & Reinert, K. (2023). A mathematical programming approach for resource allocation of data analysis workflows on heterogeneous clusters. Journal of Supercomputing, 79(17), 19019–19048. https://doi.org/10.1007/s11227-023-05325-w