Best VM Selection for Big Data Applications across Multiple Frameworks by Transfer Learning

Abstract

Cloud providers today are presented with a bewildering choice of VM types for a range of contemporary data processing frameworks. However, existing performance modeling and machine learning efforts cannot pick optimal VM types for multiple frameworks simultaneously, since they struggle to balance model accuracy against model training cost. We propose Vesta, a novel transfer learning approach, to address this challenge: (1) it abstracts knowledge of VM type selection through offline benchmarking on multiple frameworks; (2) it employs a two-layer bipartite graph to represent knowledge across frameworks; (3) it minimizes training overhead by reusing the knowledge to select the best VM type for given applications. Compared with state-of-the-art efforts, our experiments on 30 applications of Hadoop, Hive and Spark show that Vesta can improve application performance by up to 51% while reducing training overhead by 85%.
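
The abstract does not spell out how the two-layer bipartite graph is built or queried, so the following is only a minimal illustrative sketch, not Vesta's actual design. It assumes one layer links applications to workload labels and a second layer links those labels to VM types, with edge weights standing in for knowledge gathered by offline benchmarking. The class name, the labels, the VM type names, and all weights are hypothetical.

```python
from collections import defaultdict


class TwoLayerGraph:
    """Hypothetical two-layer bipartite graph (an assumption, not Vesta's design).

    Layer 1: application -> workload label, weighted by similarity.
    Layer 2: workload label -> VM type, weighted by a benchmarked score.
    """

    def __init__(self):
        self.app_to_label = defaultdict(dict)
        self.label_to_vm = defaultdict(dict)

    def add_app_edge(self, app, label, weight):
        self.app_to_label[app][label] = weight

    def add_vm_edge(self, label, vm, weight):
        self.label_to_vm[label][vm] = weight

    def best_vm(self, app):
        """Score each VM type by summing weight products over all two-hop paths."""
        scores = defaultdict(float)
        for label, w_app in self.app_to_label.get(app, {}).items():
            for vm, w_vm in self.label_to_vm.get(label, {}).items():
                scores[vm] += w_app * w_vm
        if not scores:
            return None  # no stored knowledge applies to this application
        return max(scores, key=scores.get)


def build_example_graph():
    """Build a toy graph; every name and weight here is made up for illustration."""
    graph = TwoLayerGraph()
    # Knowledge that would come from offline benchmarking (assumed values).
    graph.add_vm_edge("cpu-bound", "compute-optimized", 0.9)
    graph.add_vm_edge("cpu-bound", "storage-optimized", 0.3)
    graph.add_vm_edge("io-bound", "compute-optimized", 0.2)
    graph.add_vm_edge("io-bound", "storage-optimized", 0.9)
    # A new application is attached to existing labels rather than
    # benchmarked on every VM type, which is where reuse saves training cost.
    graph.add_app_edge("wordcount", "cpu-bound", 0.8)
    graph.add_app_edge("wordcount", "io-bound", 0.2)
    return graph


if __name__ == "__main__":
    print(build_example_graph().best_vm("wordcount"))
```

In this toy setup the stored label-to-VM edges are reused for every new application; only the application-to-label edges need to be estimated, which mirrors the reuse idea in step (3) at a very coarse level.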

Cite

Wu, Y., Wu, H., Xu, Y., Hu, Y., Zhang, W., Zhong, H., & Huang, T. (2021). Best VM Selection for Big Data Applications across Multiple Frameworks by Transfer Learning. In ACM International Conference Proceeding Series. Association for Computing Machinery. https://doi.org/10.1145/3472456.3472488
