Abstract
Cloud providers are presented with a bewildering choice of VM types for a range of contemporary data processing frameworks. However, existing performance modeling and machine learning efforts cannot pick optimal VM types for multiple frameworks simultaneously, because they struggle to balance model accuracy against model training cost. We propose Vesta, a novel transfer learning approach, to address this challenge: (1) it abstracts knowledge of VM type selection through offline benchmarking on multiple frameworks; (2) it employs a two-layer bipartite graph to represent knowledge across frameworks; (3) it minimizes training overhead by reusing the knowledge to select the best VM type for given applications. Compared with state-of-the-art efforts, our experiments on 30 applications of Hadoop, Hive and Spark show that Vesta can improve application performance by up to 51% while reducing training overhead by 85%.
Cite
Wu, Y., Wu, H., Xu, Y., Hu, Y., Zhang, W., Zhong, H., & Huang, T. (2021). Best VM Selection for Big Data Applications across Multiple Frameworks by Transfer Learning. In ACM International Conference Proceeding Series. Association for Computing Machinery. https://doi.org/10.1145/3472456.3472488