Best VM Selection for Big Data Applications across Multiple Frameworks by Transfer Learning

Yuewen Wu; Heng Wu; Yuanjia Xu; Yi Hu; Wenbo Zhang; Hua Zhong; Tao Huang

Conference Proceedings, Open Access

ACM International Conference Proceeding Series (2021)

DOI: 10.1145/3472456.3472488

Abstract

Cloud providers are presented with a bewildering choice of VM types for a range of contemporary data processing frameworks. However, existing performance modeling and machine learning efforts cannot pick optimal VM types for multiple frameworks simultaneously, because they struggle to balance model accuracy against model training cost. We propose Vesta, a novel transfer learning approach, to address this challenge: (1) it abstracts knowledge of VM type selection through offline benchmarking on multiple frameworks; (2) it employs a two-layer bipartite graph to represent knowledge across frameworks; (3) it minimizes training overhead by reusing the knowledge to select the best VM type for given applications. Our experiments on 30 applications across Hadoop, Hive and Spark show that, compared with state-of-the-art efforts, Vesta can improve application performance by up to 51% while reducing training overhead by 85%.

Cite

Wu, Y., Wu, H., Xu, Y., Hu, Y., Zhang, W., Zhong, H., & Huang, T. (2021). Best VM Selection for Big Data Applications across Multiple Frameworks by Transfer Learning. In ACM International Conference Proceeding Series. Association for Computing Machinery. https://doi.org/10.1145/3472456.3472488
