Sample-based scheduler design has emerged as a research topic because of its high scalability and simple scheduling process in today's big data clusters. One major limitation of this design is its lack of global cluster knowledge, which leads to sub-optimal scheduling decisions. Some cutting-edge schedulers address this issue by deploying an extra centralized component (ECC) in the cluster to capture the real-time cluster state and inform all schedulers. However, such a solution is costly and scales poorly. As an alternative, we introduce the Collaborated-Cluster State (CCS) technique in this paper. CCS is a low-cost solution that barely harms the scalability of the sample-based design while achieving a performance gain similar to that of ECC. Experiments on Google and Yahoo production traces show that, in most scenarios, CCS keeps up with ECC's performance while reducing communications by 87.7% (Google trace) and 73.9% (Yahoo trace).
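To make the limitation above concrete, the following minimal sketch contrasts a generic sample-based placement (probe a few random workers, pick the least loaded) with a placement that sees the whole cluster. It is not the CCS or ECC mechanism from the paper, which the abstract does not detail. The function names, the queue-length load model, and the parameter `d` (number of probed workers) are all assumptions made for illustration.

```python
import random


def sample_based_pick(queue_lengths, d, rng):
    """Probe d randomly chosen workers and return the least loaded of them.

    Hypothetical helper: the scheduler only sees the d probed queues.
    """
    probed = rng.sample(range(len(queue_lengths)), d)
    return min(probed, key=lambda w: queue_lengths[w])


def global_pick(queue_lengths):
    """Return the least loaded worker, assuming full cluster knowledge."""
    return min(range(len(queue_lengths)), key=lambda w: queue_lengths[w])


def simulate(num_workers, num_tasks, d, seed=0):
    """Place num_tasks one by one under both policies.

    Returns (longest queue with sampling, longest queue with global state).
    Tasks never finish in this toy model, so queues only grow.
    """
    rng = random.Random(seed)
    sampled = [0] * num_workers
    informed = [0] * num_workers
    for _ in range(num_tasks):
        sampled[sample_based_pick(sampled, d, rng)] += 1
        informed[global_pick(informed)] += 1
    return max(sampled), max(informed)


if __name__ == "__main__":
    worst_sampled, worst_informed = simulate(num_workers=100, num_tasks=1000, d=2)
    print("longest queue, sample-based:", worst_sampled)
    print("longest queue, global state:", worst_informed)
```

In this toy model the sampled policy can never produce a shorter longest queue than the globally informed one. Any approach that shares cluster state among schedulers aims to narrow that gap.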
CITATION STYLE
Hao, C., Chen, C., Shen, J., Li, M., & Boehm, B. (2017). Enhancing sample-based scheduler with collaborate-state in big data cluster. In Proceedings of the International Conference on Software Engineering and Knowledge Engineering, SEKE (pp. 477–480). Knowledge Systems Institute Graduate School. https://doi.org/10.18293/SEKE2017-136