Resource Managers like Apache YARN have emerged as a critical layer in the cloud computing system stack, but the developer abstractions for leasing cluster resources and instantiating application logic are very low-level. This flexibility comes at a high cost in terms of developer effort, as each application must repeatedly tackle the same challenges (e.g., fault-tolerance, task scheduling and coordination) and re-implement common mechanisms (e.g., caching, bulk-data transfers). This paper presents REEF, a development framework that provides a control-plane for scheduling and coordinating task-level (data-plane) work on cluster resources obtained from a Resource Manager. REEF provides mechanisms that facilitate resource re-use for data caching, and state management abstractions that greatly ease the development of elastic data processing workflows on cloud platforms that support a Resource Manager service. REEF is being used to develop several commercial offerings such as the Azure Stream Analytics service. Furthermore, we demonstrate REEF development of a distributed shell application, a machine learning algorithm, and a port of the CORFU [4] system. REEF is also currently an Apache Incubator project that has attracted contributors from several institutions.
CITATION STYLE
Weimer, M., Chen, Y., Chun, B. G., Condie, T., Curino, C., Douglas, C., … Wang, J. (2015). REEF: Retainable evaluator execution framework. In Proceedings of the ACM SIGMOD International Conference on Management of Data (Vol. 2015-May, pp. 1343–1355). Association for Computing Machinery. https://doi.org/10.1145/2723372.2742793