In 2004, Google introduced the MapReduce framework, a simple yet powerful programming model that enables the easy development of scalable parallel applications for processing vast amounts of data on large clusters of commodity machines (Dean and Ghemawat, OSDI, 2004, [20]). In particular, the implementation described in the original paper is designed mainly to achieve high performance on large clusters of commodity PCs. One of the main advantages of this approach is that it isolates the application from the details of running a distributed program, such as data distribution, scheduling, and fault tolerance. In this model, the computation takes a set of key-value pairs as input and produces a set of key-value pairs as output.
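The key-value programming model described above can be sketched as a minimal single-process simulation. This is a hypothetical illustration, not Google's implementation: the names `map_reduce`, `map_words`, and `reduce_counts` are assumptions introduced here, and the real framework distributes the map, shuffle, and reduce phases across a cluster.

```python
from collections import defaultdict

def map_reduce(records, map_fn, reduce_fn):
    """Hypothetical single-process sketch of the MapReduce model:
    map_fn turns each input record into (key, value) pairs, the
    framework groups values by key (the shuffle phase), and
    reduce_fn folds each group into a single output value."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)  # group intermediate values by key
    return {key: reduce_fn(key, values) for key, values in groups.items()}

# Word count, the canonical example from the original paper:
def map_words(line):
    for word in line.split():
        yield word, 1  # emit one (word, 1) pair per occurrence

def reduce_counts(word, counts):
    return sum(counts)  # total occurrences of each word

counts = map_reduce(["the quick fox", "the fox"], map_words, reduce_counts)
# counts == {"the": 2, "quick": 1, "fox": 2}
```

Because the application supplies only the two pure functions, the framework is free to partition the input, schedule the map and reduce tasks, and re-execute failed tasks without the application's involvement.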
Sakr, S. (2016). General-Purpose Big Data Processing Systems. In SpringerBriefs in Computer Science (Vol. 0, pp. 15–39). Springer. https://doi.org/10.1007/978-3-319-38776-5_2