Bringing Elastic MapReduce to Scientific Clouds

  • Riteau P
  • Keahey K
  • Morin C

Abstract

The MapReduce programming model, proposed by Google, offers a simple and efficient way to perform distributed computation over large data sets. The Apache Hadoop framework is a free and open-source implementation of MapReduce. To simplify the use of Hadoop, Amazon Web Services provides Elastic MapReduce, a web service that lets users submit MapReduce jobs. Elastic MapReduce handles resource provisioning, Hadoop configuration and performance tuning, data staging, and fault tolerance, among other tasks. This service drastically lowers the barrier to entry for performing MapReduce computations in the cloud. However, Elastic MapReduce is limited to Amazon EC2 resources and incurs an additional fee. In this paper, we present our work towards an implementation of Elastic MapReduce that can use resources from clouds other than Amazon EC2, such as scientific clouds. This work will also serve as a foundation for more advanced experiments, such as performing MapReduce computations over multiple distributed clouds.
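
For readers unfamiliar with the programming model the abstract refers to, the following is a minimal single-process sketch of MapReduce applied to word counting. It is an illustration only: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and come neither from the paper nor from the Hadoop or Elastic MapReduce APIs, and a real framework would distribute the map and reduce calls across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence, in a single process."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))
```

Because each map call depends on one document only and each reduce call on one key only, both steps can in principle run in parallel on separate nodes, which is the property that frameworks such as Hadoop exploit.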

Author-supplied keywords

  • Cloud Computing
  • Elastic MapReduce
  • Hadoop
  • MapReduce

Authors

  • Pierre Riteau
  • Kate Keahey
  • Christine Morin
