Tuning Your MapReduce Jobs

  • Venner, J.

Abstract

Once you have developed your MapReduce job, you need to be able to run it at scale on your cluster. A number of factors influence how your job scales. This chapter will cover how to recognize that your job is having a problem and how to tune the scaling parameters so that your job performs optimally. First, we’ll look at tunable items. The framework provides several parameters that let you tune how your job will run on the cluster. Most of these take effect at the job level, but a few work at the cluster level. With large clusters of machines, it becomes important to have a simple monitoring framework that provides a visual indication of how the cluster is and has been performing. Having alerts delivered when a problem is developing or occurs is also essential. This chapter introduces several tools for monitoring Hadoop services. Finally, you’ll get some tips on what to do when your job isn’t performing as it should. Your jobs may be failing or running slowly. This chapter is focused on tuning jobs running on the cluster, rather than debugging the jobs themselves. Debugging is covered in the next chapter.
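The abstract refers to job-level and cluster-level tunables without naming them. As an illustrative sketch only (these are real parameter names from the pre-0.21 Hadoop releases that Pro Hadoop covers; the values are arbitrary examples, and current Hadoop versions use different names), a configuration fragment mixing the two levels might look like:

```xml
<!-- Illustrative fragment in the mapred-site.xml format.
     Parameter names are from the Hadoop 0.19/0.20 era; values are examples. -->
<configuration>
  <!-- Job-level: number of reduce tasks for a job -->
  <property>
    <name>mapred.reduce.tasks</name>
    <value>16</value>
  </property>
  <!-- Job-level: map-side sort buffer size, in megabytes -->
  <property>
    <name>io.sort.mb</name>
    <value>200</value>
  </property>
  <!-- Cluster-level: maximum concurrent map tasks per TaskTracker -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>
```

Job-level parameters like these could also be overridden per run from the command line, e.g. `hadoop jar myjob.jar MyJob -D mapred.reduce.tasks=16 ...` when the job uses `ToolRunner`; cluster-level parameters such as the per-TaskTracker task maximums take effect only in the daemons' configuration.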

Cite (APA)
Venner, J. (2009). Tuning Your MapReduce Jobs. In Pro Hadoop (pp. 177–206). Apress. https://doi.org/10.1007/978-1-4302-1943-9_6
