hadoop-hdfs-user mailing list archives

From "Cogan, Peter (Peter)" <Peter.Co...@alcatel-lucent.com>
Subject Re: Set the number of maps
Date Thu, 01 Nov 2012 21:30:14 GMT

Thanks for your answers!

From: Marcos Ortiz <mlortiz@uci.cu>
Date: Thu, 1 Nov 2012 19:03:23 +0100
To: peter cogan <peter.cogan@alcatel-lucent.com>
Cc: user@hadoop.apache.org
Subject: Re: Set the number of maps

Since 0.21 the option has been renamed to mapreduce.tasktracker.map.tasks.maximum and,
as Harsh told you, it is a TaskTracker service-level option.

Another thing: this option is closely tied to mapreduce.child.java.opts, so make sure
to constantly monitor the effect of these changes on your cluster.

On 11/01/2012 11:55 AM, Harsh J wrote:

It can't be set from the code this way: the slot property is applied
at the TaskTracker service level (as the name suggests).

Since you're just testing at the moment, try to set these values,
restart TTs, and run your jobs again. You do not need to restart JT at
any point for tweaking these values.
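
To illustrate why a job-side conf.set(...) has no effect here, below is a toy model in plain Java. It is only a sketch: SlotConfigSketch and ToyTaskTracker are hypothetical names, not Hadoop classes, and java.util.Properties stands in for both the daemon's mapred-site.xml and the job configuration. The fallback of 2 slots is assumed to mirror the stock default for this property.

```java
import java.util.Properties;

// Toy model only: these classes are hypothetical and are NOT Hadoop APIs.
public class SlotConfigSketch {

    static final String MAP_SLOTS_KEY = "mapred.tasktracker.map.tasks.maximum";

    /** Stands in for a TaskTracker daemon that fixes its slot count at startup. */
    static final class ToyTaskTracker {
        final int mapSlots;

        ToyTaskTracker(Properties daemonConf) {
            // Read once, from the daemon's own configuration (assumed default: 2).
            this.mapSlots = Integer.parseInt(daemonConf.getProperty(MAP_SLOTS_KEY, "2"));
        }

        /** The job configuration is accepted but never consulted for slot counts. */
        int concurrentMapsFor(Properties jobConf) {
            return mapSlots;
        }
    }

    public static void main(String[] args) {
        Properties daemonConf = new Properties();
        daemonConf.setProperty(MAP_SLOTS_KEY, "8");
        ToyTaskTracker tracker = new ToyTaskTracker(daemonConf);

        // What setting the slot property from the job's main program amounts to.
        Properties jobConf = new Properties();
        jobConf.setProperty(MAP_SLOTS_KEY, "4");
        System.out.println(tracker.concurrentMapsFor(jobConf)); // prints 8

        // Editing the daemon config and "restarting" (building a new tracker) takes effect.
        daemonConf.setProperty(MAP_SLOTS_KEY, "4");
        ToyTaskTracker restarted = new ToyTaskTracker(daemonConf);
        System.out.println(restarted.concurrentMapsFor(jobConf)); // prints 4
    }
}
```

In this model the only way to change the slot count is to change the daemon-side value and construct a new tracker, which corresponds to editing the TT configuration and restarting the TTs.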

On Thu, Nov 1, 2012 at 7:13 PM, Cogan, Peter (Peter)
<Peter.Cogan@alcatel-lucent.com> wrote:


I understand that the maximum number of concurrent map tasks is set by
mapred.tasktracker.map.tasks.maximum - however I wish to run with a smaller
number of maps (I am testing disk IO). I thought that I could set that within the main program with

conf.set("mapred.tasktracker.map.tasks.maximum", "4");

to run with 4 maps – but that seems to have no impact. I know I could just
change the mapred-site.xml and restart map reduce but that's kind of a pain.
Can it be set from within the code?




Marcos Luis Ortíz Valmaseda


