hadoop-common-user mailing list archives

From Robert Evans <ev...@yahoo-inc.com>
Subject Re: mapred.map.tasks getting set, but not sure where
Date Fri, 04 Nov 2011 15:04:24 GMT
Which version of Hadoop were you running previously, and which version are you running
now?

--Bobby Evans

On 11/4/11 9:33 AM, "Brendan W." <bw8408@gmail.com> wrote:


On my cluster of 20 machines, I used to run jobs (via "hadoop jar ...")
that would spawn around 4000 map tasks.  Now when I run the same jobs, that
number is 20, and I notice that in the job configuration the parameter
mapred.map.tasks is set to 20, whereas it never used to be present in the
configuration at all.

Changing the input split size in the job doesn't affect this--I get the
split size I ask for, but the *number* of input splits is still capped at
20--i.e., the job isn't reading all of my data.

The mystery to me is where this parameter could be getting set.  It is not
present in the mapred-site.xml file in <hadoop home>/conf on any machine in
the cluster, and it is not being set in the job (I'm running out of the
same jar I always did; no updates).

Is there *anywhere* else this parameter could possibly be getting set?
I've stopped and restarted map-reduce on the cluster with no effect...it's
getting re-read in from somewhere, but I can't figure out where.

Thanks a lot,
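
One way to narrow down where the value comes from is to search every XML file under a directory tree for a <property> entry that names the parameter. The sketch below does that, assuming the usual Hadoop layout of <configuration>/<property>/<name>/<value> elements. The function name find_property and the idea of scanning a whole tree are illustrative assumptions, not part of Hadoop. It inspects only files on local disk, so it would have to be run on the machine that submits the job as well as on the cluster nodes, and it does not see configuration files packed inside jars or values set programmatically.

```python
import sys
import xml.etree.ElementTree as ET
from pathlib import Path


def find_property(root_dir, prop_name):
    """Return (path, value) pairs for each *.xml file under root_dir
    that contains a <property> whose <name> equals prop_name."""
    hits = []
    for path in sorted(Path(root_dir).rglob("*.xml")):
        try:
            tree = ET.parse(str(path))
        except (ET.ParseError, OSError):
            continue  # skip files that are unreadable or not well-formed XML
        for prop in tree.getroot().iter("property"):
            name = (prop.findtext("name") or "").strip()
            if name == prop_name:
                value = (prop.findtext("value") or "").strip()
                hits.append((str(path), value))
    return hits


if __name__ == "__main__":
    # Usage (hypothetical script name): python find_prop.py <dir> [property]
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    name = sys.argv[2] if len(sys.argv) > 2 else "mapred.map.tasks"
    for path, value in find_property(root, name):
        print(f"{path}: {name} = {value}")
```

Pointing it at the Hadoop home directory and at any other directory on the job's classpath would list each file that defines mapred.map.tasks, which may reveal a stray configuration file outside <hadoop home>/conf.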

