hadoop-common-user mailing list archives

From praveenesh kumar <praveen...@gmail.com>
Subject MR job launching is slower
Date Tue, 20 Mar 2012 10:40:23 GMT
I have a 10-node cluster (around 24 CPUs, 48 GB RAM, 1 TB HDD, 10 Gb
Ethernet connection).
After triggering any MR job, it takes about 3-5 seconds to launch (I mean
the time until I can see any MR job completion % on the screen).
I know that internally it is trying to launch the job, initialize mappers,
load data, etc.
What I want to know: is this default/desired/expected Hadoop behavior, or
are there ways I can decrease this startup time?

Also, I feel my Hadoop jobs should run faster, but I am still not able
to make them as fast as I think they should be.
I have done some tuning as well. The following are the parameters I have
been playing around with these days, but I still feel there is something
missing that I could use:

dfs.block.size

mapred.compress.map.output

mapred.map/reduce.tasks.speculative.execution

mapred.tasktracker.map/reduce.tasks.maximum

mapred.child.java.opts

io.sort.mb

io.sort.factor

mapred.reduce.parallel.copies

mapred.job.reuse.jvm.num.tasks
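
For reference, here is a minimal sketch of how the per-job settings from the list above might be passed on the command line as -D generic options. It assumes the job's driver goes through ToolRunner/GenericOptionsParser so that -D options are picked up. The class name TuningFlags, the jar name myjob.jar, the driver name MyDriver, and every value shown are hypothetical placeholders for illustration, not recommendations. The tasktracker slot maximums are daemon-side settings, so they are left out here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: formats Hadoop property overrides as -D generic options.
class TuningFlags {

    // Turns property name/value pairs into a "-Dname=value ..." string,
    // keeping the insertion order of the map.
    static String toGenericOptions(Map<String, String> props) {
        StringJoiner joiner = new StringJoiner(" ");
        for (Map.Entry<String, String> e : props.entrySet()) {
            joiner.add("-D" + e.getKey() + "=" + e.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // All values below are illustrative assumptions, not tuned settings.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("mapred.compress.map.output", "true");
        props.put("mapred.map.tasks.speculative.execution", "false");
        props.put("mapred.reduce.tasks.speculative.execution", "false");
        props.put("mapred.child.java.opts", "-Xmx1024m");
        props.put("io.sort.mb", "200");
        props.put("io.sort.factor", "50");
        props.put("mapred.reduce.parallel.copies", "10");
        // -1 lets a JVM be reused for an unlimited number of tasks of the same job.
        props.put("mapred.job.reuse.jvm.num.tasks", "-1");

        // myjob.jar and MyDriver are placeholders for the real jar and driver class.
        System.out.println("hadoop jar myjob.jar MyDriver "
                + toGenericOptions(props) + " <input> <output>");
    }
}
```

Printing the command rather than hard-coding it makes it easy to change one value at a time and compare job run times between runs.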


Thanks,
Praveenesh
