hadoop-common-user mailing list archives

From bichonfrise74 <bichonfris...@gmail.com>
Subject Re: Generic Performance Tuning of MapReduce
Date Tue, 16 Nov 2010 20:19:31 GMT
Thank you for the documentation. It really helped.

I have this setup:

5 nodes (1 master, 4 slaves), each with 4 CPUs (Xeon 2.4 GHz) and 4 GB of memory.

Based on the documentation provided, it looks like I can set the
following parameters.

mapred-site.xml,

mapred.job.reuse.jvm.num.tasks = 5 (no basis, I am just increasing it)
mapreduce.jobtracker.handler.count = 32 (no basis, I am just increasing it)
mapred.tasktracker.map.tasks.maximum = 4
mapred.tasktracker.reduce.tasks.maximum = 4
mapreduce.task.io.sort.factor = 100
mapreduce.map.output.compress = true
mapred.compress.map.output = true (older name for the same setting)
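
As a rough sanity check on the slot counts above, the sketch below multiplies the number of task slots by the child JVM heap and compares the result with the memory on one node. The function names and the 1 GB reserve for the OS and Hadoop daemons are assumptions made for illustration, as is the idea that the -Xmx300m value from the original post quoted further down is still in use. Note that -Xmx only caps the Java heap, so the real footprint per task is somewhat larger.

```python
# Rough per-node memory check for the slot counts listed above.
# Assumptions (not from the thread): each busy slot runs one child JVM,
# and about 1 GB is set aside for the OS, DataNode and TaskTracker.

def slot_heap_mb(map_slots, reduce_slots, child_heap_mb):
    """Worst-case total child heap if every map and reduce slot is busy."""
    return (map_slots + reduce_slots) * child_heap_mb


def fits(map_slots, reduce_slots, child_heap_mb, node_mb, reserved_mb=1024):
    """True if the worst-case child heap fits in memory left after the reserve."""
    return slot_heap_mb(map_slots, reduce_slots, child_heap_mb) <= node_mb - reserved_mb


NODE_MB = 4 * 1024  # 4 GB per node, as described in the setup

# Settings from the original post: 40 map + 8 reduce slots, -Xmx300m
print(slot_heap_mb(40, 8, 300), fits(40, 8, 300, NODE_MB))  # 14400 False

# Settings proposed here: 4 map + 4 reduce slots, assuming -Xmx300m is kept
print(slot_heap_mb(4, 4, 300), fits(4, 4, 300, NODE_MB))    # 2400 True
```

Under these assumptions the earlier 40/8 slot configuration could ask for far more heap than a 4 GB node has, while 4/4 stays within budget.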

hdfs-site.xml,

dfs.namenode.handler.count = 64
dfs.block.size = 134217728 (128 MB; the value is in bytes)
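
For reference, the name = value pairs above have to be written as `<property>` elements inside a `<configuration>` root in each site file. The sketch below renders a subset of them that way. The helper name `render_site_xml` is hypothetical, and the block size is written out in bytes on the assumption that 128 MB is what was intended.

```python
import xml.etree.ElementTree as ET


def render_site_xml(props):
    """Render a dict of property names and values as a Hadoop *-site.xml string."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Subset of the mapred-site.xml settings listed above
mapred_site = {
    "mapred.job.reuse.jvm.num.tasks": 5,
    "mapred.tasktracker.map.tasks.maximum": 4,
    "mapred.tasktracker.reduce.tasks.maximum": 4,
    "mapreduce.map.output.compress": "true",
}

# hdfs-site.xml settings; dfs.block.size is expressed in bytes,
# so 128 MB is assumed here and written out in full
hdfs_site = {
    "dfs.namenode.handler.count": 64,
    "dfs.block.size": 128 * 1024 * 1024,
}

print(render_site_xml(mapred_site))
print(render_site_xml(hdfs_site))
```

The output is a single unindented line per file, which Hadoop's XML parser accepts; pretty-printing is left out to keep the sketch short.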

Any comments on the above parameters? How do you know whether they are
actually improving the MapReduce job? Will it be enough to judge it by the
elapsed time that the job takes to finish?
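
Elapsed time is a reasonable first measure, but a single run can be noisy on a shared cluster. One possible approach, sketched below, is to run the same job on the same input several times before and after a change and compare the medians. The function name and all timing values are hypothetical and invented purely for illustration; they are not measurements from this cluster.

```python
from statistics import median


def compare_runs(baseline_secs, tuned_secs):
    """Return (baseline median, tuned median, relative change) for two sets of runs."""
    base = median(baseline_secs)
    tuned = median(tuned_secs)
    return base, tuned, (tuned - base) / base


# Hypothetical elapsed times in seconds (made-up numbers, not real results)
baseline = [612, 598, 640]
tuned = [540, 555, 531]

base, new, change = compare_runs(baseline, tuned)
print(f"baseline {base}s, tuned {new}s, change {change:+.1%}")
```

Changing one parameter at a time between sets of runs makes it easier to attribute a difference to a specific setting.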

Thanks.



On Tue, Nov 16, 2010 at 9:13 AM, Sanjay Sharma
<sanjay.sharma@impetus.co.in> wrote:

> You could look at one of the older papers here:
> http://code.google.com/p/hadoop-toolkit/downloads/detail?name=White%20paper-HadoopPerformanceTuning.pdf&can=2&q=
>
>
> Regards,
> Sanjay Sharma
>
>
> -----Original Message-----
> From: bichonfrise74 [mailto:bichonfrise74@gmail.com]
> Sent: Tuesday, November 16, 2010 1:07 AM
> To: common-user@hadoop.apache.org
> Subject: Generic Performance Tuning of MapReduce
>
> I have been looking into configuration parameters that can improve the
> performance of MapReduce.
>
> Basically, I'm looking at the mapred-site.xml and so far I have set the
> following values:
>
> mapred.tasktracker.map.tasks.maximum = 40
> mapred.tasktracker.reduce.tasks.maximum = 8
> mapred.child.java.opts = -Xmx300m
>
> Are there any generic values that I can place inside the mapred-site.xml to
> improve the overall performance of MapReduce?
>
> Thanks.
