hadoop-common-user mailing list archives

From Esteban Gutierrez <este...@cloudera.com>
Subject Re: Cluster Tuning
Date Thu, 07 Jul 2011 20:53:07 GMT
Hi Pony,

There is a good chance that your boxes are swapping heavily, and
that is a killer for Hadoop. Have you tried setting
mapred.job.reuse.jvm.num.tasks=-1 and limiting the heap on those boxes
as much as possible?
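
As a rough sketch, assuming the classic (pre-YARN) property names, those two
settings could go in mapred-site.xml as shown below. The -Xmx128m value is
only an illustrative guess for a 600MB box, not a tested recommendation;
mapred.child.java.opts is my assumption for how the task heap would be
capped.

```xml
<!-- Sketch only: add inside the <configuration> element of mapred-site.xml. -->
<property>
  <!-- -1 lets a task JVM be reused for any number of tasks of the same job -->
  <name>mapred.job.reuse.jvm.num.tasks</name>
  <value>-1</value>
</property>
<property>
  <!-- Assumed heap cap for each child task JVM; tune to fit the box -->
  <name>mapred.child.java.opts</name>
  <value>-Xmx128m</value>
</property>
```

Keep in mind that the DataNode and TaskTracker daemons need their own heap on
the same machine, so the task heap is only part of the memory budget.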

Cheers,
Esteban.

--
Get Hadoop!  http://www.cloudera.com/downloads/



On Thu, Jul 7, 2011 at 1:29 PM, Juan P. <gordoslocos@gmail.com> wrote:

> Hi guys!
>
> I'd like some help fine-tuning my cluster. I currently have 20 identical
> boxes: single-core machines with 600MB of RAM. No chance of upgrading the
> hardware.
>
> My cluster is made up of 1 NameNode/JobTracker box and 19
> DataNode/TaskTracker boxes.
>
> All my config is default, except that I've set the following in my
> mapred-site.xml in an effort to avoid choking my boxes:
>
>   <property>
>     <name>mapred.tasktracker.map.tasks.maximum</name>
>     <value>1</value>
>   </property>
>
> I'm running a MapReduce job which reads a proxy server log file (2GB), maps
> each record to its host, and then in the reduce task accumulates the number
> of bytes received from each host.
>
> Currently it's producing about 65,000 keys.
>
> The whole job takes forever to complete, especially the reduce part. I've
> tried different tuning configs, but I can't bring it down under 20 minutes.
>
> Any ideas?
>
> Thanks for your help!
> Pony
>
