hadoop-common-user mailing list archives

From Ceriasmex <cerias...@gmail.com>
Subject Re: Cluster Tuning
Date Thu, 07 Jul 2011 22:25:12 GMT
Are you the Esteban I know?



On 07/07/2011, at 15:53, Esteban Gutierrez <esteban@cloudera.com> wrote:

> Hi Pony,
> 
> There is a good chance that your boxes are doing some heavy swapping, and
> that is a killer for Hadoop.  Have you tried running
> with mapred.job.reuse.jvm.num.tasks=-1 and limiting the heap on those
> boxes as much as possible?
> 
> Cheers,
> Esteban.
> 
> --
> Get Hadoop!  http://www.cloudera.com/downloads/
> 
> 
> 
> On Thu, Jul 7, 2011 at 1:29 PM, Juan P. <gordoslocos@gmail.com> wrote:
> 
>> Hi guys!
>> 
>> I'd like some help fine-tuning my cluster. I currently have 20 identical
>> boxes: single-core machines with 600MB of RAM. No chance of upgrading the
>> hardware.
>> 
>> My cluster is made out of 1 NameNode/JobTracker box and 19
>> DataNode/TaskTracker boxes.
>> 
>> All my config is default, except I've set the following in my
>> mapred-site.xml
>> in an effort to try and prevent choking my boxes.
>> <property>
>>   <name>mapred.tasktracker.map.tasks.maximum</name>
>>   <value>1</value>
>> </property>
>> 
>> I'm running a MapReduce job which reads a Proxy Server log file (2GB), maps
>> hosts to each record and then in the reduce task it accumulates the amount
>> of bytes received from each host.
>> 
>> Currently it's producing about 65000 keys
>> 
>> The whole job takes forever to complete, especially the reduce part. I've
>> tried different tuning configs but I can't bring it down under 20 minutes.
>> 
>> Any ideas?
>> 
>> Thanks for your help!
>> Pony
>> 
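
For reference, the two suggestions in Esteban's reply (JVM reuse and a smaller task heap) could be written as a mapred-site.xml fragment roughly like the one below. This is only a sketch: mapred.job.reuse.jvm.num.tasks and its value of -1 come from the thread, while mapred.child.java.opts is assumed here to be the property used to cap the child task heap, and the -Xmx128m figure is an illustrative guess for a 600MB box, not a tested recommendation.

```xml
<!-- mapred-site.xml fragment; a sketch, not tested on the cluster described above -->
<property>
  <name>mapred.job.reuse.jvm.num.tasks</name>
  <!-- -1 lets one JVM be reused for any number of tasks of the same job -->
  <value>-1</value>
</property>
<property>
  <!-- Assumption: this property sets the JVM options for child tasks -->
  <name>mapred.child.java.opts</name>
  <!-- Assumption: 128 MB is a placeholder; pick a value that keeps the box out of swap -->
  <value>-Xmx128m</value>
</property>
```

These properties would sit alongside the existing mapred.tasktracker.map.tasks.maximum entry, inside the file's configuration element.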
