From: Jeffrey Buell <jbuell@vmware.com>
To: hdfs-user@hadoop.apache.org
Date: Sun, 27 Feb 2011 21:01:37 -0800
Subject: RE: balancing and replication in HDFS

Todd,

Much better.  I also had to adjust the number of map tasks for the gen step.  Storage was spread across 3 nodes instead of 4, but a bit more playing with these parameters should do the trick.

Thanks for your help.

Jeff

From: Todd Lipcon [mailto:todd@cloudera.com]
Sent: Friday, February 25, 2011 8:09 PM
To: hdfs-user@hadoop.apache.org
Cc: Jeffrey Buell
Subject: Re: balancing and replication in HDFS

When you run terasort, pass -Dmapred.reduce.tasks=4 and see how that goes for you. See this old thread for info:

http://mail-archives.apache.org/mod_mbox/hadoop-common-user/200906.mbox/%3Ccbbf4b570906300617ma4505f5o2aa1b9fb87b31868@mail.gmail.com%3E

-Todd

On Fri, Feb 25, 2011 at 4:45 PM, Jeffrey Buell <jbuell@vmware.com> wrote:

Hi Todd,

Thanks for the quick response.

I added <final>true</final> to dfs.replication, but I still get just one output copy.  Can Hadoop apps override the replication level even with this parameter?
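For reference, the entry I added looks roughly like this (it lives in hdfs-site.xml in my setup, and the value 2 is the replication level the cluster is configured for):

<!-- hdfs-site.xml (assumed location): replication factor 2, marked final so
     later-loaded config resources cannot override it -->
<property>
  <name>dfs.replication</name>
  <value>2</value>
  <final>true</final>
</property>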
I tried increasing mapred.tasktracker.reduce.tasks.maximum from 1 to 4, but that didn't make any difference: output is still all on one node.  I thought that parameter controls the number of reduce tasks per node, so 1 should be sufficient; is that correct logic?  The other reduce parameters are at their defaults.  Which parameters should I be trying?

There's no way the sort can run efficiently if the gen is unbalanced.  At 5 GB, there are plenty of blocks to make the distribution even.  How does HDFS decide to spread the storage across the nodes?  Note that the sizes of the 4 chunks (1.7, ..., 3.2 GB) seem to be repeatable, but they land on different nodes each time teragen is run.

Jeff

> -----Original Message-----
> From: Todd Lipcon [mailto:todd@cloudera.com]
> Sent: Friday, February 25, 2011 3:13 PM
> To: hdfs-user@hadoop.apache.org
> Subject: Re: balancing and replication in HDFS
>
> Hi Jeff,
>
> The output of terasort has replication level 1 by default. This is so
> it goes faster with the default settings and makes for more impressive
> benchmark results :)
>
> The reason you see it all on one machine is probably that you're
> running with one reducer. Try configuring your terasort to use more
> reduce tasks and you should see the load (and space usage) even out.
>
> -Todd
>
> On Fri, Feb 25, 2011 at 2:52 PM, Jeffrey Buell <jbuell@vmware.com> wrote:
> >
> > I'm a newbie to hadoop and HDFS.  I'm seeing odd behavior in HDFS
> > that I hope somebody can clear up for me.  I'm running hadoop version
> > 0.20.1+169.127 from the cloudera distro on 4 identical nodes, each with
> > 4 cpus and 100GB disk space.  Replication is set to 2.
> >
> > I run:
> >
> > hadoop jar /usr/lib/hadoop/hadoop-*-examples.jar teragen 50000000 tera_in5
> >
> > This produces the expected 10GB of data on disk (5GB * 2 copies).
> > But the data is spread very unevenly across the nodes, ranging from
> > 1.7 to 3.2 GB on each node.  Then I sort the data:
> >
> > hadoop jar /usr/lib/hadoop/hadoop-*-examples.jar terasort tera_in5 tera_out5
> >
> > It finishes successfully, and HDFS recognizes the right amount of data:
> >
> > $ hadoop fs -du /user/hadoop/
> > Found 2 items
> > 5000023410  hdfs://namd-1/user/hadoop/tera_in5
> > 5000170993  hdfs://namd-1/user/hadoop/tera_out5
> >
> > However all the new data is on one node (apparently randomly chosen),
> > and the total disk usage is only 15GB, which means that the output data
> > is not replicated.  For nearly all the elapsed time of the sort, the
> > other 3 nodes are idle.  Some of the output data is in
> > dfs/data/current, but a lot is in one of 64 new subdirs
> > (dfs/data/current/subdir0 through subdir63).
> >
> > Why is all this happening?  Am I missing some tunables that make HDFS
> > do the right balance and replication?
> >
> > Thanks,
> >
> > Jeff
>
> --
> Todd Lipcon
> Software Engineer, Cloudera

--
Todd Lipcon
Software Engineer, Cloudera
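Taken together, the suggested runs look roughly like this (a sketch assembled from the commands and settings discussed in this thread, not a verified recipe; using mapred.map.tasks for the gen step is an assumption about which knob was adjusted):

# Sketch only, based on the thread above.
# Spread the teragen output by running more map tasks (mapred.map.tasks is assumed
# to be the "number of map tasks for the gen step" mentioned at the top).
hadoop jar /usr/lib/hadoop/hadoop-*-examples.jar teragen -Dmapred.map.tasks=4 50000000 tera_in5

# Spread the sort output by running more reduce tasks, per the -Dmapred.reduce.tasks=4 advice.
# (mapred.tasktracker.reduce.tasks.maximum only caps concurrent reduce slots per node;
# mapred.reduce.tasks sets how many reduce tasks the job actually runs.)
hadoop jar /usr/lib/hadoop/hadoop-*-examples.jar terasort -Dmapred.reduce.tasks=4 tera_in5 tera_out5

# Check how the data ended up distributed across the datanodes.
hadoop dfsadmin -report
hadoop fs -du /user/hadoop/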