hadoop-common-user mailing list archives

From Seunghwa Kang <s.k...@gatech.edu>
Subject Re: Data-local map tasks lower than Launched map tasks even with full replication
Date Fri, 17 Jul 2009 23:01:44 GMT
I forgot to mention my Hadoop version.

I am using 0.19.1.

Thanks again,

-seunghwa

On Fri, 2009-07-17 at 18:57 -0400, Seunghwa Kang wrote:
> Hello,
> 
> I am running Hadoop on a 4-node system.
> 
> Initially, I picked a replication factor of 2, and nearly 100% of map
> tasks run data-local with up to 3 nodes, but the ratio drops to 80% if
> I use all 4 nodes.
> 
> As my nodes have quite high I/O bandwidth (24 disks per node) but only
> limited network bandwidth (1 GigE), this really hampers scalability.
> 
> Just for testing purposes, I increased the replication factor to 4 and
> checked with 'hadoop fs -stat %r%n' that the input data actually has a
> replication factor of 4, but the ratio is still around 80% with 4 nodes.
> 
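
The ratio discussed in the quoted message is the "Data-local map tasks" counter divided by the "Launched map tasks" counter (both named in the subject line). A minimal sketch of that arithmetic in Python follows; the helper name `data_local_ratio` and the sample numbers are hypothetical and are not taken from the original job.

```python
def data_local_ratio(data_local_maps: int, launched_maps: int) -> float:
    """Return the fraction of launched map tasks that ran data-local.

    Both arguments are counter values read off the job summary by hand;
    this helper does not talk to Hadoop itself.
    """
    if launched_maps <= 0:
        raise ValueError("launched_maps must be positive")
    if not 0 <= data_local_maps <= launched_maps:
        raise ValueError("data_local_maps must be between 0 and launched_maps")
    return data_local_maps / launched_maps


if __name__ == "__main__":
    # Illustrative numbers only (assumed, not from the post):
    # 80 data-local maps out of 100 launched maps.
    ratio = data_local_ratio(80, 100)
    print(f"data-local ratio: {ratio:.0%}")
```

Comparing this value across runs with 3 and 4 nodes gives the same figure the post refers to as "the ratio".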

