hadoop-hdfs-dev mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Is there any Namenode block scheduler which considers the block load
Date Wed, 12 Oct 2011 04:57:39 GMT

I actually moved your mail into hdfs-dev@ earlier and also responded.
But anyway, re-posting since you perhaps did not notice it:

The transfer thread load is definitely considered when building the DN
replication pipeline. See the method
ReplicationTargetChooser#isGoodTarget(…), which is called for each
type of choice (local node, local rack, remote rack or random). One
part of its analysis is the load on the candidate DN, measured by its
count of active transfer threads (xceivers). This check is enabled by
default.
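
To make the load check concrete, here is a minimal sketch of the idea
in Java. It is not the Hadoop source: the class, method and parameter
names are hypothetical, and the "reject above twice the cluster
average" threshold is my recollection of the logic in releases of this
era, so verify it against the isGoodTarget(…) code in your version.

```java
// Simplified illustration only; all names here are hypothetical and
// do not match the real ReplicationTargetChooser signatures.
final class LoadCheckSketch {

    private LoadCheckSketch() {}

    /**
     * Decides whether a candidate DataNode passes the load check.
     *
     * @param considerLoad whether the load check is enabled at all
     * @param xceiverCount active transfer threads on the candidate DN
     * @param totalLoad    sum of transfer threads across the cluster
     * @param numNodes     number of DataNodes in the cluster
     */
    static boolean passesLoadCheck(boolean considerLoad,
                                   int xceiverCount,
                                   int totalLoad,
                                   int numNodes) {
        if (!considerLoad) {
            return true; // check disabled: load never rejects a node
        }
        // Average transfer-thread count per node; 0 for an empty cluster.
        double avgLoad = (numNodes == 0) ? 0.0 : (double) totalLoad / numNodes;
        // Assumed threshold: reject nodes busier than twice the average.
        return xceiverCount <= 2.0 * avgLoad;
    }
}
```

With 10 nodes and 40 transfer threads in total, the average is 4, so
under these assumptions a DN with 8 threads is still accepted while
one with 9 is skipped and another candidate is tried.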

On Tue, Oct 11, 2011 at 4:38 PM, AnilKumar B <akumarb2010@gmail.com> wrote:
> Scenario: If I run a huge number of jobs (all of which use the same
> resources, i.e. input files) on a mini cluster (say 10-15 nodes), the
> namenode returns the first block of the nearest data node every time.
> So in this case all the clients try to do read/write operations on
> the same block. Is there any other namenode scheduling that considers
> the block-level overhead?

Harsh J
