hadoop-common-dev mailing list archives

From "Srikanth Kakani (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-2259) Replication should be decoupled from heartbeat
Date Thu, 22 Nov 2007 01:08:43 GMT
Replication should be decoupled from heartbeat

                 Key: HADOOP-2259
                 URL: https://issues.apache.org/jira/browse/HADOOP-2259
             Project: Hadoop
          Issue Type: Bug
    Affects Versions: 0.15.0
         Environment: Hadoop 80 node cluster
            Reporter: Srikanth Kakani

I ran a simple experiment: I shot down one node in the cluster and measured the time taken
to replicate the under-replicated blocks.

~30000 blocks were under-replicated, i.e. ~400 per node. At 2 blocks per heartbeat and a
1 minute heartbeat interval, this should take about 200 minutes to replicate completely.
My findings: it took around 220 minutes, which is reasonable.
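
The estimate above can be reproduced with a short calculation. This is only a sketch of the
arithmetic in this report, not Hadoop code; the function name and parameters are made up for
illustration, and the 2 blocks per heartbeat figure is the one stated later in this report.

```python
def estimated_replication_minutes(blocks_per_node, blocks_per_heartbeat, heartbeat_seconds):
    """Minutes needed when every heartbeat hands out a fixed number of blocks."""
    heartbeats_needed = blocks_per_node / blocks_per_heartbeat
    return heartbeats_needed * heartbeat_seconds / 60.0


# Figures from the report: ~30000 under-replicated blocks on an 80 node cluster.
exact_share = 30000 / 80  # the report rounds this per-node share up to ~400
print(exact_share)

# 400 blocks per node, 2 blocks per heartbeat, 1 minute heartbeat interval.
print(estimated_replication_minutes(400, 2, 60))
```

With the rounded figure of 400 blocks per node this gives the 200 minutes quoted above, which is
close to the 220 minutes that were measured.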

Bug: Replication is coupled with heartbeat. The heartbeat interval is based on how much load a
namenode can handle. Replication should be based on how much a datanode can handle.

Given the default heartbeat interval of 20 seconds, we computed that a datanode can handle 2
replications in that interval, which is why the namenode gives each datanode 2 blocks to
replicate per heartbeat.

What we propose is to keep the ratio of 2 blocks per 20 seconds constant, so a datanode coming in
with a heartbeat interval of 1 minute should be given 6 blocks to replicate per heartbeat. In this
case, instead of taking 200 minutes, it should take 200/3, i.e. about 67 minutes (~1 hour), to
replicate the entire node.
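
A minimal sketch of the proposed scaling, assuming the per-heartbeat quota is simply derived from
the heartbeat interval. The names below are hypothetical and do not correspond to actual Hadoop
classes or configuration keys; the lower bound of 1 block for very short intervals is also an
assumption, not part of the proposal.

```python
BASE_INTERVAL_SECONDS = 20  # default heartbeat interval stated in this report
BASE_BLOCKS = 2             # blocks a datanode can replicate in that interval


def blocks_for_heartbeat(heartbeat_seconds):
    """Blocks to hand out per heartbeat so the 2 blocks / 20 seconds rate stays constant."""
    scaled = (heartbeat_seconds * BASE_BLOCKS) // BASE_INTERVAL_SECONDS
    return max(1, scaled)  # assumption: always hand out at least one block


def replication_minutes(blocks_per_node, heartbeat_seconds):
    """Minutes to replicate a node's share of blocks under the scaled quota."""
    per_heartbeat = blocks_for_heartbeat(heartbeat_seconds)
    return blocks_per_node / per_heartbeat * heartbeat_seconds / 60.0


print(blocks_for_heartbeat(60))      # quota for a 1 minute heartbeat
print(replication_minutes(400, 60))  # compare with 200 minutes at 2 blocks per heartbeat
```

Under these assumptions the time to replicate depends only on the datanode rate, so changing the
heartbeat interval to suit the namenode no longer slows replication down.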

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.