hadoop-hdfs-issues mailing list archives

From "Eli Collins (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1971) HA: Send block report from datanode to both active and standby namenodes
Date Fri, 19 Aug 2011 00:00:40 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087417#comment-13087417 ]

Eli Collins commented on HDFS-1971:

The 1st paragraph assumes parallel is faster; is that true? An alternative solution is for
a DN to report to the standby only after it has reported to the primary. This should allow the cluster
to start up more quickly. The downside, of course, is that you can't fail over to the standby
until the 2nd report has finished, but that's probably not a big deal. Another disadvantage
of this approach is that it lets the standby get out of sync, but we have to handle an unsynchronized
standby anyway (e.g. if the standby is brought up after the primary or restarted while the primary
is running).
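
A toy sketch of the ordering described above, for illustration only. None of these names
(NameNode, report_all, can_fail_over) come from HDFS; they are hypothetical, and the model
ignores report size, timing and failures. It only shows that the standby is not a valid
failover target until the deferred second round of reports has arrived.

```python
from dataclasses import dataclass, field


@dataclass
class NameNode:
    """Hypothetical stand-in for a namenode; tracks which DNs have reported."""
    name: str
    reported: set = field(default_factory=set)

    def receive_block_report(self, dn_id: str) -> None:
        self.reported.add(dn_id)


def report_all(dn_ids, namenode: NameNode) -> None:
    """Every DN sends its block report to the given namenode."""
    for dn_id in dn_ids:
        namenode.receive_block_report(dn_id)


def can_fail_over(dn_ids, standby: NameNode) -> bool:
    """In this toy model the standby is usable once it has heard from every DN."""
    return standby.reported >= set(dn_ids)


dns = ["dn1", "dn2", "dn3"]
active, standby = NameNode("active"), NameNode("standby")

report_all(dns, active)                      # round 1: primary only
ready_before = can_fail_over(dns, standby)   # standby has no reports yet

report_all(dns, standby)                     # round 2: deferred reports to standby
ready_after = can_fail_over(dns, standby)

print(ready_before, ready_after)
```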

The primary could also forward BRs to the standby, but I agree that we shouldn't pursue
this approach: the implementation would be more complex and it unnecessarily restricts the
potential parallelism (though I'm not sure it is actually slower; you could potentially transmit
much less information over the network if you report from the primary to the standby). It
also makes supporting multiple standbys more difficult.

I like solution #1. Aside from the simplicity, I think avoiding a scan of all the DN disks
is important; otherwise restarting the standby in a busy cluster will impact DN performance.
You could also easily implement the above optimization of delaying the BR to the standby.
100M blocks seems low; e.g. a cluster with 4K hosts, 12 x 3TB drives/host, and 256MB blocks
has ~580M total blocks. However, that's still < 10MB/host, so I think it's OK.
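
The block-count estimate above can be reproduced roughly as follows. This is a sketch under
stated assumptions: drives are completely full, every block is a full 256MB, and TB/MB are
read as binary units (2^40 and 2^20 bytes); with decimal units the total comes out nearer
560M. The comment does not say how many bytes a block takes in a report, so the sketch only
derives the per-block budget implied by reading "< 10MB/host" as a per-host report size.

```python
MIB = 2**20
TIB = 2**40

HOSTS = 4000
DRIVES_PER_HOST = 12
DRIVE_BYTES = 3 * TIB      # assumption: binary terabytes
BLOCK_BYTES = 256 * MIB    # assumption: every block is full-sized


def blocks_per_host(drives: int, drive_bytes: int, block_bytes: int) -> int:
    """Number of full blocks that fit on one host's drives."""
    return drives * (drive_bytes // block_bytes)


per_host = blocks_per_host(DRIVES_PER_HOST, DRIVE_BYTES, BLOCK_BYTES)
total = HOSTS * per_host

# Bytes available per block if one host's report must stay under 10MB.
budget_bytes_per_block = 10 * MIB / per_host

print(f"blocks per host: {per_host:,}")
print(f"total blocks:    {total:,}")
print(f"implied budget:  {budget_bytes_per_block:.1f} bytes/block")
```

Under these assumptions the total lands close to the ~580M quoted above, and the 10MB
bound holds as long as each block costs no more than about 70 bytes in the report.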

> HA: Send block report from datanode to both active and standby namenodes
> ------------------------------------------------------------------------
>                 Key: HDFS-1971
>                 URL: https://issues.apache.org/jira/browse/HDFS-1971
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: data-node, name-node
>            Reporter: Suresh Srinivas
>            Assignee: Sanjay Radia
>         Attachments: DualBlockReports.pdf
> To enable a hot standby namenode, the standby node must have current namenode state
> (image + edits) and current block location information. This jira addresses keeping
> the block location information current in the standby node. To do this, the proposed solution
> is to send block reports from the datanodes to both the active and the standby namenode.


