hadoop-hdfs-issues mailing list archives

From "Jitendra Nath Pandey (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1975) HA: Support for sharing the namenode state from active to standby.
Date Wed, 24 Aug 2011 23:19:29 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090611#comment-13090611 ]

Jitendra Nath Pandey commented on HDFS-1975:

> I think we need to add nextGenerationStamp calls in a few places.
 I agree.

> Have you enumerated the various coordinations that we need to consider?
 IMO, we need to consider synchronization with edit logs for any message that a datanode sends
to the standby, i.e. for every method in DatanodeProtocol. I think we need synchronization
only in those methods that refer to blocks. Here is the list of all methods and my
classification based on whether synchronization is needed.
# registerDatanode:
** I think no synchronization is needed, because there is no corresponding datanode info coming
from edit logs.
# reportBadBlocks:
** Synchronization is not needed because the blocks being reported bad must have been reported
earlier in a block report or a block received message by the datanode. Therefore if the block
is not found in the block map of the standby, it only means it's a deleted block.
# commitBlockSynchronization:
** Synchronization is needed for the same reason as in block received case.
# blockReport:
** Synchronization is needed because standby may not even have seen a block that is reported
in block report.
# blockReceived:
** Synchronization is needed because standby may not even have seen a block that is reported
in block received.
# sendHeartbeat:
** No synchronization is needed with edit logs.
# errorReport:
** Standby can just ignore this?
# versionRequest:
** Standby can just ignore this?
# processUpgradeCommand:
** Ignored by Standby.
From the list above, it seems to me that coordination is only required for block-related
info received from the datanode versus that received in edit logs. Therefore using the generation stamp
is a good choice because all blocks have a generation stamp. Is that a valid conclusion?
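
The generation stamp idea above can be sketched roughly as follows. This is only an illustration, not code from the attached patch or from HDFS: the class, enum, and method names are hypothetical, and it assumes that every block allocation is logged with a fresh, monotonically increasing generation stamp (which is what the extra nextGenerationStamp calls would be for).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not HDFS code: names and structure are illustrative only.
class StandbyGenStampCheck {
    enum Action { PROCESS, DEFER, IGNORE_DELETED }

    // blockId -> generation stamp, as rebuilt by the standby from the edit log
    private final Map<Long, Long> blockMap = new HashMap<>();
    // Highest generation stamp the standby has applied from the edit log so far
    private long lastAppliedGenStamp = 0L;

    // Called when the standby replays an "allocate block" edit
    void applyAllocate(long blockId, long genStamp) {
        blockMap.put(blockId, genStamp);
        lastAppliedGenStamp = Math.max(lastAppliedGenStamp, genStamp);
    }

    // Called when the standby replays a "delete block" edit
    void applyDelete(long blockId) {
        blockMap.remove(blockId);
    }

    // Decide what to do with a block mentioned in a datanode message
    Action classify(long blockId, long reportedGenStamp) {
        if (reportedGenStamp > lastAppliedGenStamp) {
            // The edit log has not caught up to this block yet: hold the message
            return Action.DEFER;
        }
        if (blockMap.containsKey(blockId)) {
            return Action.PROCESS;
        }
        // Edits up to this stamp were applied and the block is gone: it was deleted
        return Action.IGNORE_DELETED;
    }

    public static void main(String[] args) {
        StandbyGenStampCheck standby = new StandbyGenStampCheck();
        System.out.println(standby.classify(7L, 10L)); // allocate not replayed yet
        standby.applyAllocate(7L, 10L);
        System.out.println(standby.classify(7L, 10L)); // block now known
        standby.applyDelete(7L);
        System.out.println(standby.classify(7L, 10L)); // block deleted
    }
}
```

The point of the sketch is that the block's own generation stamp tells the standby whether a missing block is "not seen yet" or "already deleted", without relying on anything the datanode learned from the active namenode.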

Considering the txid approach, it seems it won't work. Consider the following case:
The standby receives a block received message and doesn't find the block in its map. This can
happen for two reasons:
 a) the standby hasn't seen the edit log for the allocate block.
 b) the standby has seen and processed an allocate block and also a delete for that block.
The standby needs to be able to distinguish between the above two possibilities to correctly
process the block received.
Now it may be possible that the allocate and/or delete happened after the last command from
the namenode, so the last transaction id known to the datanode is older than the allocate/delete.
Then the standby won't know how to process the received block.
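
To make the ambiguity concrete, here is a small sketch with made-up transaction ids. It is not HDFS code; the class, the helper names, and the numbers are all hypothetical. It models only what the standby can observe when the block received message arrives: whether the block is in its map, the last txid the datanode knows, and the last txid the standby has applied.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not HDFS code: shows two histories the standby cannot tell apart.
class TxidAmbiguity {
    static final long NOT_DELETED = -1L;

    // Returns what the standby can observe: [blockInMap, datanodeLastTxid, standbyAppliedTxid]
    static List<Object> observe(long allocateTxid, long deleteTxid,
                                long standbyAppliedTxid, long datanodeLastTxid) {
        boolean allocated = allocateTxid <= standbyAppliedTxid;
        boolean deleted = deleteTxid != NOT_DELETED && deleteTxid <= standbyAppliedTxid;
        boolean blockInMap = allocated && !deleted;
        return Arrays.<Object>asList(blockInMap, datanodeLastTxid, standbyAppliedTxid);
    }

    // Case a): allocate at txid 105, standby has applied only up to 102
    static List<Object> caseA() {
        return observe(105L, NOT_DELETED, 102L, 100L);
    }

    // Case b): allocate at txid 101 and delete at 102, standby has applied up to 102
    static List<Object> caseB() {
        return observe(101L, 102L, 102L, 100L);
    }

    public static void main(String[] args) {
        // In both cases the datanode last heard from the namenode at txid 100,
        // which is older than the allocate and the delete.
        System.out.println(caseA().equals(caseB())); // prints true
    }
}
```

Both cases produce the same observation, yet in case a) the standby should hold the message until the allocate arrives, while in case b) it should drop it.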

> HA: Support for sharing the namenode state from active to standby.
> ------------------------------------------------------------------
>                 Key: HDFS-1975
>                 URL: https://issues.apache.org/jira/browse/HDFS-1975
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>            Reporter: Suresh Srinivas
>            Assignee: Jitendra Nath Pandey
>         Attachments: hdfs-1975.txt, hdfs-1975.txt
> To enable hot standby namenode, the standby node must have current information for -
namenode state (image + edits) and block location information. This jira addresses keeping
the namenode state current in the standby node. To do this, the proposed solution in this
jira is to use a shared storage to store the namenode state. 
> Note one could also build an alternative solution by augmenting the backup node. A separate
jira could explore this.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
