hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN
Date Tue, 30 Aug 2011 09:19:40 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093571#comment-13093571 ]

Konstantin Shvachko commented on HDFS-1623:

The discussion in HDFS-1108 revealed that Todd, Suresh and Eli (and probably others) are building
an HA approach based on journal synchronization through shared storage (NFS filers). The
motivation is claimed to be the simplicity of the approach compared to streaming edits directly
to the StandbyNode. I think there are two main questions that need to be addressed with
respect to this:
# _Why do you introduce a dependency on enterprise hardware when you run on commodity hardware?_
*People running a 20-node Hadoop cluster will probably have to spend the same amount again
on a filer.*
# _How do you address the race condition between NN addBlock and DN blockReceived?_
Explanation: When an HDFS client needs to create a new block, it sends an addBlock() command to
the NameNode. The NN (assuming HDFS-1108 is fixed) writes the addBlock transaction to the shared
storage. The client writes data to the allocated DataNodes. Each DataNode confirms that it got
the replica by sending a blockReceived() message to both the NN and the SBN. If blockReceived()
reaches the StandbyNode before it has consumed the addBlock() transaction for this block from
shared storage, blockReceived() will be rejected, since the SBN does not yet know that the block
exists. The SBN will eventually learn about that same replica from the next block report, but
this can be one hour later.
*The SBN can then be up to one hour behind the active NN, which is not a hot standby.*
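
The race described above can be illustrated with a toy simulation. This is only a sketch of the
ordering problem, not HDFS code: SharedJournal, StandbyNode, and all of their methods are
hypothetical names invented for the example, and the "journal" is just an in-memory list
standing in for the edits on shared storage.

```python
# Toy model of the addBlock / blockReceived race. Not HDFS code:
# SharedJournal, StandbyNode, and their methods are hypothetical names.

class SharedJournal:
    """Stands in for the edits log on shared storage (e.g. an NFS filer)."""
    def __init__(self):
        self.edits = []

    def append(self, op, block_id):
        self.edits.append((op, block_id))


class StandbyNode:
    def __init__(self, journal):
        self.journal = journal
        self.next_edit = 0          # index of the first edit not yet consumed
        self.known_blocks = set()   # blocks learned from the journal
        self.replicas = {}          # block_id -> set of DataNode names

    def tail_journal(self):
        """Consume any edits written since the last call."""
        for op, block_id in self.journal.edits[self.next_edit:]:
            if op == "addBlock":
                self.known_blocks.add(block_id)
        self.next_edit = len(self.journal.edits)

    def block_received(self, datanode, block_id):
        """Return True if accepted, False if the block is still unknown."""
        if block_id not in self.known_blocks:
            return False
        self.replicas.setdefault(block_id, set()).add(datanode)
        return True

    def block_report(self, datanode, block_ids):
        """Periodic full report; recovers replicas that were rejected earlier."""
        for block_id in block_ids:
            if block_id in self.known_blocks:
                self.replicas.setdefault(block_id, set()).add(datanode)


journal = SharedJournal()
sbn = StandbyNode(journal)

journal.append("addBlock", "blk_1")                   # active NN logs the new block
accepted_early = sbn.block_received("dn1", "blk_1")   # DN reports before SBN tails
sbn.tail_journal()                                    # SBN catches up on the edits
replicas_after_tail = dict(sbn.replicas)              # the replica is still missing
sbn.block_report("dn1", ["blk_1"])                    # next block report arrives
print(accepted_early, replicas_after_tail, sbn.replicas)
```

In this sketch the SBN knows about the block as soon as it tails the journal, but it has no
replica location for it until the block report, which is the window the question is about.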

> High Availability Framework for HDFS NN
> ---------------------------------------
>                 Key: HDFS-1623
>                 URL: https://issues.apache.org/jira/browse/HDFS-1623
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Sanjay Radia
>            Assignee: Sanjay Radia
>         Attachments: HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode HA_v2_1.pdf,
Namenode HA Framework.pdf

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

