hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1972) HA: Datanode fencing mechanism
Date Wed, 14 Dec 2011 02:10:30 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168999#comment-13168999

Todd Lipcon commented on HDFS-1972:

STONITH is one possible fencing mechanism, but it requires special hardware support (e.g., a remotely
controllable PDU or ILOM-like capability on the machine). It addresses the namenode side
of fencing: how do we make sure that a previously active NN can no longer write to the shared
edits storage (i.e., ensure exclusive access for the new active)?

With many storage types there are less drastic fencing methods available, e.g., filers often
support an operation to fence off a particular IP from a given volume. Software systems like
BookKeeper might support a "lease revoke" operation of sorts (just a guess). So we shouldn't
design STONITH as the only option if other options need less custom hardware.
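
To illustrate the point about not hard-wiring STONITH, here is a minimal sketch of trying fencing methods in order, least drastic first. Everything in it is hypothetical: the `fence_old_active` function, the `Fencer` type, and the idea that each method reports success as a boolean are assumptions for illustration, not an existing Hadoop API.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical: a fencer takes the address of the old active NN and returns
# True only if it is certain that node can no longer write to shared storage.
Fencer = Callable[[str], bool]


def fence_old_active(target: str, methods: Iterable[Tuple[str, Fencer]]) -> str:
    """Try each fencing method in order; return the name of the first that succeeds."""
    for name, fencer in methods:
        try:
            if fencer(target):
                return name
        except Exception:
            # A fencer that fails or cannot be reached proves nothing,
            # so fall through to the next (more drastic) method.
            continue
    # No method could guarantee exclusive access: do not fail over.
    raise RuntimeError(f"could not fence {target}; refusing to fail over")
```

A caller would list something like a storage-level revoke first and STONITH last, so the power switch is only used when the gentler methods cannot confirm that the old NN is fenced.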

However, the above NN fencing methods don't deal with the races described here -- the issue
is that the standby necessarily has a stale view of pending deletions in the cluster. We
essentially need to "flush" all deletions from the cluster before the new NN can make appropriate
deletion decisions, because block replication decisions are not persisted to the shared
storage. The issues mentioned here are important even in the case of a manual transition from
one NN to another.
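
As a rough sketch of what "flushing" could mean in practice: the new NN treats every datanode as stale after a transition and refuses to delete a replica of a block until every datanode holding that block has confirmed that it has no deletions left over from the old NN. The `DeletionGate` class and the `mark_flushed` signal are hypothetical names for illustration, not the design attached to this issue.

```python
class DeletionGate:
    """Tracks which datanodes may still hold deletions queued by the old NN."""

    def __init__(self, datanodes):
        # Right after the transition, every datanode is assumed stale.
        self.stale = set(datanodes)

    def mark_flushed(self, datanode):
        # Hypothetical signal: this datanode has confirmed that it has no
        # deletions pending from the previously active NN.
        self.stale.discard(datanode)

    def can_delete_replica(self, replica_locations):
        # Deleting is safe only when none of the block's locations might
        # still be about to drop its replica on the old NN's orders.
        return not (self.stale & set(replica_locations))
```

Without such a gate, the old NN may have already told dn1 to drop an excess replica while the new NN, unaware of that, tells dn2 to drop another one, leaving the block with fewer replicas than either NN intended.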

> HA: Datanode fencing mechanism
> ------------------------------
>                 Key: HDFS-1972
>                 URL: https://issues.apache.org/jira/browse/HDFS-1972
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: data-node, ha, name-node
>            Reporter: Suresh Srinivas
>            Assignee: Todd Lipcon
>         Attachments: hdfs-1972-v1.txt, hdfs-1972.txt
> In high availability setup, with an active and standby namenode, there is a possibility
> of two namenodes sending commands to the datanode. The datanode must honor commands from only
> the active namenode and reject the commands from standby, to prevent corruption. This invariant
> must be complied with during fail over and other states such as split brain. This jira addresses
> issues related to this, design of the solution and implementation.
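
One way to picture the invariant in the issue description is a datanode-side filter that only executes commands from the namenode with the most recent claim to be active. The sketch below assumes each claim carries a monotonically increasing epoch number; that epoch, and the `DatanodeCommandFilter` class itself, are assumptions made for illustration and are not taken from the attached patches.

```python
class DatanodeCommandFilter:
    """Executes commands only from the namenode with the newest active claim."""

    def __init__(self):
        self.active_nn = None
        self.highest_epoch = -1

    def observe_active_claim(self, nn_id, epoch):
        # Accept a claim only if it is strictly newer than any seen so far,
        # so a previously active NN cannot reclaim the role with an old epoch.
        if epoch > self.highest_epoch:
            self.highest_epoch = epoch
            self.active_nn = nn_id
            return True
        return False

    def should_execute(self, nn_id, epoch):
        # Reject commands from the standby or from a stale active NN.
        return nn_id == self.active_nn and epoch == self.highest_epoch
```

In a split-brain scenario both namenodes may believe they are active, but the datanode only obeys the one whose claim it has seen with the highest epoch.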
