hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2179) HA: namenode fencing mechanism
Date Mon, 25 Jul 2011 17:46:10 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070631#comment-13070631 ]

Suresh Srinivas commented on HDFS-2179:

Case 1), where the active and standby are in communication and cooperating, does not require
fencing at all. Fencing is required only when the active and standby cannot communicate, so we
should drop that case from the cases to consider.

When using solutions such as LinuxHA, a local process (LRM) kills the process to be fenced.
This does not require ssh to the node. HDFS-2185 should consider this requirement. I might
start with LinuxHA to play around with this in the first phase, since I think getting a
rock-solid and correct failover controller is non-trivial.

> HA: namenode fencing mechanism
> ------------------------------
>                 Key: HDFS-2179
>                 URL: https://issues.apache.org/jira/browse/HDFS-2179
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
> In an HA cluster, when there are two NNs, the invariant that only one NN is active at
> a time has to be preserved in order to prevent "split brain syndrome." Thus, when a standby
> NN is transitioned to "active" state during a failover, it needs to somehow _fence_ the formerly
> active NN to ensure that it can no longer perform edits. This JIRA is to discuss and implement
> NN fencing.
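
The ordering described above (fence the old active first, promote the standby only afterwards) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the HDFS implementation: the names FencingSketch, Fencer, Node, and failover are hypothetical, and the two fencers in main are stubs standing in for an ssh-based method and a local-process kill of the kind mentioned in the comment.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; none of these types exist in HDFS.
class FencingSketch {
    /** Hypothetical hook: returns true only when the target is confirmed unable to write edits. */
    interface Fencer {
        boolean fence(String targetHost);
    }

    enum State { ACTIVE, STANDBY }

    /** Toy stand-in for a namenode; not the real HDFS NameNode class. */
    static final class Node {
        final String host;
        State state;
        boolean fenced = false;

        Node(String host, State state) {
            this.host = host;
            this.state = state;
        }

        /** A node can perform edits only if it believes it is active and has not been fenced. */
        boolean canWriteEdits() {
            return state == State.ACTIVE && !fenced;
        }
    }

    /**
     * Tries each fencer in order. The standby becomes active only after one
     * fencer confirms success; otherwise it stays standby, so at most one
     * node can write edits at any time.
     */
    static boolean failover(Node oldActive, Node standby, List<Fencer> fencers) {
        for (Fencer fencer : fencers) {
            if (fencer.fence(oldActive.host)) {
                oldActive.fenced = true;
                standby.state = State.ACTIVE;
                return true;
            }
        }
        return false; // fencing failed: refuse to promote rather than risk split brain
    }

    public static void main(String[] args) {
        Node nn1 = new Node("nn1", State.ACTIVE);
        Node nn2 = new Node("nn2", State.STANDBY);

        // Stub fencers (assumptions): the first simulates an unreachable host,
        // the second simulates a local agent that kills the process.
        Fencer unreachableSsh = host -> false;
        Fencer localKill = host -> true;

        boolean promoted = failover(nn1, nn2, Arrays.asList(unreachableSsh, localKill));
        System.out.println("promoted=" + promoted
                + " nn1 writable=" + nn1.canWriteEdits()
                + " nn2 writable=" + nn2.canWriteEdits());
    }
}
```

Keeping the fencing step behind a small interface is one way to let an ssh-based method and a locally run agent coexist; whether HDFS-2179 or HDFS-2185 takes that shape is not stated in this thread.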


