hadoop-hdfs-issues mailing list archives

From "Aaron T. Myers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2179) HA: namenode fencing mechanism
Date Thu, 04 Aug 2011 00:49:27 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079155#comment-13079155 ]

Aaron T. Myers commented on HDFS-2179:
--------------------------------------

Patch looks pretty good, Todd. A few comments:

# Please add some comments to the {{FenceMethod}} interface
# I think {{FenceMethod}} should be public. It's entirely possible (if not likely) that end users
will want to implement their own {{FenceMethod}}s, and they shouldn't need to put them in {{o.a.h.hdfs.server.namenode.ha}}.
# Please add some class comments to {{NodeFencer}}.
# It seems to me that {{NodeFencer.fence}} should catch any {{Exception}} thrown by the individual
methods. There's no reason not to try the remaining methods if some exception other than
{{BadFencingConfigurationException}} is thrown.
# In {{SshFenceByTcpPort.getNNPort}}, won't this return the port of the NN that initiates the
SSH connection, not necessarily the port of the NN being SSHed into? This points to what may be
a larger problem: I believe it's presently impossible to configure the addresses of multiple NNs
in a single configuration.
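
To illustrate comment 4, here is a minimal sketch of a fencing loop that keeps going when an individual method throws. It is not taken from the patch: the {{FenceMethod}} signature, the {{BadFencingConfigurationException}} constructor, and the {{NodeFencerSketch}} class are simplified, hypothetical stand-ins, and the real types in hdfs-2179.txt may look different.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the patch's FenceMethod; the real signature may differ. */
interface FenceMethod {
    /** Returns true if the target node was successfully fenced. */
    boolean tryFence(String target) throws Exception;
}

/** Hypothetical stand-in for the patch's BadFencingConfigurationException. */
class BadFencingConfigurationException extends Exception {
    BadFencingConfigurationException(String message) {
        super(message);
    }
}

/** Tries each configured fencing method in order until one succeeds. */
class NodeFencerSketch {
    private final List<FenceMethod> methods;

    NodeFencerSketch(List<FenceMethod> methods) {
        this.methods = new ArrayList<>(methods);
    }

    /** Returns true as soon as one method reports success, false if all fail. */
    boolean fence(String target) {
        for (FenceMethod method : methods) {
            try {
                if (method.tryFence(target)) {
                    return true;
                }
                // The method ran but did not fence the node; fall through to the next one.
            } catch (BadFencingConfigurationException e) {
                // Misconfigured method: report it and keep trying the others.
                System.err.println("Fencing method misconfigured: " + e.getMessage());
            } catch (Exception e) {
                // Any other failure (SSH error, timeout, ...) is also not fatal:
                // report it and keep trying the others.
                System.err.println("Fencing method failed: " + e);
            }
        }
        return false;
    }
}
```

With this shape, a method that throws an unexpected exception only costs one attempt, and {{fence}} returns false only after every configured method has been tried.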

> HA: namenode fencing mechanism
> ------------------------------
>
>                 Key: HDFS-2179
>                 URL: https://issues.apache.org/jira/browse/HDFS-2179
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-2179.txt
>
>
> In an HA cluster, when there are two NNs, the invariant that only one NN is active at
> a time has to be preserved in order to prevent "split brain syndrome." Thus, when a standby
> NN is transitioned to "active" state during a failover, it needs to somehow _fence_ the formerly
> active NN to ensure that it can no longer perform edits. This JIRA is to discuss and implement
> NN fencing.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
