hadoop-hdfs-issues mailing list archives

From "Mingjie Lai (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2185) HA: HDFS portion of ZK-based FailoverController
Date Wed, 04 Apr 2012 22:52:23 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246815#comment-13246815 ]

Mingjie Lai commented on HDFS-2185:

Hi Todd. 

You raised good points regarding deployment. 

My proposal was meant to reduce the complexity of the ZKFC. Right now a ZKFC needs to monitor
the health of an NN and watch ZooKeeper; in addition, it has to respond to RPCs from clients
and peer ZKFCs. I thought it made sense to move some of those responsibilities to the client.
However, I'm also okay with your design.

Please help to clarify: 

bq. So, the idea is that, when auto-failover is enabled, all decisions must be made by ZKFCs.
bq. (from the doc) The NameNodes enter a mode such that they only accept mutative HAServiceProtocol
RPCs from ZKFCs

I'm not quite sure how it can be guaranteed. NN cannot be aware of who issues a transition,

bq. no need to introduce "disable/enable" to auto-failover... 
I still think it makes sense for ops to have an option to turn auto-failover on or off on demand.
If the ZKFC has issues, we would still have an alternative way to bypass it. However, I'm not
sure whether it would help ops or confuse them.
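
To make the question about restricting mutative RPCs concrete, here is a minimal sketch of one possible approach: the caller declares a request source, and the NN rejects non-ZKFC transitions while auto-failover is enabled. All names (`RequestSource`, `NameNodeGuard`, `TransitionRejected`) are hypothetical and are not the HAServiceProtocol API. Note that the tag is self-declared, so by itself it only guards against accidental manual transitions; it does not prove who the caller is.

```python
from dataclasses import dataclass
from enum import Enum


class RequestSource(Enum):
    """Hypothetical tag a caller attaches to a state-change request."""
    USER = "user"
    ZKFC = "zkfc"


class TransitionRejected(Exception):
    """Raised when the guard refuses a mutative request."""


@dataclass
class NameNodeGuard:
    """Toy model of an NN that filters transitions by declared source."""
    auto_failover_enabled: bool
    state: str = "standby"

    def _check(self, source: RequestSource) -> None:
        # With auto-failover on, only requests tagged as coming from a
        # ZKFC may change state. The tag is an assumption of this sketch
        # and is not authenticated.
        if self.auto_failover_enabled and source is not RequestSource.ZKFC:
            raise TransitionRejected(
                "auto-failover is enabled; manual transitions are refused"
            )

    def transition_to_active(self, source: RequestSource) -> None:
        self._check(source)
        self.state = "active"

    def transition_to_standby(self, source: RequestSource) -> None:
        self._check(source)
        self.state = "standby"
```

With `auto_failover_enabled=False` the guard accepts requests from any source, which corresponds to the on/off switch discussed above.
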
> HA: HDFS portion of ZK-based FailoverController
> -----------------------------------------------
>                 Key: HDFS-2185
>                 URL: https://issues.apache.org/jira/browse/HDFS-2185
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: auto-failover, ha
>    Affects Versions: 0.24.0, 0.23.3
>            Reporter: Eli Collins
>            Assignee: Todd Lipcon
>             Fix For: Auto failover (HDFS-3042)
>         Attachments: Failover_Controller.jpg, hdfs-2185.txt, hdfs-2185.txt, hdfs-2185.txt,
hdfs-2185.txt, hdfs-2185.txt, zkfc-design.pdf, zkfc-design.pdf, zkfc-design.pdf, zkfc-design.pdf,
> This jira is for a ZK-based FailoverController daemon. The FailoverController is a separate
daemon from the NN that does the following:
> * Initiates leader election (via ZK) when necessary
> * Performs health monitoring (aka failure detection)
> * Performs fail-over (standby to active and active to standby transitions)
> * Heartbeats to ensure the liveness
> It should have the same/similar interface as the Linux HA RM to aid pluggability.
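
The responsibilities listed in the issue description can be illustrated with a small in-memory sketch. It is only a toy model under stated assumptions: `InMemoryLock` stands in for a ZooKeeper lock node and is not a ZK client, `FakeNameNode` and `FailoverController` are hypothetical names, and each `tick()` call stands in for one periodic health-check round. Fencing of the old active node, which a real failover needs, is left out.

```python
class InMemoryLock:
    """Stand-in for a ZooKeeper lock node (assumption, not a ZK client)."""

    def __init__(self):
        self.holder = None

    def try_acquire(self, who):
        # The first caller wins; the holder keeps the lock on later calls.
        if self.holder is None:
            self.holder = who
        return self.holder == who

    def release(self, who):
        # Only the current holder can release the lock.
        if self.holder == who:
            self.holder = None


class FakeNameNode:
    """Toy NN with a health flag and an HA state."""

    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.state = "standby"


class FailoverController:
    """One controller per NN: health check, election, state transition."""

    def __init__(self, nn, lock):
        self.nn = nn
        self.lock = lock

    def tick(self):
        # Health monitoring: an unhealthy NN gives up leadership.
        if not self.nn.healthy:
            self.lock.release(self.nn.name)
            self.nn.state = "standby"
            return
        # Leader election: whoever holds the lock becomes active.
        if self.lock.try_acquire(self.nn.name):
            self.nn.state = "active"
        else:
            self.nn.state = "standby"
```

Running two controllers against one shared lock makes the first healthy NN active; marking it unhealthy and ticking both controllers again moves the active role to the peer.
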


