hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2185) HA: HDFS portion of ZK-based FailoverController
Date Wed, 04 Apr 2012 05:15:19 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246045#comment-13246045 ]

Todd Lipcon commented on HDFS-2185:

bq. I think you are missing the failure arc when transitionToStandby is called in InElection

This failure handling isn't necessary: a failed state transition causes the NN to abort,
so any failure to transition will shortly be followed by the health monitor entering a
bad state.
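
To make the argument concrete, here is a minimal sketch of the reasoning. It is a toy model, not the actual ZKFC or NN code: the class, enum, and method names (FailoverArcSketch, ToyNameNode, ToyController, simulate) are all hypothetical, and the only behavior assumed is the one stated above, namely that a failed transition aborts the NN. The controller never inspects the result of the transition, yet it still leaves the election through the ordinary health-check arc.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: every name here is hypothetical, not the real ZKFC or NN API. */
public class FailoverArcSketch {

    enum ControllerState { IN_ELECTION, OUT_OF_ELECTION }

    /** Stand-in for a NameNode that aborts when a state transition fails. */
    static class ToyNameNode {
        private final boolean transitionsFail;
        private boolean running = true;

        ToyNameNode(boolean transitionsFail) {
            this.transitionsFail = transitionsFail;
        }

        void transitionToStandby() {
            if (transitionsFail) {
                running = false; // assumption taken from the comment: failure => abort
            }
        }

        boolean isHealthy() {
            return running;
        }
    }

    /** Stand-in for the controller; it has no arc for a failed transition. */
    static class ToyController {
        private final ToyNameNode nn;
        final List<String> events = new ArrayList<>();
        ControllerState state = ControllerState.IN_ELECTION;

        ToyController(ToyNameNode nn) {
            this.nn = nn;
        }

        void onBecomeStandby() {
            nn.transitionToStandby(); // result deliberately not inspected
            events.add("requested standby");
        }

        void onHealthCheck() {
            // The ordinary health arc: an unhealthy NN takes us out of the election.
            if (state == ControllerState.IN_ELECTION && !nn.isHealthy()) {
                state = ControllerState.OUT_OF_ELECTION;
                events.add("health bad, left election");
            }
        }
    }

    /** Runs one transition followed by one health check and returns the final state. */
    static ControllerState simulate(boolean transitionsFail) {
        ToyController zkfc = new ToyController(new ToyNameNode(transitionsFail));
        zkfc.onBecomeStandby();
        zkfc.onHealthCheck();
        return zkfc.state;
    }

    public static void main(String[] args) {
        System.out.println("transition ok:     " + simulate(false));
        System.out.println("transition failed: " + simulate(true));
    }
}
```

In this model, simulate(true) ends in OUT_OF_ELECTION even though the controller has no dedicated failure arc, which is the point being made: the health-monitor arc already covers the case.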

bq. Is there any scope for admin operations in ZKFC? Will ZKFC receive and accept a signal
(manual admin/auto machine reboot) to stop services? At that point, in InElection state, how
will it know that it needs to send transitionToStandby or not (based on whether it is active
or not)?

I just added a section on how manual failover operates in conjunction with automatic failover.
I was hoping, though, that we could merge the automatic failover work back to trunk before
adding this feature, and treat it as an improvement.

> HA: HDFS portion of ZK-based FailoverController
> -----------------------------------------------
>                 Key: HDFS-2185
>                 URL: https://issues.apache.org/jira/browse/HDFS-2185
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: auto-failover, ha
>    Affects Versions: 0.24.0, 0.23.3
>            Reporter: Eli Collins
>            Assignee: Todd Lipcon
>             Fix For: Auto failover (HDFS-3042)
>         Attachments: Failover_Controller.jpg, hdfs-2185.txt, hdfs-2185.txt, hdfs-2185.txt, hdfs-2185.txt, hdfs-2185.txt, zkfc-design.pdf, zkfc-design.pdf, zkfc-design.pdf, zkfc-design.pdf,
>
> This jira is for a ZK-based FailoverController daemon. The FailoverController is a separate daemon from the NN that does the following:
> * Initiates leader election (via ZK) when necessary
> * Performs health monitoring (aka failure detection)
> * Performs fail-over (standby to active and active to standby transitions)
> * Heartbeats to ensure liveness
> It should have the same/similar interface as the Linux HA RM to aid pluggability.


