hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2185) HA: HDFS portion of ZK-based FailoverController
Date Mon, 02 Apr 2012 04:23:30 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243969#comment-13243969 ]

Todd Lipcon commented on HDFS-2185:
-----------------------------------

bq. I am surprised that only an inElection state suffices instead of an inElectionActive and
inElectionStandby. Aren't there actions that need to be performed differently when the FC is
active or standby?

It's somewhat surprising, but it seems to work fine this way. Do you see any issues?

bq. There is no state transition dependent on the result of transitionToActive(). If transitionToActive
on NN fails then FC should quitElection IMO. Currently, it quits election only on HM events.
Same for transitionToStandby on NN. If that fails then should we not do something?

The new rev of the document adds more explanation of this.

bq. Which state is performing fencing? The state machine does not show fencing. Is it happening
in the HAService?

It's a callback from the elector. New rev of the document shows the callback.
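
To make the three points above concrete, here is a minimal sketch, not the code in the attached patch. It assumes a single IN_ELECTION state, fencing delivered as an elector callback, and (per the reviewer's suggestion, which this thread does not confirm as the final design) quitting the election when transitionToActive fails. HaTarget, FailoverControllerSketch, and every method name other than transitionToActive, transitionToStandby, and quitElection are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the NN's HA RPCs; not the real Hadoop interface. */
interface HaTarget {
    void transitionToActive() throws Exception;
    void transitionToStandby() throws Exception;
}

/** Hypothetical FC sketch: one IN_ELECTION state covers both active and standby. */
class FailoverControllerSketch {
    enum State { INIT, IN_ELECTION, OUT_OF_ELECTION }

    private final HaTarget target;
    private final List<String> log = new ArrayList<>();
    private State state = State.INIT;

    FailoverControllerSketch(HaTarget target) {
        this.target = target;
    }

    State state() { return state; }
    List<String> log() { return log; }

    /** Health monitor event: the NN is healthy, so (re)join the election. */
    void onHealthy() {
        if (state != State.IN_ELECTION) {
            state = State.IN_ELECTION;
            log.add("joinElection");
        }
    }

    /** Health monitor event: the NN is unhealthy, so leave the election. */
    void onUnhealthy() {
        quitElection("unhealthy");
    }

    /** Elector callback: fence the previous active before taking over. */
    void fenceOldActive(String oldActive) {
        log.add("fence " + oldActive);
    }

    /** Elector callback: this node won the election. */
    void becomeActive() {
        if (state != State.IN_ELECTION) {
            return;
        }
        try {
            target.transitionToActive();
            log.add("active");
        } catch (Exception e) {
            // Assumption: a failed transition is treated like a health failure.
            quitElection("transitionToActive failed");
        }
    }

    /** Elector callback: this node lost, or has not yet won, the election. */
    void becomeStandby() {
        if (state != State.IN_ELECTION) {
            return;
        }
        try {
            target.transitionToStandby();
            log.add("standby");
        } catch (Exception e) {
            quitElection("transitionToStandby failed");
        }
    }

    private void quitElection(String reason) {
        state = State.OUT_OF_ELECTION;
        log.add("quitElection: " + reason);
    }
}
```

In this sketch the active/standby distinction lives entirely in which elector callback fires, which is one way a single election state can be enough.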
                
> HA: HDFS portion of ZK-based FailoverController
> -----------------------------------------------
>
>                 Key: HDFS-2185
>                 URL: https://issues.apache.org/jira/browse/HDFS-2185
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: auto-failover, ha
>    Affects Versions: 0.24.0, 0.23.3
>            Reporter: Eli Collins
>            Assignee: Todd Lipcon
>         Attachments: Failover_Controller.jpg, hdfs-2185.txt, hdfs-2185.txt, hdfs-2185.txt,
hdfs-2185.txt, zkfc-design.pdf, zkfc-design.pdf, zkfc-design.pdf, zkfc-design.tex
>
>
> This jira is for a ZK-based FailoverController daemon. The FailoverController is a separate daemon from the NN that does the following:
> * Initiates leader election (via ZK) when necessary
> * Performs health monitoring (aka failure detection)
> * Performs fail-over (standby to active and active to standby transitions)
> * Heartbeats to ensure the liveness
> It should have the same/similar interface as the Linux HA RM to aid pluggability.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
