hadoop-common-issues mailing list archives

From "Hari Mankude (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8163) Improve ActiveStandbyElector to provide hooks for fencing old active
Date Wed, 14 Mar 2012 22:40:39 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229711#comment-13229711 ]

Hari Mankude commented on HADOOP-8163:

Thanks, Todd, for the answers.

I have a suggestion. One of the issues with automatic failover is how an NN becomes active
when the other node is unavailable or when both NNs are restarting. If there is a persistent
info znode, it provides a very powerful invariant:

1. If the info znode is present, then by preference only the owner of the info znode can be
made active. Alternatively, the other NN can be made active, provided it can capture all the
state from the previously active NN. If the other NN cannot get all the edit logs from the
previous active, it remains in standby state. An admin-initiated action can force the takeover,
at the cost of data loss.

2. If the info znode is absent, then it is fair game and either node can be made active.
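
The two rules above can be read as a small decision function. The following is only a sketch
of that reading, not code from the HADOOP-8163 patch: the class InfoZnodeInvariant, the
Decision enum, and all parameter names are hypothetical, and the info znode is modeled as an
Optional holding the owner's name rather than as a real ZooKeeper node.

```java
import java.util.Optional;

/** Hypothetical sketch of the proposed info-znode invariant; not part of the patch. */
final class InfoZnodeInvariant {

    enum Decision { BECOME_ACTIVE, STAY_STANDBY }

    /**
     * @param infoZnodeOwner owner recorded in the persistent info znode; empty if the znode is absent
     * @param candidate      name of the NN that wants to become active
     * @param hasAllEditLogs true if the candidate obtained every edit log from the previous active
     * @param adminForce     true if an admin explicitly forces the takeover and accepts data loss
     */
    static Decision decide(Optional<String> infoZnodeOwner, String candidate,
                           boolean hasAllEditLogs, boolean adminForce) {
        // Rule 2: no info znode, so either node may become active.
        if (!infoZnodeOwner.isPresent()) {
            return Decision.BECOME_ACTIVE;
        }
        // Rule 1: the owner of the info znode is the preferred active.
        if (infoZnodeOwner.get().equals(candidate)) {
            return Decision.BECOME_ACTIVE;
        }
        // Rule 1, alternative: the other NN may take over only with the complete state.
        if (hasAllEditLogs) {
            return Decision.BECOME_ACTIVE;
        }
        // Otherwise stay in standby unless an admin forces the takeover.
        return adminForce ? Decision.BECOME_ACTIVE : Decision.STAY_STANDBY;
    }

    public static void main(String[] args) {
        // nn2 wants to take over while nn1 still owns the info znode and its logs are missing.
        System.out.println(decide(Optional.of("nn1"), "nn2", false, false));
        // Same situation, but an admin forces the takeover.
        System.out.println(decide(Optional.of("nn1"), "nn2", false, true));
    }
}
```

The admin override is kept as an explicit flag so that the data-loss path can never be taken
implicitly.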

For this, a couple of high-level changes would be necessary in the patch:
1. The info znode is not cleaned up by the active NN during a clean shutdown. (Admittedly, this
results in unnecessary fencing.)
2. After an ephemeral lock takeover, and before the previous info znode can be deleted,
become_active() on the new node performs a state equalization by fetching all the edit logs.
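
To make the ordering in these two changes concrete, here is a rough in-memory simulation. It
is a sketch under several assumptions and does not use the real ActiveStandbyElector or
ZooKeeper APIs: TakeoverSketch, Znodes, Node, cleanShutdown and tryBecomeActive are
hypothetical names, edit logs are modeled as a list of transaction IDs, fencing is omitted,
and releasing the lock after a failed equalization is my own choice rather than something
stated above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory sketch of the proposed takeover sequence; not the patch. */
final class TakeoverSketch {

    /** Stand-in for the two znodes; a real elector would keep these in ZooKeeper. */
    static final class Znodes {
        String lockHolder; // ephemeral lock: goes away when the holder's session ends
        String infoOwner;  // persistent info znode: survives shutdown of its owner
    }

    static final class Node {
        final String name;
        final List<Long> edits = new ArrayList<>(); // transaction IDs of known edits
        boolean reachable = true;
        boolean active = false;
        Node(String name) { this.name = name; }
    }

    /** Change 1: a clean shutdown releases the lock but leaves the info znode behind. */
    static void cleanShutdown(Node node, Znodes zk) {
        node.active = false;
        if (node.name.equals(zk.lockHolder)) {
            zk.lockHolder = null;
        }
        // zk.infoOwner is deliberately left untouched.
    }

    /** Change 2: equalize state before replacing the previous info znode. */
    static boolean tryBecomeActive(Node candidate, Node previous, Znodes zk) {
        if (zk.lockHolder != null && !zk.lockHolder.equals(candidate.name)) {
            return false; // someone else holds the ephemeral lock
        }
        zk.lockHolder = candidate.name;
        boolean otherOwnsInfo = zk.infoOwner != null && !zk.infoOwner.equals(candidate.name);
        if (otherOwnsInfo) {
            if (previous == null || !previous.reachable) {
                // Cannot fetch the edit logs: stay standby and keep the old info znode.
                zk.lockHolder = null; // assumption: give up the lock so the owner can retry
                return false;
            }
            for (Long txid : previous.edits) {
                if (!candidate.edits.contains(txid)) {
                    candidate.edits.add(txid);
                }
            }
        }
        zk.infoOwner = candidate.name; // only now is the previous info znode replaced
        candidate.active = true;
        return true;
    }
}
```

In this sketch a standby that cannot reach the previous active leaves the old info znode in
place, so the invariant still identifies which node holds the most recent state.
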
> Improve ActiveStandbyElector to provide hooks for fencing old active
> --------------------------------------------------------------------
>                 Key: HADOOP-8163
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8163
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: ha
>    Affects Versions: 0.24.0, 0.23.3
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hadoop-8163.txt
> When a new node becomes active in an HA setup, it may sometimes have to take fencing
> actions against the node that was formerly active. This JIRA extends the ActiveStandbyElector
> to add an extra non-ephemeral node into the ZK directory, which acts as a second copy
> of the active node's information. Then, if the active loses its ZK session, the next active
> to be elected may easily locate the unfenced node to take the appropriate actions.
