hadoop-common-issues mailing list archives

From "Tom White (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-9220) Unnecessary transition to standby in ActiveStandbyElector
Date Thu, 17 Jan 2013 16:16:18 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Tom White updated HADOOP-9220:

    Attachment: HADOOP-9220.patch

This behaviour occurs because multiple watchers can be registered for a given ZK client
in ActiveStandbyElector. (The monitorLockNodeAsync() method creates a new watcher object
for the existing ZK client.)

As a result, a single watch event can invoke joinElectionInternal() multiple times, and
each invocation makes a call to create the lock znode. The first call causes a transition
to active, while subsequent ones cause a transition to standby (in the isNodeExists clause
of the processResult() method). In a manual failover scenario the node will still transition
to active again, since the other node has ceded from the election for 10s, but it's still
an unnecessary transition that could be eliminated.
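
To make the sequence above concrete, here is a rough, self-contained sketch in plain Java. It does not use the real ZooKeeper or Hadoop classes: ToyClient, ToyElector, simulate() and the reuseWatcher flag are all hypothetical names invented for illustration. The "reuse one watcher" variant is just one possible way to avoid the duplicate callback and is not necessarily what the attached patch does.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types are the real ZooKeeper or Hadoop classes.
class DuplicateWatcherSketch {
    enum State { ACTIVE, STANDBY }

    /** Stand-in for a ZK client that fans one event out to every registered watcher. */
    static class ToyClient {
        final List<Runnable> watchers = new ArrayList<>();
        boolean lockNodeExists = false;

        void fireEvent() {
            // Iterate over a copy so callbacks cannot disturb the iteration.
            for (Runnable w : new ArrayList<>(watchers)) {
                w.run();
            }
        }

        /** Returns true if the lock node was created, false if it already existed. */
        boolean createLockNode() {
            if (lockNodeExists) {
                return false;
            }
            lockNodeExists = true;
            return true;
        }
    }

    /** Stand-in for the elector; records every state transition it makes. */
    static class ToyElector {
        final ToyClient client = new ToyClient();
        final List<State> transitions = new ArrayList<>();
        final boolean reuseWatcher;   // hypothetical switch between the two behaviours
        private Runnable sharedWatcher;

        ToyElector(boolean reuseWatcher) {
            this.reuseWatcher = reuseWatcher;
        }

        /** Loosely mirrors the described behaviour of creating a new watcher per call. */
        void monitorLockNode() {
            if (!reuseWatcher) {
                client.watchers.add(this::joinElection);   // new watcher object every time
            } else if (sharedWatcher == null) {
                sharedWatcher = this::joinElection;        // register a single watcher once
                client.watchers.add(sharedWatcher);
            }
        }

        /** Tries to create the lock node: success means active, "node exists" means standby. */
        void joinElection() {
            transitions.add(client.createLockNode() ? State.ACTIVE : State.STANDBY);
        }
    }

    /** Registers watchers monitorCalls times, fires one event, returns the transitions. */
    static List<State> simulate(int monitorCalls, boolean reuseWatcher) {
        ToyElector elector = new ToyElector(reuseWatcher);
        for (int i = 0; i < monitorCalls; i++) {
            elector.monitorLockNode();
        }
        elector.client.fireEvent();
        return elector.transitions;
    }

    public static void main(String[] args) {
        System.out.println(simulate(2, false)); // prints [ACTIVE, STANDBY]
        System.out.println(simulate(2, true));  // prints [ACTIVE]
    }
}
```

With two watchers registered, the single event produces the extra standby transition. With one shared watcher, only the transition to active is recorded.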

I did some manual testing with the attached patch, and the extra transition was avoided. I'll
see if I can write a unit test for it.

> Unnecessary transition to standby in ActiveStandbyElector
> ---------------------------------------------------------
>                 Key: HADOOP-9220
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9220
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ha
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: HADOOP-9220.patch
> When performing a manual failover from one HA node to a second, under some circumstances
> the second node will transition from standby -> active -> standby -> active. This
> is with automatic failover enabled, so there is a ZK cluster doing leader election.
