curator-dev mailing list archives

From "Greg Moulliet (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CURATOR-104) LeaderSelector issue after losing ZooKeeper leader
Date Sat, 19 Apr 2014 22:51:14 GMT

    [ https://issues.apache.org/jira/browse/CURATOR-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974991#comment-13974991
] 

Greg Moulliet commented on CURATOR-104:
---------------------------------------

I was able to create a unit test which showed the same behavior.

When the session is lost because the ZooKeeper server is stopped, an InterruptedException
is thrown up the stack.
LeaderSelector.doWork was silently catching the exception and interrupting the thread, which
prevented doWorkLoop from continuing.
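
To illustrate the failure mode in isolation, here is a minimal sketch in plain Java. It
is not Curator's code and not the attached patch: the class InterruptSketch and the methods
attempt, doWorkSwallowing and runLoop are hypothetical names, and the "session loss" is
simulated by throwing InterruptedException on the first attempt. It only shows how swallowing
the exception and re-setting the interrupt flag can stop a retry loop, compared with letting
the loop see the exception and decide to retry.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class InterruptSketch {
    /** Simulates one leadership attempt; the first attempt fails as if the session were lost. */
    static void attempt(int n) throws InterruptedException {
        if (n == 1) {
            throw new InterruptedException("simulated session loss");
        }
    }

    /** Swallows the exception and re-sets the interrupt flag (the problematic pattern). */
    static void doWorkSwallowing(int n) {
        try {
            attempt(n);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Runs a retry loop on a worker thread and returns how many attempts were made. */
    static int runLoop(boolean swallow, int maxAttempts) throws InterruptedException {
        AtomicInteger attempts = new AtomicInteger();
        Thread worker = new Thread(() -> {
            // The loop stops as soon as the thread's interrupt flag is set.
            while (!Thread.currentThread().isInterrupted() && attempts.get() < maxAttempts) {
                int n = attempts.incrementAndGet();
                if (swallow) {
                    doWorkSwallowing(n);
                } else {
                    try {
                        attempt(n);
                    } catch (InterruptedException e) {
                        // The loop sees the exception and chooses to retry;
                        // the interrupt flag is left clear.
                    }
                }
            }
        });
        worker.start();
        worker.join();
        return attempts.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("swallowing: " + runLoop(true, 3));   // prints "swallowing: 1"
        System.out.println("propagating: " + runLoop(false, 3)); // prints "propagating: 3"
    }
}
```

In the swallowing variant the loop exits after the first attempt because the interrupt flag
is still set when the loop condition is checked, which matches the symptom of leadership
never being re-attempted.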

Attached is a patch that includes the unit test and a change to LeaderSelector that fixes
the issue for both the unit test and my example.  Also, in the example code I had not set
up log4j correctly, so I only saw System.out messages.


> LeaderSelector issue after losing ZooKeeper leader
> --------------------------------------------------
>
>                 Key: CURATOR-104
>                 URL: https://issues.apache.org/jira/browse/CURATOR-104
>             Project: Apache Curator
>          Issue Type: Bug
>    Affects Versions: 2.4.1
>            Reporter: Greg Moulliet
>            Assignee: Jordan Zimmerman
>         Attachments: lost_leadership_example.patch, lost_leadership_test_and_fix.patch
>
>
> LeaderSelectors are not re-attempting leadership after a ZooKeeper leader is stopped and a client with leadership is stopped.
> I have a client process running on 2 servers.  Each process is using LeaderSelectors for the same set of leaderPaths.
> The scenario:
> 1 - Both clients running, with one client being the leader of each path (2 children are under each leaderPath)
> 2 - Stop the ZooKeeper leader
> 3 - All clients temporarily lose leadership (0 children are under each leaderPath)
> 4 - Leadership is regained by the same clients that had leadership in step 1 (1 child is under each leaderPath)
> 5 - Stop a client with leadership
> 6 - No other clients pick up leadership of the leaderPaths from step 5 (0 children are under each leaderPath)
> Sometimes, a client will pick up one of the leaderPaths, but not more than one.
> I’m using Curator 2.4.1 and ZooKeeper 3.4.5.
> I originally saw the issue with Curator 2.3.0, and was hoping it was the same as https://issues.apache.org/jira/browse/CURATOR-73.




--
This message was sent by Atlassian JIRA
(v6.2#6252)
