curator-user mailing list archives

From Jordan Zimmerman <jor...@jordanzimmerman.com>
Subject Re: Switching from State suspended, to lost, to suspended
Date Tue, 05 Nov 2013 16:21:07 GMT
This sounds like a variation of https://issues.apache.org/jira/browse/CURATOR-54 - The next
release of Curator (later this week) provides a more robust way of canceling leadership that
doesn’t require thread interruption.

-Jordan

On Nov 5, 2013, at 1:47 AM, Henrik Nordvik <henrikno@gmail.com> wrote:

> Hi,
> 
> I'm getting some strange behaviour when stopping zookeeper in one environment that I can't reproduce locally.
> The result is that the leader selector "quits" even though it is set as auto-requeue. (I think that happens because the retry loop inside LeaderSelector checks the interrupt-flag, which is set again even when I cleared it).
> 
> I think it boils down to getting
> 
> 2013-11-04 18:22:32,501 INFO  [main-EventThread    ] c.n.c.f.state.ConnectionStateManager - State change: LOST
> 2013-11-04 18:22:32,501 DEBUG [ectionStateManager-0] s.f.s.a.feed.MyListener - Interrupting thread Thread[LeaderSelector-0,5,main]
> 2013-11-04 18:22:32,503 INFO  [main-EventThread    ] c.n.c.f.state.ConnectionStateManager - State change: SUSPENDED
> 2013-11-04 18:22:32,504 DEBUG [ectionStateManager-0] s.f.s.a.feed.MyListener - Interrupting thread Thread[LeaderSelector-0,5,main]
> 
> ... then I handle the interrupt in the leader thread.
> 
> Then I get this:
> 2013-11-04 18:22:36,465 INFO  [main-EventThread    ] c.n.c.f.state.ConnectionStateManager - State change: LOST
> 2013-11-04 18:22:36,465 INFO  [main-EventThread    ] c.n.c.f.state.ConnectionStateManager - State change: SUSPENDED
> 2013-11-04 18:22:36,465 DEBUG [ectionStateManager-0] s.f.s.a.feed.MyListener - StateChanged: LOST
> 2013-11-04 18:22:36,465 DEBUG [ectionStateManager-0] s.f.s.a.feed.MyListener - Interrupting thread Thread[LeaderSelector-0,5,main]
> 2013-11-04 18:22:36,466 DEBUG [ectionStateManager-0] s.f.s.a.feed.MyListener - StateChanged: SUSPENDED
> 2013-11-04 18:22:36,466 DEBUG [ectionStateManager-0] s.f.s.a.feed.MyListener - Interrupting thread Thread[LeaderSelector-0,5,main]
> 
> 
> Full log is here: https://gist.github.com/zerd/7316258
> 
> The code follows the old leader selector example pretty well:
> 
>     @Override
>     public void takeLeadership(CuratorFramework curatorFramework) throws Exception {
>         ourThread = Thread.currentThread();
>         logger.debug(format("(%s) Got leadership", ourThread));
>         try {
>             waitForAndPerformWork();
>         } catch (InterruptedException e) {
>             logger.debug(format("(%s) Interrupted ", ourThread), e);
>         } finally {
>             logger.debug(format("(%s) No longer leader", ourThread));
>         }
>     }
> 
>     @Override
>     public void stateChanged(CuratorFramework curatorFramework, ConnectionState newState) {
>         logger.debug("StateChanged: " + newState);
> 
>         if ((newState == ConnectionState.LOST) || (newState == ConnectionState.SUSPENDED)) {
>             if (ourThread != null) {
>                 logger.debug("Interrupting thread " + ourThread);
>                 ourThread.interrupt();
>             } else {
>                 logger.debug("Thread is null");
>             }
>         }
>     }
> 
> Is it supposed to go back and forth from lost to suspended?
> My goal is to get it to resume trying to get the leadership when zookeeper comes back. Do I have to requeue it manually when this happens?
> Would upgrading to latest curator with CancelLeadershipException fix this?
> 
> Thank you very much for your time.
> 
> --
> Henrik Nordvik
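
The interrupt-flag hypothesis in the quoted message can be checked without Curator or ZooKeeper. The sketch below is only an illustration of plain JVM behaviour, not of LeaderSelector internals: the class name DoubleInterruptDemo, the Result holder and the thread name are hypothetical, and Thread.sleep stands in for waitForAndPerformWork(). It interrupts a worker twice, as the listener in the log does for LOST and then SUSPENDED. The first interrupt surfaces as an InterruptedException, which clears the flag; the second arrives after the catch block and leaves the flag set for whatever code runs next on that thread.

```java
import java.util.concurrent.CountDownLatch;

public class DoubleInterruptDemo {

    /** What the worker thread observed during one run (hypothetical holder). */
    public static final class Result {
        public volatile boolean caughtInterruptedException;
        public volatile boolean flagSetAfterSecondInterrupt;
    }

    public static Result run() {
        final Result result = new Result();
        final CountDownLatch firstHandled = new CountDownLatch(1);
        final CountDownLatch secondSent = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(60_000); // stands in for waitForAndPerformWork()
            } catch (InterruptedException e) {
                // Throwing InterruptedException clears the thread's interrupt flag.
                result.caughtInterruptedException = true;
            }
            firstHandled.countDown();
            // Wait for the second interrupt without calling an interruptible
            // method, so the flag is neither consumed nor cleared here.
            while (secondSent.getCount() > 0) {
                Thread.yield();
            }
            // This is what code running after takeLeadership() returns would see.
            result.flagSetAfterSecondInterrupt = Thread.currentThread().isInterrupted();
        }, "LeaderSelector-demo");

        worker.start();
        try {
            worker.interrupt();     // first state change, e.g. LOST
            firstHandled.await();   // worker has caught the exception by now
            worker.interrupt();     // second state change, e.g. SUSPENDED
            secondSent.countDown();
            worker.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException("demo thread was interrupted", e);
        }
        return result;
    }

    public static void main(String[] args) {
        Result r = run();
        System.out.println("caught InterruptedException: " + r.caughtInterruptedException);
        System.out.println("flag set after second interrupt: " + r.flagSetAfterSecondInterrupt);
    }
}
```

If the flag is still set when control returns to the caller, any loop there that checks the flag or calls a blocking method would see an interrupt it did not expect. Whether that is what happens inside LeaderSelector in this case remains the poster's hypothesis.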

