curator-user mailing list archives

From Henrik Nordvik <henri...@gmail.com>
Subject Re: Switching from State suspended, to lost, to suspended
Date Wed, 13 Nov 2013 14:02:04 GMT
I've upgraded to Curator 2.3.0.
LeaderSelector still uses thread interruption to signal the thread
running takeLeadership() to stop, right?
Inside my takeLeadership() I do some database operations, and before
committing I check whether I was interrupted, and roll back if I was.
However, some code in between clears the interrupt flag (e.g. logback does
this), so I end up committing even though the connection was lost or suspended.

I need some other criterion to decide whether I can commit. hasLeadership()
only checks a local flag, which is always true inside takeLeadership().
Is there another flag I can check?
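
To make the question concrete, here is a minimal sketch of the kind of flag I
have in mind. It deliberately uses no Curator API: LeadershipGuard and its
method names are hypothetical, and the idea is only that stateChanged() would
record SUSPENDED/LOST in a flag of my own that library code cannot clear the
way it clears the interrupt flag.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical helper (not part of Curator): remembers whether the connection
 * has stayed healthy since leadership was taken. Unlike the thread interrupt
 * flag, nothing outside this class can reset it.
 */
final class LeadershipGuard {
    private final AtomicBoolean safeToCommit = new AtomicBoolean(false);

    /** Assumed to be called first thing inside takeLeadership(). */
    void leadershipAcquired() {
        safeToCommit.set(true);
    }

    /** Assumed to be called from stateChanged() on SUSPENDED or LOST. */
    void connectionTroubled() {
        // Sticky: stays false until leadership is acquired again.
        safeToCommit.set(false);
    }

    /** Check right before committing; roll back if this returns false. */
    boolean maySafelyCommit() {
        return safeToCommit.get();
    }
}
```

This only narrows the window: the connection can still drop between the check
and the commit, so it is not a real fencing mechanism.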


--
Henrik Nordvik


On Tue, Nov 5, 2013 at 5:21 PM, Jordan Zimmerman <jordan@jordanzimmerman.com> wrote:

> This sounds like a variation of
> https://issues.apache.org/jira/browse/CURATOR-54 - The next release of
> Curator (later this week) provides a more robust way of canceling
> leadership that doesn’t require thread interruption.
>
> -Jordan
>
> On Nov 5, 2013, at 1:47 AM, Henrik Nordvik <henrikno@gmail.com> wrote:
>
> Hi,
>
> I'm getting some strange behaviour when stopping ZooKeeper in one
> environment that I can't reproduce locally.
> The result is that the leader selector "quits" even though it is set to
> auto-requeue. (I think that happens because the retry loop inside
> LeaderSelector checks the interrupt flag, which gets set again even after
> I have cleared it.)
>
> I think it boils down to getting
>
> 2013-11-04 18:22:32,501 INFO  [main-EventThread    ]
> c.n.c.f.state.ConnectionStateManager      - State change: LOST
> 2013-11-04 18:22:32,501 DEBUG [ectionStateManager-0]
> s.f.s.a.feed.MyListener        - Interrupting thread
> Thread[LeaderSelector-0,5,main]
> 2013-11-04 18:22:32,503 INFO  [main-EventThread    ]
> c.n.c.f.state.ConnectionStateManager      - State change: SUSPENDED
> 2013-11-04 18:22:32,504 DEBUG [ectionStateManager-0]
> s.f.s.a.feed.MyListener        - Interrupting thread
> Thread[LeaderSelector-0,5,main]
>
> ... then I handle the interrupt in the leader thread.
>
> Then I get this:
> 2013-11-04 18:22:36,465 INFO  [main-EventThread    ]
> c.n.c.f.state.ConnectionStateManager      - State change: LOST
> 2013-11-04 18:22:36,465 INFO  [main-EventThread    ]
> c.n.c.f.state.ConnectionStateManager      - State change: SUSPENDED
> 2013-11-04 18:22:36,465 DEBUG [ectionStateManager-0]
> s.f.s.a.feed.MyListener        - StateChanged: LOST
> 2013-11-04 18:22:36,465 DEBUG [ectionStateManager-0]
> s.f.s.a.feed.MyListener        - Interrupting thread
> Thread[LeaderSelector-0,5,main]
> 2013-11-04 18:22:36,466 DEBUG [ectionStateManager-0]
> s.f.s.a.feed.MyListener        - StateChanged: SUSPENDED
> 2013-11-04 18:22:36,466 DEBUG [ectionStateManager-0]
> s.f.s.a.feed.MyListener        - Interrupting thread
> Thread[LeaderSelector-0,5,main]
>
>
> Full log is here: https://gist.github.com/zerd/7316258
>
> The code follows the old leader selector example pretty well:
>
>     @Override
>     public void takeLeadership(CuratorFramework curatorFramework) throws Exception {
>         ourThread = Thread.currentThread();
>         logger.debug(format("(%s) Got leadership", ourThread));
>         try {
>             waitForAndPerformWork();
>         } catch (InterruptedException e) {
>             logger.debug(format("(%s) Interrupted ", ourThread), e);
>         } finally {
>             logger.debug(format("(%s) No longer leader", ourThread));
>         }
>     }
>
>     @Override
>     public void stateChanged(CuratorFramework curatorFramework, ConnectionState newState) {
>         logger.debug("StateChanged: " + newState);
>
>         if ((newState == ConnectionState.LOST) || (newState == ConnectionState.SUSPENDED)) {
>             if (ourThread != null) {
>                 logger.debug("Interrupting thread " + ourThread);
>                 ourThread.interrupt();
>             } else {
>                 logger.debug("Thread is null");
>             }
>         }
>     }
>
> Is it supposed to go back and forth between LOST and SUSPENDED?
> My goal is to get it to resume trying to acquire leadership when ZooKeeper
> comes back. Do I have to requeue it manually when this happens?
> Would upgrading to the latest Curator with CancelLeadershipException fix this?
>
> Thank you very much for your time.
>
> --
> Henrik Nordvik
>
>
>
