curator-dev mailing list archives

From "Doug Jones (JIRA)" <>
Subject [jira] [Created] (CURATOR-62) Leader Election Deadlock
Date Mon, 07 Oct 2013 15:54:42 GMT
Doug Jones created CURATOR-62:

             Summary: Leader Election Deadlock
                 Key: CURATOR-62
             Project: Apache Curator
          Issue Type: Bug
            Reporter: Doug Jones
            Assignee: Jordan Zimmerman

I've noticed that it is possible for a leader election to deadlock if a thread is interrupted
while it is trying to acquire the mutex for the election.

I've created a forced example of this here:

You can reproduce the deadlock by running the LeaderSelectorExample with my modified code. Some
leaders may execute, but on my system I eventually see deadlock. Note that I only see deadlock
when running against a remote ZooKeeper server rather than the embedded test server. I'm using
ZooKeeper 3.4.5 on Mac OS X 10.8.4.

From what I can tell by inspecting the ZK state and watching in the debugger, the thread that
is interrupted successfully creates the lock object in ZK. However, the interrupt raises
an exception, so LockInternals#internalLockLoop never runs. Later, in LeaderSelector#doWork,
when mutex.release() is called, it fails at the check for lockData.

Once this occurs, the lock object left behind in ZK is the oldest, so the other participants
wait on it and the election deadlocks.
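
To make the sequence concrete, here is a minimal sketch in plain Java. It does not use Curator
or ZooKeeper, and it is not the Curator implementation: ToyMutex, its nodes list, its lockData
map, and simulate() are hypothetical stand-ins invented for illustration. The only thing it
models is the ordering described above: the lock node is created first, the interrupt arrives
before ownership is recorded, and the later release() therefore has nothing to clean up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OrphanedLockDemo {

    // Hypothetical toy lock; an assumption for illustration, not Curator's InterProcessMutex.
    static class ToyMutex {
        // Stand-in for the lock nodes under the election path, oldest first.
        private final List<String> nodes = new ArrayList<>();
        // Stand-in for the per-thread bookkeeping that release() consults.
        private final Map<Thread, String> lockData = new HashMap<>();
        private int seq = 0;

        synchronized void acquire() throws InterruptedException {
            String node = "lock-" + (seq++);
            nodes.add(node); // step 1: the node now exists in the "server"
            if (Thread.interrupted()) {
                // The interrupt lands after the node was created but before
                // ownership is recorded, so step 2 never happens.
                throw new InterruptedException("interrupted after node creation");
            }
            lockData.put(Thread.currentThread(), node); // step 2: record ownership
        }

        synchronized void release() {
            String node = lockData.remove(Thread.currentThread());
            if (node == null) {
                // No record for this thread, so the node is never removed.
                throw new IllegalMonitorStateException("current thread has no lockData");
            }
            nodes.remove(node);
        }

        synchronized String oldestNode() {
            return nodes.isEmpty() ? null : nodes.get(0);
        }
    }

    // Runs the failing sequence once and returns the node left behind.
    static String simulate() {
        ToyMutex mutex = new ToyMutex();
        Thread.currentThread().interrupt(); // force the interrupt for the demo
        try {
            mutex.acquire();
        } catch (InterruptedException e) {
            // The acquire attempt failed, but the node was already created.
        }
        try {
            mutex.release();
        } catch (IllegalMonitorStateException e) {
            // release() cannot find lockData for this thread.
        }
        return mutex.oldestNode();
    }

    public static void main(String[] args) {
        System.out.println("orphaned node: " + simulate()); // prints "orphaned node: lock-0"
    }
}
```

In this toy model the leftover node stays at the head of the list after both calls have
failed, which corresponds to the oldest lock object that I see remaining in ZK.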

This message was sent by Atlassian JIRA