zookeeper-dev mailing list archives

From "Flavio Paiva Junqueira (JIRA)" <j...@apache.org>
Subject [jira] Created: (ZOOKEEPER-140) Deadlock in QuorumCnxManager
Date Sat, 13 Sep 2008 10:02:44 GMT
Deadlock in QuorumCnxManager

                 Key: ZOOKEEPER-140
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-140
             Project: Zookeeper
          Issue Type: Bug
            Reporter: Flavio Paiva Junqueira

The servers frequently deadlock in QuorumCnxManager.initiateConnection, blocked on
s.read(msgBuffer) while reading the challenge from the peer.

Calls to initiateConnection and receiveConnection are synchronized, so on any given server
only one of the two can be executing at a time. This prevents two connections from being
opened between the same pair of servers.

However, this appears to lead to deadlock when the initiations form a cycle, as in this scenario:

A (initiate --> B)
B (initiate --> C)
C (initiate --> A)

initiateConnection can only complete when receiveConnection runs on the remote peer and answers
the challenge. If all servers are blocked in initiateConnection, receiveConnection never runs
and leader election halts.
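
The circular wait can be reproduced in a small simulation. This is only a sketch under
stated assumptions, not the QuorumCnxManager code: the Peer class, its initiate and receive
methods, and completedHandshakes are hypothetical names. A ReentrantLock stands in for the
monitor shared by the two synchronized methods, the remote receiveConnection is modelled as
the initiator's thread trying to take the remote peer's lock, and a timeout replaces the
indefinite block on s.read(msgBuffer) so that the program terminates.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

class HandshakeDeadlockSketch {

    /** Hypothetical stand-in for one server. */
    static final class Peer {
        // Models the single monitor shared by the two synchronized methods.
        private final ReentrantLock monitor = new ReentrantLock();

        /** Models initiateConnection: keeps the local monitor while waiting for the remote answer. */
        boolean initiate(Peer remote, CyclicBarrier sync, long timeoutMs) throws Exception {
            monitor.lock();
            try {
                sync.await(); // every initiator now holds its own monitor
                boolean answered = remote.receive(timeoutMs);
                sync.await(); // keep the monitor until all attempts are over, so timeouts cannot race
                return answered;
            } finally {
                monitor.unlock();
            }
        }

        /** Models receiveConnection: it can only answer once this peer's monitor is free. */
        boolean receive(long timeoutMs) throws InterruptedException {
            if (!monitor.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                return false; // the real code would block here indefinitely
            }
            monitor.unlock();
            return true;
        }
    }

    /** targets[i] is the index of the peer that server i initiates to, or -1 for none. */
    static int completedHandshakes(int[] targets, long timeoutMs) {
        Peer[] peers = new Peer[targets.length];
        int initiators = 0;
        for (int i = 0; i < targets.length; i++) {
            peers[i] = new Peer();
            if (targets[i] >= 0) initiators++;
        }
        CyclicBarrier sync = new CyclicBarrier(Math.max(1, initiators));
        AtomicInteger completed = new AtomicInteger();
        Thread[] threads = new Thread[targets.length];
        for (int i = 0; i < targets.length; i++) {
            final int self = i;
            threads[i] = new Thread(() -> {
                if (targets[self] < 0) return; // this server does not initiate
                try {
                    if (peers[self].initiate(peers[targets[self]], sync, timeoutMs)) {
                        completed.incrementAndGet();
                    }
                } catch (Exception e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        // Servers 0, 1, 2 play the roles of A, B, C.
        System.out.println("cycle A->B->C->A: " + completedHandshakes(new int[] {1, 2, 0}, 200));
        System.out.println("only A->B:        " + completedHandshakes(new int[] {1, -1, -1}, 200));
    }
}
```

In the cyclic case no handshake completes, because every peer holds its own monitor while
waiting for a monitor held by the next peer. With a single initiator the remote monitor is
free and the handshake goes through.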

