hadoop-zookeeper-dev mailing list archives

From "Flavio Paiva Junqueira (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (ZOOKEEPER-140) Deadlock in QuorumCnxManager
Date Wed, 01 Oct 2008 09:21:44 GMT

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Paiva Junqueira resolved ZOOKEEPER-140.
----------------------------------------------

    Resolution: Fixed

This issue was resolved by the patch for issue 127, which has been committed.

> Deadlock in QuorumCnxManager
> ----------------------------
>
>                 Key: ZOOKEEPER-140
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-140
>             Project: Zookeeper
>          Issue Type: Bug
>            Reporter: Flavio Paiva Junqueira
>
> Frequently the servers deadlock in QuorumCnxManager:initiateConnection on
> s.read(msgBuffer) when reading the challenge from the peer.
> Calls to initiateConnection and receiveConnection are synchronized, so only one or the
> other can be executing at a time. This prevents two connections from opening between the same
> pair of servers.
> However, it seems that this leads to deadlock, as in this scenario:
> {noformat}
> A (initiate --> B)
> B (initiate --> C)
> C (initiate --> A)
> {noformat}
> initiateConnection can only complete when receiveConnection runs on the remote peer and
> answers the challenge. If all servers are blocked in initiateConnection, receiveConnection
> never runs and leader election halts.
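
The cycle described above can be reproduced in a small model. This is only a sketch and not the QuorumCnxManager code. It makes these assumptions: the class DeadlockSketch and the method runRing are hypothetical names, each server's monitor is modelled by one ReentrantLock, and a timed tryLock on the peer's lock stands in for the blocking s.read(msgBuffer), so the program terminates instead of hanging as the real servers do.

```java
import java.util.Arrays;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of the A -> B -> C -> A scenario; not ZooKeeper code.
class DeadlockSketch {

    // Returns, per server, whether its modelled initiateConnection completed.
    static boolean[] runRing(int n, long timeoutMs) throws InterruptedException {
        // One lock per server stands in for the monitor shared by the
        // synchronized initiateConnection and receiveConnection methods.
        ReentrantLock[] monitors = new ReentrantLock[n];
        for (int i = 0; i < n; i++) {
            monitors[i] = new ReentrantLock();
        }
        CyclicBarrier barrier = new CyclicBarrier(n);
        boolean[] completed = new boolean[n];
        Thread[] servers = new Thread[n];

        for (int i = 0; i < n; i++) {
            final int self = i;
            final int peer = (i + 1) % n; // server i initiates to server i+1
            servers[i] = new Thread(() -> {
                monitors[self].lock(); // enter the modelled initiateConnection
                try {
                    barrier.await(); // every server is now inside initiate
                    // The peer can answer only from its receiveConnection,
                    // which needs the monitor the peer is already holding.
                    if (monitors[peer].tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                        completed[self] = true;
                        monitors[peer].unlock();
                    }
                    barrier.await(); // keep own monitor until all have given up
                } catch (Exception e) {
                    Thread.currentThread().interrupt();
                } finally {
                    monitors[self].unlock();
                }
            });
            servers[i].start();
        }
        for (Thread t : servers) {
            t.join();
        }
        return completed;
    }

    public static void main(String[] args) throws InterruptedException {
        // Three servers in a ring, as in the reported scenario.
        System.out.println(Arrays.toString(runRing(3, 200)));
    }
}
```

In this model no server's initiate step completes while every server holds its own monitor, which matches the reported symptom. The timeout exists only so the sketch can finish and report its result.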

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.