curator-dev mailing list archives

From "Ole Hjalmar Herje (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CURATOR-264) Leader election: Duplicate ephemeral nodes with same owner id
Date Tue, 22 Sep 2015 06:58:04 GMT
Ole Hjalmar Herje created CURATOR-264:
-----------------------------------------

             Summary: Leader election: Duplicate ephemeral nodes with same owner id
                 Key: CURATOR-264
                 URL: https://issues.apache.org/jira/browse/CURATOR-264
             Project: Apache Curator
          Issue Type: Bug
          Components: Framework, Recipes
    Affects Versions: 2.8.0
            Reporter: Ole Hjalmar Herje


We sometimes see our leader election fail when there are network issues. When this
happens, the ZooKeeper cluster contains two ephemeral nodes for the same session, but
there is no active leader.

I have managed to reproduce the scenario by running a test locally and using iptables to
simulate network issues. The debug log (see attachment) shows that findAndDeleteProtectedNodeInBackground
does not delete the node because processResult in FindProtectedNodeCB receives result code
-101 (NoNode). I suspect this can happen if the read is not synced; see
http://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#ch_zkGuarantees
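
To illustrate the suspected stale read, here is a minimal toy model in plain Java. It is not Curator or ZooKeeper code: the class StaleReadSketch, its methods, and the node path are all hypothetical, and the two sets only stand in for a leader and a lagging server. It shows how a lookup against a server that has not caught up could report NoNode (-101) for a node that was in fact created, and how the same lookup succeeds once that server has synced.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only (assumption): two in-memory sets stand in for a leader and a
// lagging server. Nothing here is Curator or ZooKeeper API.
public class StaleReadSketch {
    public static final int OK = 0;
    public static final int NONODE = -101; // numeric value of the NoNode result code in the report

    // Paths committed on the hypothetical leader.
    private final Set<String> leader = new HashSet<>();
    // Paths visible on the hypothetical lagging server that the client reads from.
    private final Set<String> follower = new HashSet<>();

    // The create succeeds on the leader but has not reached the follower yet.
    public void createOnLeader(String path) {
        leader.add(path);
    }

    // Brings the follower's view up to date with the leader.
    public void sync() {
        follower.clear();
        follower.addAll(leader);
    }

    // A read served by the follower: NONODE if the follower has not seen the path.
    public int existsOnFollower(String path) {
        return follower.contains(path) ? OK : NONODE;
    }

    public static void main(String[] args) {
        StaleReadSketch zk = new StaleReadSketch();
        String node = "/election/hypothetical-protected-node"; // hypothetical path

        zk.createOnLeader(node);
        int before = zk.existsOnFollower(node); // stale read: node looks missing
        zk.sync();
        int after = zk.existsOnFollower(node);  // after sync: node is found

        System.out.println("before sync: " + before + ", after sync: " + after);
    }
}
```

In this model, a cleanup step that relies on the first lookup would conclude there is nothing to delete and leave the node behind, which matches the symptom described above. Whether this is what actually happens in Curator is still only a suspicion.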

This also seems to be related to: 
https://issues.apache.org/jira/browse/CURATOR-45 and
https://issues.apache.org/jira/browse/CURATOR-79 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
