curator-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CURATOR-264) Leader election: Duplicate ephemeral nodes with same owner id
Date Tue, 22 Sep 2015 20:02:04 GMT

    [ https://issues.apache.org/jira/browse/CURATOR-264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903339#comment-14903339
] 

ASF GitHub Bot commented on CURATOR-264:
----------------------------------------

GitHub user Randgalt opened a pull request:

    https://github.com/apache/curator/pull/106

    [CURATOR-264] Duplicate ephemeral nodes with same owner id

    CURATOR-45 added findAndDeleteProtectedNodeInBackground to handle cases where a protected
node can get lost. However, the code wasn't correctly handling namespaces

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/curator CURATOR-264

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/curator/pull/106.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #106
    
----
commit f8f05be2e097c4c9be65e5110a376d461fd9cd9a
Author: randgalt <randgalt@apache.org>
Date:   2015-09-22T19:59:41Z

    CURATOR-45 added findAndDeleteProtectedNodeInBackground to handle cases where a protected
node can get lost. However, the code wasn't correctly handling namespaces

----
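
For readers unfamiliar with namespaces, the following is a minimal, self-contained sketch of how a namespace mismatch could lead to a NoNode (-101) result when searching for a protected node. It is not the Curator implementation and does not reproduce the actual patch. The in-memory node set, the helper names (withNamespace, getChildren, findProtected), the "app" namespace, the "/leader" path and the short id "1234" are all hypothetical; the "_c_<id>-" name prefix is assumed here as the marker of a protected node.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class NamespaceLookupSketch {
    static final int OK = 0;
    static final int NONODE = -101; // NoNode result code, as quoted in the issue report

    // Hypothetical in-memory stand-in for the server's node tree (full paths).
    static final Set<String> nodes = new TreeSet<>();

    // Hypothetical helper: prepend the namespace to a client-visible path.
    static String withNamespace(String namespace, String path) {
        return namespace.isEmpty() ? path : "/" + namespace + path;
    }

    // Fills 'out' with the direct children of 'parent'; NONODE if 'parent' is absent.
    static int getChildren(String parent, List<String> out) {
        if (!nodes.contains(parent)) {
            return NONODE;
        }
        String prefix = parent + "/";
        for (String n : nodes) {
            if (n.startsWith(prefix) && n.indexOf('/', prefix.length()) < 0) {
                out.add(n.substring(prefix.length()));
            }
        }
        return OK;
    }

    // Returns the child whose name carries the protected id, or null if none is found.
    static String findProtected(String parent, String protectedId) {
        List<String> children = new ArrayList<>();
        if (getChildren(parent, children) != OK) {
            return null; // nothing can be deleted if the lookup itself fails
        }
        for (String c : children) {
            if (c.startsWith("_c_" + protectedId + "-")) { // assumed protected-name prefix
                return c;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // The node was created through a client using the namespace "app".
        nodes.add("/app");
        nodes.add("/app/leader");
        nodes.add("/app/leader/_c_1234-lock-0000000001");

        // Lookup that ignores the namespace: the parent is not found.
        System.out.println(findProtected("/leader", "1234"));
        // Lookup that applies the namespace: the protected node is found.
        System.out.println(findProtected(withNamespace("app", "/leader"), "1234"));
    }
}
```

In this sketch the first lookup fails only because the search path and the stored path disagree about the namespace, so the leftover node is never located and therefore never deleted.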


> Leader election: Duplicate ephemeral nodes with same owner id
> -------------------------------------------------------------
>
>                 Key: CURATOR-264
>                 URL: https://issues.apache.org/jira/browse/CURATOR-264
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Framework, Recipes
>    Affects Versions: 2.8.0
>            Reporter: Ole Hjalmar Herje
>            Assignee: Jordan Zimmerman
>            Priority: Blocker
>             Fix For: 2.9.1
>
>         Attachments: testLog.txt, zkNodes.txt, zkTransactionLog.txt
>
>
> We sometimes experience failures in our leader-election functionality when we have network
issues. When this situation occurs we see that there are two ephemeral nodes in the ZooKeeper
cluster for the same session but there is no active leader. 
> I have managed to recreate the same scenario by running a test locally and using iptables
to simulate network issues. The debug log (see attachment) shows that findAndDeleteProtectedNodeInBackground
does not delete the node because processResult in FindProtectedNodeCB receives a -101 (NoNode)
result code. I suspect this can happen if the read is not synced? (http://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#ch_zkGuarantees)
> This also seems to be related to: 
> https://issues.apache.org/jira/browse/CURATOR-45 and
> https://issues.apache.org/jira/browse/CURATOR-79 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
