curator-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CURATOR-324) TreeCache sending multiple NODE_ADDED events for same node
Date Thu, 19 May 2016 22:21:12 GMT

    [ https://issues.apache.org/jira/browse/CURATOR-324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15292256#comment-15292256
] 

ASF GitHub Bot commented on CURATOR-324:
----------------------------------------

Github user Randgalt commented on the pull request:

    https://github.com/apache/curator/pull/148#issuecomment-220468433
  
    I did some further debugging and the problem has something to do with the same node being
deleted/updated/recreated in quick succession. I'm concerned that the node states are handle
by individual TreeNode instances. Look at this log (filtered for node "/root/0"):
    
    ```
    2133 processResult: @Node1 CuratorEventImpl{type=GET_DATA, resultCode=0, path='/root/0',
name='null', children=null, context=null, stat=5135,5153,1463694250431,1463694250434,2,0,0,0,3,0,5135,
data=[110, 101, 119], watchedEvent=null, aclList=null} [main-EventThread]
    2133 publishEvent: @Node1 /root/0 NODE_ADDED [main-EventThread]
    2134 processResult: @Node1 CuratorEventImpl{type=CHILDREN, resultCode=0, path='/root/0',
name='null', children=[], context=null, stat=5135,5153,1463694250431,1463694250434,2,0,0,0,3,0,5135
    2135 processResult: @root CuratorEventImpl{type=CHILDREN, resultCode=0, path='/root',
name='null', children=[3, 0, 7, 9, 8], context=null, stat=2,2,1463694248754,1463694248754,0,1729,0,0,9,5,5154
    2136 process: @Node1 WatchedEvent state:SyncConnected type:NodeDataChanged path:/root/0
[main-EventThread]
    2136 process: @Node1 WatchedEvent state:SyncConnected type:NodeDeleted path:/root/0 [main-EventThread]
    2136 publishEvent: @Node1 /root/0 NODE_REMOVED [main-EventThread]
    2154 processResult: @root CuratorEventImpl{type=CHILDREN, resultCode=0, path='/root',
name='null', children=[0, 6, 5, 4, 8], context=null, stat=2,2,1463694248754,1463694248754,0,1765,0,0,9,5,5270
    2155 processResult: @Node1 CuratorEventImpl{type=GET_DATA, resultCode=0, path='/root/0',
name='null', children=null, context=null, stat=5268,5268,1463694250454,1463694250454,0,0,0,0,9,0,5268
    2155 publishEvent: @Node1 /root/0 NODE_ADDED [main-EventThread]
    2167 processResult: @Node2 CuratorEventImpl{type=GET_DATA, resultCode=-101, path='/root/0',
name='null', children=null, context=null, stat=null, data=null, watchedEvent=null, aclList=null}
[main-EventThread]
    2167 wasDeleted @Node2 /root/0 PENDING [main-EventThread]
    2167 processResult: @Node2 CuratorEventImpl{type=CHILDREN, resultCode=-101, path='/root/0',
name='null', children=[], context=null, stat=null, data=null, watchedEvent=null, aclList=null}
[main-EventThread]
    2167 wasDeleted @Node2 /root/0 DEAD [main-EventThread]
    2183 processResult: @root CuratorEventImpl{type=CHILDREN, resultCode=0, path='/root',
name='null', children=[3, 7, 5, 9, 8], context=null, stat=2,2,1463694248754,1463694248754,0,1805,0,0,9,5,5380
    2233 processResult: @root CuratorEventImpl{type=CHILDREN, resultCode=0, path='/root',
name='null', children=[0, 7, 5, 4], context=null, stat=2,2,1463694248754,1463694248754,0,1842,0,0,9,4,5484
    2259 processResult: @Node3 CuratorEventImpl{type=GET_DATA, resultCode=-101, path='/root/0',
name='null', children=null, context=null, stat=null, data=null, watchedEvent=null, aclList=null}
[main-EventThread]
    2259 wasDeleted @Node3 /root/0 PENDING [main-EventThread]
    2259 processResult: @Node3 CuratorEventImpl{type=CHILDREN, resultCode=-101, path='/root/0',
name='null', children=[], context=null, stat=null, data=null, watchedEvent=null, aclList=null}
[main-EventThread]
    2260 wasDeleted @Node3 /root/0 DEAD [main-EventThread]
    2261 processResult: @root CuratorEventImpl{type=CHILDREN, resultCode=0, path='/root',
name='null', children=[7, 6, 9], context=null, stat=2,2,1463694248754,1463694248754,0,1875,0,0,9,3,5604
    2273 processResult: @root CuratorEventImpl{type=CHILDREN, resultCode=0, path='/root',
name='null', children=[1, 0, 5, 9], context=null, stat=2,2,1463694248754,1463694248754,0,1906,0,0,9,4,5706
    2282 processResult: @Node4 CuratorEventImpl{type=GET_DATA, resultCode=0, path='/root/0',
name='null', children=null, context=null, stat=5797,5811,1463694250589,1463694250591,1,0,0,0,3,0,5797
    2282 publishEvent:@Node4 /root/0 NODE_ADDED [main-EventThread]
    ```
    
    * The first column is time
    * The second column is the operation
    * The third is the TreeNode doing the operation (I made the name human readable) 
    * The last bit is the event - wasDeleted means that wasDeleted() did not publish NODE_REMOVED
because the node state was not LIVE
    
    The problem here is the two NODE_ADDED publishes at 2155 and 2282. I believe there is
a missing NODE_REMOVED event at 2260.
    
    Note, at time 2136 we get watcher process() NodeDataChanged immediately followed by a
NodeDeleted on @Node1's watcher. 
    @Node1 then publishes NODE_REMOVED (correctly) at 2136.
    
    At 2154, @root gets the result of calling getChildren. I assume that "/root/0" was recreated
and @root's watcher got notified. Note that
    "0" is in the children list. So, this will cause a new node to get created and its wasCreated()
called. At 2155, the old @Node1 gets the
    result of GET_DATA. This causes a false NODE_ADDED to get generated. @Node1 is officially
dead and should not be reporting this.
    
    I believe the sequence of events is: Just prior to 2136, 1 test code thread changes "/root/0"
and another deletes it and another recreates it almost immediately.
    @Node1's watcher is triggered for NodeDataChanged and begins an async call to getData(),
it then receives NodeDeleted BEFORE the async getData() occurs on the server.
    The node then gets recreated. The getData() finally executes on the server and the result
is returned such that @Node1 misses the node deletion.
    
    Anyway, I have a potential fix for now. In processResults() GET_DATA case, add:
    
    ```
    if ( nodeState.get() == NodeState.DEAD )
    {
        break;
    }
    ```
    
    This fixes the problem. Longer term, we should consider moving node management into the
main TreeCache instance and not the individual node objects.



> TreeCache sending multiple NODE_ADDED events for same node
> ----------------------------------------------------------
>
>                 Key: CURATOR-324
>                 URL: https://issues.apache.org/jira/browse/CURATOR-324
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Recipes
>    Affects Versions: 3.1.0, 2.10.0
>            Reporter: Jordan Zimmerman
>            Assignee: Scott Blum
>
> details coming in the next 10 min - I will push a test



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message