zookeeper-user mailing list archives

From John Lindwall <john.lindw...@gmail.com>
Subject Re: Ephemeral znodes not getting removed
Date Mon, 05 Aug 2019 20:15:56 GMT
Thanks for the response! My direct access to this zk cluster is 
limited.  I'll see about getting a copy of the logs to examine.  I'll 
also try to coordinate your experiment of creating a znode via each 
server in turn and checking the cluster-wide view of that data.  If we 
see a situation where the "global view" is inconsistent, what would be 
the next step?
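
For reference, the comparison step of that experiment could be sketched 
roughly as below. This is only an illustration: the server names and the 
inconsistent_paths helper are hypothetical, and it assumes the set of znode 
paths seen through each server has already been collected with some client 
(ideally after a sync, since a follower may briefly lag the leader).

```python
def inconsistent_paths(views):
    """Find znode paths that are not visible through every server.

    views: dict mapping a server name to the set of znode paths that a
           client connected to that server reports (collected separately).
    Returns a dict mapping each such path to the sorted list of servers
    that did NOT report it. An empty dict means the views agree.
    """
    all_paths = set().union(*views.values())
    missing = {}
    for path in sorted(all_paths):
        absent = sorted(name for name, paths in views.items() if path not in paths)
        if absent:
            missing[path] = absent
    return missing


# Hypothetical data: zk2 does not see /probe/from-zk3.
example = {
    "zk1": {"/probe/from-zk1", "/probe/from-zk2", "/probe/from-zk3"},
    "zk2": {"/probe/from-zk1", "/probe/from-zk2"},
    "zk3": {"/probe/from-zk1", "/probe/from-zk2", "/probe/from-zk3"},
}
print(inconsistent_paths(example))
```

A path that stays missing on one server well after the write completed 
would be the interesting case to report back to the list.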

I did receive output from each cluster node containing the results of 
these 4-letter words: dump, cons, mntr, and stat.  For one of the 
ephemerals in question we could see a record of it in the "dump" output 
for one of the 3 cluster nodes (the leader) but not in the other 2 nodes' 
dump output.  Weirdly, the session id associated with that ephemeral 
znode does not appear in the "cons" output for any of the cluster 
members.  So this appears to be an ephemeral that has survived the 
termination of its associated zk session (!?)
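
The cross-check described above (ephemeral owners from "dump" versus live 
session ids from "cons") can be scripted. The following is a minimal sketch, 
not a vetted tool: the host names are hypothetical, and the parsing assumes 
the 3.4-style text layout ("Sessions with Ephemerals" followed by "0x...:" 
lines and indented paths in dump, and "sid=0x..." fields in cons), which 
should be verified against the actual output.

```python
import re
import socket


def four_letter_word(host, port, cmd, timeout=5.0):
    """Send a four-letter command (e.g. 'dump') and return the reply text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def ephemeral_owners(dump_text):
    """Parse 'dump' output into {session id: [ephemeral paths]}.

    Assumes the layout described in the lead-in; other versions may differ.
    """
    owners = {}
    current = None
    in_section = False
    for line in dump_text.splitlines():
        if line.startswith("Sessions with Ephemerals"):
            in_section = True
            continue
        if not in_section:
            continue
        match = re.match(r"^(0x[0-9a-fA-F]+):\s*$", line)
        if match:
            current = match.group(1).lower()
            owners[current] = []
        elif current is not None and line.strip().startswith("/"):
            owners[current].append(line.strip())
    return owners


def connected_sessions(cons_text):
    """Collect the session ids that appear as sid=0x... in 'cons' output."""
    return {sid.lower() for sid in re.findall(r"sid=(0x[0-9a-fA-F]+)", cons_text)}


def orphaned_ephemerals(dump_texts, cons_texts):
    """Return {session id: set of paths} for owners with no live connection."""
    live = set()
    for text in cons_texts:
        live |= connected_sessions(text)
    orphans = {}
    for text in dump_texts:
        for sid, paths in ephemeral_owners(text).items():
            if sid not in live:
                orphans.setdefault(sid, set()).update(paths)
    return orphans


# Example use against a live ensemble (host names are placeholders):
#   servers = [("zk1.example.com", 2181), ("zk2.example.com", 2181)]
#   dumps = [four_letter_word(h, p, "dump") for h, p in servers]
#   cons = [four_letter_word(h, p, "cons") for h, p in servers]
#   print(orphaned_ephemerals(dumps, cons))
```

A session can be valid without a current connection until its timeout 
elapses, so an entry reported here is only suspicious if it persists well 
beyond the session timeout. Also, the administrator's guide describes dump 
as working only on the leader, so a difference between leader and follower 
dump output may not by itself indicate an inconsistency.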

Thanks for any advice or feedback,
John

Patrick Hunt wrote on 8/2/19 9:38 AM:
> The jira you ref'd is the only one that comes to mind. In terms of
> troubleshooting - try connecting a client to each of the servers in turn
> and see if it's a situation where they have a different view of the world
> wrt those znodes. You might also have the client create separate znodes on
> each server and ensure that they are consistent. The logs are also
> typically a good source of information - check against the session id.
>
> Patrick
>
> On Wed, Jul 31, 2019 at 5:54 PM John Lindwall <john.lindwall@gmail.com>
> wrote:
>
>> ZooKeeper 3.4.6-1569965
>>
>> In our environment we seem to have a situation where ephemeral znodes
>> are not getting removed after the zookeeper session has been
>> terminated.  We can see examples of znodes that were created 3-4 days
>> ago that still exist, though the zk sessions bound to those znodes
>> should no longer exist.
>>
>> Note that we've had this cluster running for about 4 years and have not
>> seen this problem until recently.
>>
>> 1. I am wondering if there are any known issues that would affect our
>> zookeeper version that may cause this behavior?
>> 2. Is it possible our servers are simply in a "bad state" and a simple
>> reboot might clean things up?
>> 3. Any tips on diagnosing this?
>>
>> We noticed this issue from 2011 but that seems to have been fixed in our
>> branch.
>>
>> https://issues.apache.org/jira/browse/ZOOKEEPER-1208
>>
>> Thanks,
>> John Lindwall
>>

