zookeeper-user mailing list archives

From Andor Molnar <an...@cloudera.com>
Subject Re: [3.4.6] Ephemeral node not deleted after session is gone
Date Tue, 03 Apr 2018 09:26:08 GMT
There are a few questions on the original thread that would be useful to
answer here as well:

1) Why was the session closed: did the client close it, or did the cluster
expire it?

2) Which server was the session attached to: the first (44sec max
lat) or one of the others? Which server was the leader?

3) The znode exists on all 4 servers, is that right?

It would also be useful to attach the server logs related to the session
expiration, as well as LogFormatter output of the txn log files for those nodes.

Regards,
Andor
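
Question 2 can be partly narrowed down from the session IDs alone. The sketch below assumes the session ID layout used by ZooKeeper's SessionTrackerImpl, where the creating server's myid is stored in the top byte of the 64-bit ID. This is an implementation detail rather than a documented contract, and the helper name `session_server_id` is hypothetical. It only indicates which server created the session; a client may have reconnected to a different server since.

```python
def session_server_id(session_id: int) -> int:
    """Return the top byte of a 64-bit session ID.

    Assumption: this byte holds the myid of the server that created
    the session (ZooKeeper implementation detail, not a public API).
    """
    return (session_id >> 56) & 0xFF


# Session IDs copied from the `dump` and `cons` output quoted in this thread.
for sid in (0x162183ea9f70003, 0x162183ea9f70002,
            0x162183ea9f70c22, 0x262183eaa21da96):
    print(hex(sid), "created on server id", session_server_id(sid))
```

Under that assumption, both sessions that still own ephemeral nodes were created on the server whose myid is 1.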


On Tue, Apr 3, 2018 at 10:34 AM, Andor Molnar <andor@cloudera.com> wrote:

> Hi Daniel,
>
> Thanks for the bug report.
> Interestingly, this issue should have been fixed long ago:
> https://issues.apache.org/jira/browse/ZOOKEEPER-1208
>
> Regards,
> Andor
>
>
> On Tue, Apr 3, 2018 at 3:22 AM, Daniel Chan <daniel.cw.chan@oracle.com>
> wrote:
>
>> We have a live Zookeeper environment (quorum size is 2) and observed a
>> strange behavior:
>> Kafka created 2 ephemeral nodes /brokers/ids/822712429 and
>> /brokers/ids/707577499 on 2018-03-12 03:30:36.933
>> The Kafka clients were long gone but as of today, the two ephemeral nodes
>> are still present
>>
>> Troubleshooting:
>> 1) List the outstanding sessions and ephemeral nodes
>> $ echo dump | nc $SERVER1 2181
>> SessionTracker dump:
>> org.apache.zookeeper.server.quorum.LearnerSessionTracker@6d7fd863
>> ephemeral nodes dump:
>> Sessions with Ephemerals (2):
>> 0x162183ea9f70003:
>>         /brokers/ids/822712429
>> 0x162183ea9f70002:
>>         /brokers/ids/707577499
>>         /controller
>>
>> 2) stat on /brokers/ids/822712429
>> zk> stat /brokers/ids/822712429
>> czxid: 4294967344
>> mzxid: 4294967344
>> pzxid: 4294967344
>> ctime: 1520825436933 (2018-03-11T20:30:36.933-0700)
>> mtime: 1520825436933 (2018-03-11T20:30:36.933-0700)
>> version: 0
>> cversion: 0
>> aversion: 0
>> owner: 99668799174148099
>> datalen: 102
>> children: 0
>>
>> 3) List full connection/session details for all clients connected
>> $ echo cons | nc $SERVER1 2181
>>  /10.247.114.70:30401[0](queued=0,recved=1,sent=0)
>>  /10.248.88.235:40430[1](queued=0,recved=345,sent=345,sid=0x162183ea9f70c22,lop=PING,est=1522713395028,to=40000,lcxid=0x12,lzxid=0xffffffffffffffff,lresp=1522717802117,llat=0,minlat=0,avglat=0,maxlat=31)
>>
>> $ echo cons | nc $SERVER2 2181
>>  /10.196.18.61:28173[0](queued=0,recved=1,sent=0)
>>  /10.247.114.69:42679[1](queued=0,recved=73800,sent=73800,sid=0x262183eaa21da96,lop=PING,est=1522651352906,to=9000,lcxid=0xe49f,lzxid=0x10004683d,lresp=1522717854847,llat=0,minlat=0,avglat=0,maxlat=1235)
>>
>> 4) Server health
>> $ echo mntr | nc $SERVER1 2181
>> zk_version      3.4.6-1569965, built on 02/20/2014 09:09 GMT
>> zk_avg_latency  0
>> zk_max_latency  443
>> zk_min_latency  0
>> zk_packets_received     11158019
>> zk_packets_sent 11158244
>> zk_num_alive_connections        2
>> zk_outstanding_requests 0
>> zk_server_state follower
>> zk_znode_count  344
>> zk_watch_count  0
>> zk_ephemerals_count     3
>> zk_approximate_data_size        36654
>> zk_open_file_descriptor_count   33
>> zk_max_file_descriptor_count    65536
>>
>> 5) Could not find any notable exceptions in the ZooKeeper logs for the two
>> sessions
>>
>> Is this a known bug in version 3.4.6? What could be the potential cause
>> of the issue?
>>
>> Thanks,
>> Daniel
>>
>
>
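
The numbers in the quoted `stat` output can be cross-checked against the `dump` output with a little arithmetic. The following is a minimal sketch, not part of the original report: the helper names are hypothetical. It relies on three things: the `owner` field is the session ID printed in decimal, a zxid carries the epoch in its high 32 bits and a counter in its low 32 bits, and `ctime` is milliseconds since the Unix epoch.

```python
from datetime import datetime, timezone


def owner_to_session_hex(owner: int) -> str:
    """Format the decimal ephemeral owner the way `dump` prints session IDs."""
    return hex(owner)


def split_zxid(zxid: int):
    """Split a zxid into (epoch, counter): high 32 bits and low 32 bits."""
    return zxid >> 32, zxid & 0xFFFFFFFF


def millis_to_utc(ms: int) -> str:
    """Convert a znode ctime/mtime in milliseconds to an ISO 8601 UTC string."""
    seconds, millis = divmod(ms, 1000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.replace(microsecond=millis * 1000).isoformat(timespec="milliseconds")


# Values copied from the `stat /brokers/ids/822712429` output above.
print(owner_to_session_hex(99668799174148099))  # 0x162183ea9f70003
print(split_zxid(4294967344))                   # (1, 48)
print(millis_to_utc(1520825436933))             # 2018-03-12T03:30:36.933+00:00
```

The owner value matches session 0x162183ea9f70003 from the `dump` output, and the creation zxid falls in epoch 1. That helps when picking which txn log files to run through LogFormatter.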
