zookeeper-issues mailing list archives

From "Michael Han (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-3466) ZK cluster converges, but does not properly handle client connections (new in 3.5.5)
Date Wed, 24 Jul 2019 04:39:00 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-3466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16891601#comment-16891601 ]

Michael Han commented on ZOOKEEPER-3466:
----------------------------------------

What exactly is the problem with client connection handling here: is it that clients
can't connect to the ensemble, or that existing clients can't maintain their sessions (when
switching from 3.4.x to 3.5.5)?

If it's OK to disclose the log files for both server and client, then someone here might be
able to take a look at them.

It would also be helpful to have relatively straightforward reproduction steps, with detailed
version and configuration information for both server and clients.
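
For gathering that information, here is a minimal sketch (not part of ZooKeeper or Exhibitor)
that queries each server with the four-letter commands named in the report quoted below
(ruok, srvr, mntr) and parses the mntr output into a dict. The function names and the
server list are hypothetical. It assumes the commands are enabled on the servers; 3.5.x
restricts them through the 4lw.commands.whitelist setting.

```python
import socket


def four_letter_word(host, port, word, timeout=5.0):
    """Send one four-letter command and return the full reply as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(word.encode("ascii"))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # the server closes the connection after replying
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(text):
    """Turn 'key<whitespace>value' lines into a dict; numeric values become int."""
    stats = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts[0], parts[1].strip()
        stats[key] = int(value) if value.isdigit() else value
    return stats


def collect(servers):
    """Query every (host, port) pair; record errors instead of raising."""
    report = {}
    for host, port in servers:
        entry = {}
        for word in ("ruok", "srvr", "mntr"):
            try:
                entry[word] = four_letter_word(host, port, word)
            except OSError as exc:
                entry[word] = "error: {}".format(exc)
        report["{}:{}".format(host, port)] = entry
    return report
```

Something like collect([("localhost", 2181)]) could then be run against each ensemble member
and the result attached to the issue together with the server and client logs.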



> ZK cluster converges, but does not properly handle client connections (new in 3.5.5)
> ------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3466
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3466
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.5.5
>         Environment: Linux
>            Reporter: Jan-Philip Gehrcke
>            Priority: Major
>
> Hey, we explore switching from ZooKeeper 3.4.14 to ZooKeeper 3.5.5 in [https://github.com/dcos/dcos].
> DC/OS coordinates ZooKeeper via Exhibitor. We are not changing anything w.r.t. Exhibitor
for now, and are hoping that we can use ZooKeeper 3.5.5 as a drop-in replacement for 3.4.14.
This seems to work fine when Exhibitor uses a so-called static ensemble where the individual
ZooKeeper instances are known a priori.
> When Exhibitor however discovers individual ZooKeeper instances ("dynamic" back-end)
then I think we observe a regression where ZooKeeper 3.5.5 can get into the following bad
state (often, but not always):
>  # three ZooKeeper instances find each other, leader election takes place (*expected*)
>  # leader election succeeds: two followers, one leader (*expected*)
>  # all three ZK instances respond IAMOK to RUOK  (*expected*)
>  # all three ZK instances respond to SRVR (one says "Mode: leader", the other two say
"Mode: follower")  (*expected*)
>  # all three ZK instances respond to MNTR and show plausible output (*expected*)
>  # *Unexpected:* any ZooKeeper client trying to connect to any
of the three nodes observes a "connection timeout", whereas notably this is *not* a TCP connect()
timeout. The TCP connect() succeeds, but then ZK does not seem to send the expected byte sequence
to the TCP connection, and the ZK client waits for it via recv() until it hits a timeout condition.
Examples for two different clients:
>  ## In Kazoo we specifically hit _Connection time-out: socket time-out during read_
>  generated here: [https://github.com/python-zk/kazoo/blob/88b657a0977161f3815657878ba48f82a97a3846/kazoo/protocol/connection.py#L249]
>  ## In zkCli we see  _Client session timed out, have not heard from server in 15003ms
for sessionid 0x0, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn:main-SendThread(localhost:2181))_
>  # This state is stable, it will last forever (well, at least for multiple hours and
we didn't test longer than that).
>  # In our system the ZooKeeper clients are crash-looping. They retry. What I have observed
is that while they retry the ZK ensemble accumulates outstanding requests, here shown from
MNTR output (emphasis mine): 
>  zk_packets_received 2008
>  zk_packets_sent 127
>  zk_num_alive_connections 18
>  zk_outstanding_requests *1880*
>  # The leader emits log lines confirming session timeout, example:
>  _[myid:3] INFO [SessionTracker:ZooKeeperServer@398] - Expiring session 0x2000642b18f0020,
timeout of 10000ms exceeded [myid:3] INFO [SessionTracker:QuorumZooKeeperServer@157] - Submitting
global closeSession request for session 0x2000642b18f0020_
>  # In this state, restarting any one of the two ZK followers results in the same state
(clients don't get data from ZK upon connect).
>  # In this state, restarting the ZK leader, and therefore triggering a leader re-election,
almost immediately results in all clients being able to connect to all ZK instances successfully.
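
The distinction drawn in the report, between a TCP connect() timeout and a read timeout
after a successful connect(), can be checked without a full client. Below is a rough sketch
under stated assumptions: the connect request layout is my understanding of the wire format
(length prefix, protocol version, last zxid seen, session timeout, session id, 16-byte
password, read-only flag), and build_connect_request and probe are hypothetical names, not
part of Kazoo or ZooKeeper.

```python
import socket
import struct


def build_connect_request(session_timeout_ms=10000):
    """Length-prefixed connect request for a brand-new session (assumed layout)."""
    # protocol version 0, last zxid seen 0, requested timeout, session id 0
    body = struct.pack(">iqiq", 0, 0, session_timeout_ms, 0)
    passwd = b"\x00" * 16
    body += struct.pack(">i", len(passwd)) + passwd
    body += struct.pack(">?", False)  # read-only flag
    return struct.pack(">i", len(body)) + body


def probe(host, port, timeout=5.0):
    """Classify what happens when a client tries to open a session.

    Returns one of: 'connect_failed', 'read_timeout', 'io_error',
    'closed_by_peer', 'replied'.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "connect_failed"  # includes a TCP connect() timeout
    with sock:
        sock.settimeout(timeout)
        try:
            sock.sendall(build_connect_request())
            data = sock.recv(4)  # length prefix of the server's reply
        except socket.timeout:
            return "read_timeout"  # connected, but the server stayed silent
        except OSError:
            return "io_error"
    return "replied" if data else "closed_by_peer"
```

In the bad state described above, one would expect probe to return 'read_timeout' for every
ensemble member, and 'replied' after the leader has been restarted.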



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
