zookeeper-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dan Benediktson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-2471) Java Zookeeper Client incorrectly considers time spent sleeping as time spent connecting, potentially resulting in infinite reconnect loop
Date Tue, 22 Aug 2017 14:06:00 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136834#comment-16136834

Dan Benediktson commented on ZOOKEEPER-2471:

Thanks for the reviews, [~nickt] and [~hanm]. Sorry, I should have been more clear about the
scenario: although I didn't realize it at the time I initially filed this bug (I wasn't aware
of the extent of the divergence of my own fork from mainline), I don't think you can reasonably
run into an infinite connect loop on *current* Apache Zookeeper. On the current code base,
hitting an infinite connect loop basically requires falling victim to the random number generator
repeatedly; not likely at all.

However, if you pick up exponential backup from ZOOKEEPER-2869, it becomes a certainty if
you hit the right conditions: basically, you just need to get enough initial failures to rack
up a large exponential sleep backoff, which in practice happened to us during leader election
on a large ensemble, and then your sleep time can potentially outgrow your connect timeout.
That's somewhat dependent on the backoff algorithm used, but the one we have is pretty reasonable
in how it caps the backoff, and it was certainly still possible there: we had applications
which went down completely since they were unable to connect to ZK until we recycled the application
every time we leader elected the ensemble they talked to, until we fixed this.

So, that's why I suggested in the pull request that maybe a compromise would be to submit
tests along with ZOOKEEPER-2869: a lot of the machinery that's needed to make this testable
is actually stuff that can be introduced as part of the product feature in ZOOKEEPER-2869.

> Java Zookeeper Client incorrectly considers time spent sleeping as time spent connecting,
potentially resulting in infinite reconnect loop
> ------------------------------------------------------------------------------------------------------------------------------------------
>                 Key: ZOOKEEPER-2471
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2471
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: java client
>    Affects Versions: 3.5.3
>         Environment: all
>            Reporter: Dan Benediktson
>            Assignee: Dan Benediktson
>         Attachments: ZOOKEEPER-2471.patch
> ClientCnxnSocket uses a member variable "now" to track the current time, and lastSend
/ lastHeard variables to track socket liveness. Implementations, and even ClientCnxn itself,
are expected to call both updateNow() to reset "now" to System.currentTimeMillis, and then
call updateLastSend()/updateLastHeard() on IO completions.
> This is a fragile contract, so it's not surprising that there's a bug resulting from
it: ClientCnxn.SendThread.run() calls updateLastSendAndHeard() as soon as startConnect() returns,
but it does not call updateNow() first. I expect when this was written, either the expectation
was that startConnect() was an asynchronous operation and that updateNow() would have been
called very recently, or simply the requirement to call updateNow() was forgotten at this
point. As far as I can see, this bug has been present since the "updateNow" method was first
introduced in the distant past. As it turns out, since startConnect() calls HostProvider.next(),
which can sleep, quite a lot of time can pass, leaving a big gap between "now" and now.
> If you are using very short session timeouts (one of our ZK ensembles has many clients
using a 1-second timeout), this is potentially disastrous, because the sleep time may exceed
the connection timeout itself, which can potentially result in the Java client being stuck
in a perpetual reconnect loop. The exact code path it goes through in this case is complicated,
because there has to be a previously-closed socket still waiting in the selector (otherwise,
the first timeout evaluation will not fail because "now" still hasn't been updated, and then
the actual connect timeout will be applied in ClientCnxnSocket.doTransport()) so that select()
will harvest the IO from the previous socket and updateNow(), resulting in the next loop through
ClientCnxnSocket.SendThread.run() observing the spurious timeout and failing. In practice
it does happen to us fairly frequently; we only got to the bottom of the bug yesterday. Worse,
when it does happen, the Zookeeper client object is rendered unusable: it's stuck in a perpetual
reconnect loop where it keeps sleeping, opening a socket, and immediately closing it.
> I have a patch. Rather than calling updateNow() right after startConnect(), my fix is
to remove the "now" member variable and the updateNow() method entirely, and to instead just
call System.currentTimeMillis() whenever time needs to be evaluated. I realize there is a
benefit (aside from a trivial micro-optimization not worth worrying about) to having the time
be "fixed", particularly for truth in the logging: if time is fixed by an updateNow() call,
then the log for a timeout will still show exactly the same value the code reasoned about.
However, this benefit is in my opinion not enough to merit the fragility of the contract which
led to this (for us) highly impactful and difficult-to-find bug in the first place.
> I'm currently running ant tests locally against my patch on trunk, and then I'll upload
it here.

This message was sent by Atlassian JIRA

View raw message