zookeeper-user mailing list archives

From "Miller, Austin" <Austin.Mil...@morganstanley.com>
Subject RE: exposing lastSend
Date Fri, 11 Jul 2014 18:18:23 GMT
> Pings, from an idle client, actually need to go out every 1/3 of
> negotiatedSessionTimeout.

If the client send thread isn't being scheduled, then this wouldn't happen.

> 1/3 of negotiatedSessionTimeout will already cause a ConnectionLoss...

> No, the ZK server *will* RST the client of if it hasn't ping in 1/3 of

OK, but this still doesn't prevent a race between the transaction being committed and events
firing from the client event thread after the freeze window.  In fact, you have convinced
me the situation is more likely: it is now sufficient for all threads to go unscheduled for
a third of negotiatedSessionTimeout (down from more than negotiatedSessionTimeout) to encounter
the problem.

> You could just release the lock as soon as you receive ConnectionLoss
> (i.e.: without waiting for SessionExpired, which you'll only get upon
> reconnecting to a ZK server.. which could take longer, given a partition or
> loaded network). But the case you are exposing is conflated with the
> pathological scenario of a JVM instance starving its threads... if that's
> a risk, you might as well have an external health-check process that kills
> your JVM entirely once it's likely that the ZK thread might be starving
> (hence, losing your lock being more likely).

The lock is represented by the existence of an ephemeral node.  I can't release it: in this
scenario it has already been released by session death and will have been grabbed by another
JVM process somewhere else.
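
To make the point concrete, here is a toy in-memory model of an ephemeral lock node. It is
not ZooKeeper code; EphemeralLockModel and its methods are hypothetical names used only for
illustration. It shows only that once the session expires the node is gone and another session
can take the lock, so there is nothing left for the original holder to release.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: a map from lock path to the session id that owns the ephemeral node.
final class EphemeralLockModel {
    private final Map<String, Long> ownerByPath = new HashMap<>();

    /** Creates the lock node for this session if it does not exist; true on success. */
    boolean tryAcquire(String path, long sessionId) {
        return ownerByPath.putIfAbsent(path, sessionId) == null;
    }

    /** Models server-side session expiry: all nodes owned by the session vanish. */
    void expireSession(long sessionId) {
        ownerByPath.values().removeIf(owner -> owner == sessionId);
    }

    /** True if the given session currently owns the node at path. */
    boolean holds(String path, long sessionId) {
        Long owner = ownerByPath.get(path);
        return owner != null && owner == sessionId;
    }
}
```

In this model, if session 1 holds "/lock" and then expires while its JVM is frozen, session 2
can acquire "/lock" immediately, and session 1 has no way to learn this until its threads run
again.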

I don't know what "pathological" means.   If it means nobody should care about this situation,
then I must politely disagree.  I accept that it is possible that very few people should care
about it, but I'm not even sure about that.

Your suggestion, in response to an effort to increase consistency, is to have yet another
process that completely kills the current one instead of dealing with the issue in a programmatic
way?  What if the health check process dies?  How does it perform this health check, and how
does it do so consistently?  Does it keep track of the scheduling of every thread and require
deep understanding of the kernel the JVM is running on?  Does it require root access as a
result?  Does it create false positives, where it can't be sure that the process is able to
keep the session alive and so aggressively kills it even though it was keeping the session
alive?  Does it not increase the chances of failover from the process currently holding the
lock to another process acquiring it (undesirable)?  If the RSS of the JVM is increasing because
of leaks by classloaders and the guard process can't allocate sufficient memory to kill the
JVM for being unhealthy, then what happens?  How do you deploy and test it so that it works?
If the code is being used by a wide variety of users, how do you instruct them to manage, deploy
and configure this guard process?  What if the process is doing other things that don't need
to be killed, and I would really, really like those to complete, but not the transaction that
depends on the ZK lock?  It does not seem or smell like a proper solution.

I recognize that saying it would be relatively trivial to expose the lastSend value is not
a good argument, because adding to a popular contract should be a deliberate action.  Even
so, the code change to expose this value would be trivial (it can be done in 10 mins).  This
is not an attempt to change the way ZK works; rather, I wish to argue that exposing the value
is useful to someone, even if he is pathological. :)  Possibly it would be useful to other
people too.  For instance, I suspect it would be useful to a library like Curator.
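
As a rough illustration of how a caller might use such a value, here is a minimal sketch.
Everything in it is an assumption: ZooKeeper has no public accessor for lastSend (that is the
request), so the value is passed in as a plain LongSupplier, and SessionGuard, sentRecently
and runIfSessionPlausiblyAlive are hypothetical names.  The one-third window is taken from
the ping interval discussed above.

```java
import java.util.function.LongSupplier;

// Hypothetical guard; none of these names exist in ZooKeeper.
final class SessionGuard {
    private final LongSupplier lastSendMs; // assumed accessor for the client's last send time
    private final LongSupplier clockMs;    // must use the same clock as lastSendMs
    private final long maxSilenceMs;       // one third of the negotiated session timeout

    SessionGuard(LongSupplier lastSendMs, LongSupplier clockMs, long negotiatedSessionTimeoutMs) {
        this.lastSendMs = lastSendMs;
        this.clockMs = clockMs;
        this.maxSilenceMs = negotiatedSessionTimeoutMs / 3;
    }

    /** True if the client sent something less than maxSilenceMs ago. */
    boolean sentRecently() {
        return clockMs.getAsLong() - lastSendMs.getAsLong() < maxSilenceMs;
    }

    /** Runs the action only if the client sent recently; returns whether it ran. */
    boolean runIfSessionPlausiblyAlive(Runnable action) {
        if (!sentRecently()) {
            return false; // the send thread may have been starved; do not trust the lock
        }
        action.run();
        return true;
    }
}
```

A check like this only narrows the window.  The thread can still be descheduled between the
check and the commit, so it reduces the likelihood of acting on a dead session rather than
eliminating it.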


