zookeeper-user mailing list archives

From Camille Fournier <cami...@apache.org>
Subject Input on a change
Date Fri, 13 Apr 2012 15:09:34 GMT
Hi everyone,

I'm trying to evaluate a patch that Jeremy Stribling has submitted, and I'd
like some feedback from the user base on it.

The current behavior of ZK when we get an uncaught exception is to log it
and try to move on. This is arguably not the right thing to do, and will
possibly cause ZK to limp along with a bad VM (say, in an OOM state) for
longer than it should.
The patch proposes that when we get an instance of java.lang.Error, we
should call System.exit to fast-fail the process. With the possible
exception of ThreadDeath (which may or may not be an unrecoverable system
state depending on the thread), I think this makes sense, but I would like
to hear from others if they have an opinion. I think it's better to kill
the process and let your monitoring services detect process death (and thus
restart) than possibly linger unresponsive for a while. Are there scenarios
that we're missing where this error can occur and you wouldn't want the
process killed?
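
To make the proposal concrete, here is a minimal sketch of the idea. It is
not the code from Jeremy's patch: the class name FastFailHandler, the
isFatal helper, and the injectable exit action are all hypothetical and
exist only for illustration. The sketch assumes the policy described above:
exit on any java.lang.Error except ThreadDeath, and otherwise log and carry
on as ZK does today.

```java
import java.util.function.IntConsumer;

// Hypothetical illustration only; not the handler from the actual patch.
class FastFailHandler implements Thread.UncaughtExceptionHandler {
    // The exit action is injected so the policy can be exercised
    // without actually killing the JVM (assumption for this sketch).
    private final IntConsumer exitAction;

    FastFailHandler(IntConsumer exitAction) {
        this.exitAction = exitAction;
    }

    // True for any java.lang.Error except ThreadDeath, which may be
    // recoverable depending on the thread that was stopped.
    @SuppressWarnings({"removal", "deprecation"})
    static boolean isFatal(Throwable t) {
        return t instanceof Error && !(t instanceof ThreadDeath);
    }

    @Override
    public void uncaughtException(Thread thread, Throwable t) {
        // Current behavior: log the problem.
        System.err.println("Uncaught " + t + " in thread " + thread.getName());
        // Proposed behavior: fast-fail when the VM may be in a bad state.
        if (isFatal(t)) {
            exitAction.accept(1);
        }
    }

    public static void main(String[] args) {
        // Install for all threads; System::exit is the real exit action.
        Thread.setDefaultUncaughtExceptionHandler(
                new FastFailHandler(System::exit));
    }
}
```

One design question this raises: System.exit runs shutdown hooks, which
could themselves stall in an OOM state. Runtime.getRuntime().halt(int)
skips them, so it may be worth considering as the exit action if a hung
shutdown is a concern.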

Thanks for your feedback,

