zookeeper-user mailing list archives
From Віталій Тимчишин <tiv...@gmail.com>
Subject Re: Input on a change
Date Sun, 15 Apr 2012 17:13:14 GMT
I really would not like any library to call System.exit. It would make a
huge program exit all of a sudden (think about J2EE, where you may also be
bitten by the security manager). Note that some errors are more or less
safe, like StackOverflowError.
System.exit also makes testing a nightmare. E.g. maven2 silently skips any
tests after the one that calls System.exit, and everything is green.
As for me, the good options are:
1) Call a user-provided uncaught exception handler. If one is not specified
explicitly, use the one from the thread that created the connection.
2) Stop everything and notify the user through a global watcher. If
possible, clean up any static state (e.g. restart threads) and allow the
connection to be restarted.
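
A minimal sketch of the handler-based option, assuming a hypothetical
Connection class (this is not the ZooKeeper client API). It only shows the
fallback rule: use the user's handler if given, otherwise the handler of the
thread that creates the connection. The names Connection, startWorker and
runAndCapture are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

public class HandlerSketch {
    /** Hypothetical stand-in for a client connection; not the real ZooKeeper API. */
    static final class Connection {
        private final Thread.UncaughtExceptionHandler handler;

        Connection(Thread.UncaughtExceptionHandler userHandler) {
            // If no handler is given, fall back to the handler of the creating thread.
            this.handler = (userHandler != null)
                    ? userHandler
                    : Thread.currentThread().getUncaughtExceptionHandler();
        }

        /** Starts a library-owned thread that reports uncaught errors to the handler. */
        Thread startWorker(Runnable body) {
            Thread t = new Thread(body, "sketch-worker");
            t.setUncaughtExceptionHandler(handler);
            t.start();
            return t;
        }
    }

    /** Runs body on a worker thread and returns what the user handler received. */
    static Throwable runAndCapture(Runnable body) {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        Connection c = new Connection((thread, error) -> seen.set(error));
        Thread worker = c.startWorker(body);
        try {
            worker.join(); // the handler runs on the dying thread before join returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        Throwable t = runAndCapture(() -> { throw new StackOverflowError("simulated"); });
        System.out.println("handler saw: " + t);
    }
}
```

The library never decides to exit here; the Error reaches user code, which
can log it, alert someone, or call System.exit itself if that is what the
application wants.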
In any case, call user code. A good system already knows how to react (it
may want to send an email to the admin); allow it to do so.
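
The stop-and-notify option could be sketched roughly as follows. Everything
here is hypothetical (FatalErrorWatcher, Client and demo are illustrative
names, not ZooKeeper classes): the background thread stops on any Throwable,
tells the user's watcher, and the client can be started again afterwards.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

public class RestartSketch {
    /** Hypothetical callback playing the role of a "global watcher". */
    interface FatalErrorWatcher {
        void onFatalError(Throwable error);
    }

    /** Hypothetical client that owns one background thread. */
    static final class Client {
        private final FatalErrorWatcher watcher;
        private final Runnable work;
        private volatile boolean stopped = true;
        private Thread worker;

        Client(FatalErrorWatcher watcher, Runnable work) {
            this.watcher = watcher;
            this.work = work;
        }

        /** Starts (or restarts) the background thread. */
        void start() {
            stopped = false;
            worker = new Thread(() -> {
                try {
                    work.run();
                } catch (Throwable t) {
                    stopped = true;           // stop instead of limping along
                    watcher.onFatalError(t);  // let user code decide what happens next
                }
            }, "sketch-client");
            worker.start();
        }

        /** Waits for the current background thread to finish. */
        void awaitWorker() {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        boolean isStopped() {
            return stopped;
        }
    }

    /** Fails on the first run, then succeeds after a restart; returns the event log. */
    static List<String> demo() {
        List<String> events = new CopyOnWriteArrayList<>();
        AtomicInteger attempts = new AtomicInteger();
        Client client = new Client(
                error -> events.add("fatal: " + error.getMessage()),
                () -> {
                    if (attempts.incrementAndGet() == 1) {
                        throw new OutOfMemoryError("simulated");
                    }
                    events.add("ran ok");
                });
        client.start();
        client.awaitWorker();
        events.add("stopped=" + client.isStopped());
        client.start(); // restart after the user has been notified
        client.awaitWorker();
        events.add("stopped=" + client.isStopped());
        return events;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Whether a restart is actually safe after a real OutOfMemoryError is up to the
application; the sketch only shows that the decision is left to user code.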

Best regards, Vitalii Tymchyshyn.

2012/4/13 Camille Fournier <camille@apache.org>

> Hi everyone,
>
> I'm trying to evaluate a patch that Jeremy Stribling has submitted, and I'd
> like some feedback from the user base on it.
> https://issues.apache.org/jira/browse/ZOOKEEPER-1442
>
> The current behavior of ZK when we get an uncaught exception is to log it
> and try to move on. This is arguably not the right thing to do, and will
> possibly cause ZK to limp along with a bad VM (say, in an OOM state) for
> longer than it should.
> The patch proposes that when we get an instance of java.lang.Error, we
> should call System.exit to fast-fail the process. With the possible
> exception of ThreadDeath (which may or may not be an unrecoverable system
> state depending on the thread), I think this makes sense, but I would like
> to hear from others if they have an opinion. I think it's better to kill
> the process and let your monitoring services detect process death (and thus
> restart) than possibly linger unresponsive for a while. Are there scenarios
> that we're missing where this error can occur and you wouldn't want the
> process killed?
>
> Thanks for your feedback,
>
> Camille
>

