Subject: Re: Input on a change
From: Ishaaq Chandy
To: user@zookeeper.apache.org
Cc: zookeeper-user@hadoop.apache.org, zookeeper-dev@hadoop.apache.org
Date: Mon, 16 Apr 2012 13:20:20 +1000

I'd go so far as to say that even the server code should avoid System.exit.
Just because it is "meant" to be a standalone system doesn't mean that code
that makes it impossible to embed should be encouraged.

For example, we embed a local version of ZK inside our unit tests. This makes
it much easier for us to control ZK to coincide with test expectations, and it
makes for much faster build times. It would be a shame if the embedded ZK
started killing the JVM.

Ishaaq

On 16 April 2012 04:28, Camille Fournier wrote:

> This is a good point.
> I think this change should be fine for the server portion of the code,
> since it's designed to be run as a standalone system. But for the
> client connection to also call System.exit on such an error is
> overreaching for all the reasons listed below.
>
> C
>
> 2012/4/15 Віталій Тимчишин:
> > I really would not like any library to perform a System.exit call. This
> > would make a huge program exit all of a sudden (think about J2EE; you may
> > be bitten by the security manager). Note that there are more or less safe
> > errors, like StackOverflowError.
> > Also, System.exit makes testing a nightmare. E.g. maven2 silently skips any
> > tests after the one that calls System.exit. And everything's green.
> > As for me, good options are:
> > 1) Call a user-provided uncaught exception handler. Use the one from the
> > thread that created the connection if one is not specified explicitly.
> > 2) Stop everything, notifying the user with a global watcher. If possible,
> > clean any static state (e.g. restart threads) and allow the connection to
> > be restarted.
> > In any case, call user code. A good system already knows how to react (it
> > may want to send email to an admin); allow it to do so.
> >
> > Best regards, Vitalii Tymchyshyn.
> >
> > 2012/4/13 Camille Fournier
> >
> >> Hi everyone,
> >>
> >> I'm trying to evaluate a patch that Jeremy Stribling has submitted, and
> >> I'd like some feedback from the user base on it.
> >> https://issues.apache.org/jira/browse/ZOOKEEPER-1442
> >>
> >> The current behavior of ZK when we get an uncaught exception is to log it
> >> and try to move on. This is arguably not the right thing to do, and will
> >> possibly cause ZK to limp along with a bad VM (say, in an OOM state) for
> >> longer than it should.
> >> The patch proposes that when we get an instance of java.lang.Error, we
> >> should call System.exit to fast-fail the process. With the possible
> >> exception of ThreadDeath (which may or may not be an unrecoverable system
> >> state depending on the thread), I think this makes sense, but I would
> >> like to hear from others if they have an opinion. I think it's better to
> >> kill the process and let your monitoring services detect process death
> >> (and thus restart) than possibly linger unresponsive for a while. Are
> >> there scenarios that we're missing where this error can occur and you
> >> wouldn't want the process killed?
> >>
> >> Thanks for your feedback,
> >>
> >> Camille
> >
> > --
> > Best regards,
> > Vitalii Tymchyshyn
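
The first option in the quoted list (hand the error to a user-provided uncaught exception handler, falling back to the handler of the thread that created the connection) could be sketched roughly as below. This is only an illustration under assumptions: `FatalErrorReporter` and `wrap` are hypothetical names, not part of ZooKeeper or of the ZOOKEEPER-1442 patch, and only standard `java.lang` calls are used.

```java
/**
 * Hypothetical helper (not a ZooKeeper class): reports a java.lang.Error to
 * user code instead of calling System.exit from inside a library.
 */
final class FatalErrorReporter {
    private final Thread.UncaughtExceptionHandler handler;

    /**
     * @param userHandler handler supplied by the application, or null to fall
     *                    back to the handler of the thread that creates this
     *                    object (its ThreadGroup if none was set explicitly)
     */
    FatalErrorReporter(Thread.UncaughtExceptionHandler userHandler) {
        this.handler = (userHandler != null)
                ? userHandler
                : Thread.currentThread().getUncaughtExceptionHandler();
    }

    /** Wraps a task so that any Error it throws is passed to the handler. */
    Runnable wrap(final Runnable task) {
        return () -> {
            try {
                task.run();
            } catch (Error e) {
                // Let the application decide: log, alert an admin, or exit.
                handler.uncaughtException(Thread.currentThread(), e);
            }
        };
    }
}
```

An embedding application or test harness would pass its own handler, while a standalone server launcher could pass one that logs and then exits. That keeps the exit decision in the outermost code rather than in the library.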