Return-Path: X-Original-To: apmail-zookeeper-user-archive@www.apache.org Delivered-To: apmail-zookeeper-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 8CCC6971D for ; Mon, 16 Apr 2012 16:54:31 +0000 (UTC) Received: (qmail 70673 invoked by uid 500); 16 Apr 2012 16:54:30 -0000 Delivered-To: apmail-zookeeper-user-archive@zookeeper.apache.org Received: (qmail 70630 invoked by uid 500); 16 Apr 2012 16:54:30 -0000 Mailing-List: contact user-help@zookeeper.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@zookeeper.apache.org Delivered-To: mailing list user@zookeeper.apache.org Received: (qmail 70558 invoked by uid 99); 16 Apr 2012 16:54:29 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 16 Apr 2012 16:54:29 +0000 X-ASF-Spam-Status: No, hits=1.5 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of ishaaq@gmail.com designates 209.85.212.174 as permitted sender) Received: from [209.85.212.174] (HELO mail-wi0-f174.google.com) (209.85.212.174) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 16 Apr 2012 16:54:21 +0000 Received: by wibhr17 with SMTP id hr17so7862301wib.15 for ; Mon, 16 Apr 2012 09:54:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=gfqz9f+yCO6mTuhgLerVpTJqAGMBxQCG79ZG4l/ssMM=; b=P5l6zEKby61LziusgpMGNLMZFtsS5mLCHwtpWcSCdgoJesabzl47ZNiW7MwrhPU43P CiREZCtvmA1g6X5HGUnOaIHIig9vJCNBeaymrusRiFQRAd5HoHhrBzZJ+usRVASUTqpf LUWVIbSC+XGNRmynk/B8NYoK/dbEX8o62qVN1E9mBhYJk91Cc/fMEeAXNNi4kymf0yij esml8225vyCkyqDfi+Pi39kfx8BvIPi4bW71/myTML/xERt3xoYxIzlWCVmVnbCEJ1Fs tTdbcVvh+O0QiFF7CrusIcXVhOR250Rx8EcujA1D0rzvQw6PckUPsaspkIK7PJZ6rMGY mLjw== MIME-Version: 1.0 Received: by 10.180.100.230 with SMTP id fb6mr19095299wib.3.1334595172976; Mon, 16 Apr 2012 09:52:52 -0700 (PDT) Received: by 10.223.161.3 with HTTP; Mon, 16 Apr 2012 09:52:52 -0700 (PDT) In-Reply-To: References: Date: Tue, 17 Apr 2012 02:52:52 +1000 Message-ID: Subject: Re: Input on a change From: Ishaaq Chandy To: user@zookeeper.apache.org Cc: zookeeper-dev@hadoop.apache.org, zookeeper-user@hadoop.apache.org Content-Type: multipart/alternative; boundary=f46d041824ee3448ff04bdcea77e --f46d041824ee3448ff04bdcea77e Content-Type: text/plain; charset=KOI8-U Content-Transfer-Encoding: quoted-printable Fair enough, if you think it is an anti-pattern then perhaps you can suggest an alternative that allows one to write tests that are (in descending order of priority): 1. Quick and easy to setup and teardown 2. Repeatably testable and debuggable in an IDE without having to resort to the external build tool 3. Testable in parallel, i.e. having more than one build running on a CI server, so need some way to avoid port clashes 4. Cross-platform - i.e. run the exact same build sequence on multiple OSes= . Ishaaq On 16 April 2012 22:55, Camille Fournier wrote: > I believe that this change is inspired by someone that runs zk embedded. > Personally I'm not moved by the testing argument, embedding the server fo= r > testing is a bit of an anti pattern in my mind. > > From my phone > On Apr 15, 2012 11:20 PM, "Ishaaq Chandy" wrote: > > > I'd go so far as to say that even the server-code should avoid > System.exit. > > Just because it is "meant" to be a standalone system doesn't mean that > code > > that makes it impossible to embed it should be encouraged. > > > > For e.g, we embed a local version of ZK to be used inside our unit test= s. > > This makes it much easier for us to control ZK to coincide with test > > expectations as well as making for much faster build times. It would be= a > > shame if the embedded ZK started killing the JVM. > > > > Ishaaq > > > > On 16 April 2012 04:28, Camille Fournier wrote: > > > > > This is a good point. > > > I think this change should be fine for the server portion of the code= , > > > since it's designed to be run as a standalone system. But for the > > > client connection to also call system.exit on such an error is > > > overreaching for all the reasons listed below. > > > > > > C > > > > > > 2012/4/15 =F7=A6=D4=C1=CC=A6=CA =F4=C9=CD=DE=C9=DB=C9=CE : > > > > I really would not like for any library to perform a System.exit > call. > > > This > > > > would make huge program exit out of sudden (think about j2ee, you m= ay > > be > > > > bitten by security manager). Note that there are more or less safe > > > errors, > > > > like StackOverflowError. > > > > Also System.exit make testing nightmare. E.g. maven2 silently skips > any > > > > tests after the one that calls System.exit. And everything's green. > > > > As for me good options are: > > > > 1) Call user-provided uncaught exception handler. Use the one from > the > > > > thread that created the connection if one is not specified explicit= y. > > > > 1) Stop everything, notifying user with a global watcher. If it's > > > possible, > > > > clean any static state (e.g. restart threads) and allow to restart > > > > connection. > > > > In any case, call user code. Good system already know how to react > (it > > > may > > > > want to send email to admin), allow it to perform well. > > > > > > > > Best regards, Vitalii Tymchyshyn. > > > > > > > > 2012/4/13 Camille Fournier > > > > > > > >> Hi everyone, > > > >> > > > >> I'm trying to evaluate a patch that Jeremy Stribling has submitted= , > > and > > > I'd > > > >> like some feedback from the user base on it. > > > >> https://issues.apache.org/jira/browse/ZOOKEEPER-1442 > > > >> > > > >> The current behavior of ZK when we get an uncaught exception is to > log > > > it > > > >> and try to move on. This is arguably not the right thing to do, an= d > > will > > > >> possibly cause ZK to limp along with a bad VM (say, in an OOM stat= e) > > for > > > >> longer than it should. > > > >> The patch proposes that when we get an instance of java.lang.Error= , > we > > > >> should do a system.exit to fast-fail the process. With the possibl= e > > > >> exception of ThreadDeath (which may or may not be an unrecoverable > > > system > > > >> state depending on the thread), I think this makes sense, but I > would > > > like > > > >> to hear from others if they have an opinion. I think it's better t= o > > kill > > > >> the process and let your monitoring services detect process death > (and > > > thus > > > >> restart) than possibly linger unresponsive for a while, are there > > > scenarios > > > >> that we're missing where this error can occur and you wouldn't wan= t > > the > > > >> process killed? > > > >> > > > >> Thanks for your feedback, > > > >> > > > >> Camille > > > >> > > > > > > > > > > > > > > > > -- > > > > Best regards, > > > > Vitalii Tymchyshyn > > > > > > --f46d041824ee3448ff04bdcea77e--