From: Ted Dunning
Date: Mon, 25 Jan 2010 20:02:24 -0800
Subject: Re: Using zookeeper to assign a bunch of long-running tasks to nodes (without unhandled tasks and double-handled tasks)
To: zookeeper-user@hadoop.apache.org

On Mon, Jan 25, 2010 at 7:12 PM, Qing Yan wrote:

> ...
> 2) Lose connection with the (quorum of) ZK cluster, e.g. C3 as mentioned
> before. if the situation continues will lead to CONNECTION_EXPIRE.
> ...
> About case 2), seems to me there is indeed a need for application to know
> about this, per ZK documentation :
>
> When you disconnect from a server (for example, when the server fails), you
> will not get any watches until the connection is reestablished. For this
> reason session events are sent to all outstanding watch handlers. Use
> session events to go into a safe mode: you will not be receiving events
> while disconnected, so your process should act conservatively in that mode.
>
> http://hadoop.apache.org/zookeeper/docs/r3.2.2/zookeeperProgrammers.html
> But from reading your post, it seems that case 2) will no longer be reported
> to application, hence the confusion.
>

I don't know if CONNECTION_LOSS would or would not be reported. As you point out, it may well be important to report it or to report it after a short period of loss without reconnection.
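
To make the "safe mode" and "report after a short period" ideas concrete, here is a minimal sketch in plain Python. It does not use the ZooKeeper client library; `SessionEvent`, `SafeModeTracker`, and `grace_seconds` are hypothetical names chosen for illustration, and the caller passes in timestamps so the logic stays easy to follow.

```python
import enum


class SessionEvent(enum.Enum):
    # Hypothetical event names modeled on the states discussed in this thread.
    CONNECTED = "connected"
    DISCONNECTED = "disconnected"
    EXPIRED = "expired"


class SafeModeTracker:
    """Enters safe mode on disconnect; reports the loss only after a grace period."""

    def __init__(self, grace_seconds):
        self.grace_seconds = grace_seconds
        self.safe_mode = False        # True while disconnected or expired
        self.expired = False          # an expired session never recovers
        self._disconnected_at = None  # start time of the unresolved disconnect

    def on_event(self, event, now):
        if self.expired:
            return  # nothing can change once the session has expired
        if event is SessionEvent.DISCONNECTED:
            # Act conservatively at once: no watches arrive while disconnected.
            self.safe_mode = True
            if self._disconnected_at is None:
                self._disconnected_at = now
        elif event is SessionEvent.CONNECTED:
            self.safe_mode = False
            self._disconnected_at = None
        elif event is SessionEvent.EXPIRED:
            self.safe_mode = True
            self.expired = True

    def should_report_loss(self, now):
        """True once the disconnect has lasted at least the grace period."""
        if self.expired:
            return True
        if self._disconnected_at is None:
            return False
        return now - self._disconnected_at >= self.grace_seconds
```

With this shape the process stops acting on possibly stale state immediately, while the application only hears about the loss if the client has not reconnected within the grace period.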
I think that my preference would be that transactions during a connection loss block until the connection is restored or the session expires. That is probably a matter of taste. Some people would probably prefer that most operations return quickly even at the cost of considerably more complex semantics on their part. Sequential creation, in particular, should probably always block if the session is still conceivably alive.

> In any case, I think the documentation needs to be updated to reflect the
> latest design/contract change.
>

Of course.
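
The blocking preference described above can be sketched as a small gate that holds callers while the connection is down and releases them on reconnect or session expiry. This is plain Python with no ZooKeeper dependency; `ConnectionGate` and `SessionExpired` are hypothetical names, and a real client would drive the `set_*` methods from its session events.

```python
import threading


class SessionExpired(Exception):
    """Hypothetical error raised once the session can no longer be alive."""


class ConnectionGate:
    """Blocks callers while disconnected; releases them on reconnect or expiry."""

    def __init__(self):
        self._cond = threading.Condition()
        self._connected = True
        self._expired = False

    def set_disconnected(self):
        with self._cond:
            self._connected = False

    def set_connected(self):
        with self._cond:
            self._connected = True
            self._cond.notify_all()

    def set_expired(self):
        with self._cond:
            self._expired = True
            self._cond.notify_all()

    def run(self, operation):
        """Wait until connected, then call operation(); raise on expiry."""
        with self._cond:
            self._cond.wait_for(lambda: self._connected or self._expired)
            if self._expired:
                raise SessionExpired()
        # Sketch only: the connection could drop again before this call runs.
        return operation()
```

The caller sees just two outcomes, a result or `SessionExpired`, which is the simpler contract argued for here. The cost is that a call can stall for as long as the session might still be alive.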