From: Flavio Junqueira
To: user@zookeeper.apache.org
Subject: Re: Doubts about libzookeeper
Date: Tue, 4 Aug 2015 22:15:01 +0100

If the client isn't sure that the delete has gone through, just do it
again once reconnected (to server 2 in the scenario described). Whatever
response you get for the delete should determine what you need to do.

-Flavio

> On 04 Aug 2015, at 22:11, Alexander Shraer wrote:
>
> maybe 1 or 2 synctime, is enough given what you said about syncs -
> after 1 synctime we know that either server1 disconnected (and will
> have to bootstrap its state from the leader if it ever reconnects) or
> the request got to the leader.
> But since synctime may not be measured exactly from our request
> submission, it may be that 2 synctime are needed. Would need to look
> deeper into pings and synctime to tell for sure.
>
> On Tue, Aug 4, 2015 at 2:05 PM, Camille Fournier wrote:
>
>> That's true. I spent some time trying to think about when and how
>> that would be possible, and didn't get very far. We have guarantees
>> about how far out of sync a quorum member can be before it's booted,
>> so I would think that there's some way to timebound this potentially
>> to prevent it, a la your suggestion about 3X synctime.
>>
>> C
>>
>> On Tue, Aug 4, 2015 at 4:58 PM, Alexander Shraer wrote:
>>
>>> Yes, I checked and you're right. It gets queued at the leader until
>>> all previously proposed requests at the leader are committed. But
>>> still, if the request is only on its way between server 1 and the
>>> leader, sync won't immediately help, right?
>>>
>>> On Tue, Aug 4, 2015 at 11:39 AM, Camille Fournier wrote:
>>>
>>>> I thought that sync forced a flush of the queued events on a
>>>> quorum member before completing/got it in the path of events from
>>>> the leader, so that it won't return until all of the pending
>>>> leader events before it have been seen by this quorum member. Is
>>>> that not correct?
>>>>
>>>> On Tue, Aug 4, 2015 at 2:20 PM, Alexander Shraer wrote:
>>>>
>>>>> It seems that since the delete may be in-flight (between server 1
>>>>> and leader, or still being proposed by the leader) when the
>>>>> client connects to server 2, doing a sync right away may not help
>>>>> since the operation hasn't been committed yet. Perhaps the client
>>>>> should wait some multiple of synclimit time (3x ?) before
>>>>> invoking the sync to allow the delete to commit or disappear for
>>>>> sure.
>>>>> This is all related to
>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-22, which is
>>>>> still open unfortunately...
>>>>>
>>>>> On Tue, Aug 4, 2015 at 10:15 AM, Camille Fournier
>>>>> <camille@apache.org> wrote:
>>>>>
>>>>>> True, I'm not sure when the xid increments. If that is the case,
>>>>>> you can force a sync before the read of the path, to prevent
>>>>>> reading stale data. So that would be the solve for that edge
>>>>>> case although it's an expensive solve.
>>>>>>
>>>>>> C
>>>>>>
>>>>>> On Tue, Aug 4, 2015 at 12:52 PM, Alexander Shraer
>>>>>> <shralex@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Camille,
>>>>>>>
>>>>>>> if the client received a response for the delete then sure it
>>>>>>> shouldn't be able to connect to servers that didn't see it. But
>>>>>>> if it disconnected before seeing the response the example seems
>>>>>>> possible to me. I haven't checked the code to see when exactly
>>>>>>> the transaction number is incremented at the client, so I may
>>>>>>> be wrong, but suppose for example that zkserver-1 crashes
>>>>>>> before sending the delete request to the leader. Then, the
>>>>>>> request is gone forever. If you don't let the client connect to
>>>>>>> another server that hasn't seen the delete, the client will
>>>>>>> never be able to connect. So it seems quite possible that it
>>>>>>> connects, then the request is executed (if zkserver-1 hasn't
>>>>>>> crashed after all) and the znode disappears.
>>>>>>>
>>>>>>> Alex
>>>>>>>
>>>>>>> On Tue, Aug 4, 2015 at 8:33 AM, Camille Fournier
>>>>>>> <camille@apache.org> wrote:
>>>>>>>
>>>>>>>> ZooKeeper provides a session-coherent single system image
>>>>>>>> guarantee. Any request from the same session will see the
>>>>>>>> results of all of its writes, regardless of which server it
>>>>>>>> connects to.
>>>>>>>> See:
>>>>>>>> http://zookeeper.apache.org/doc/r3.4.6/zookeeperProgrammers.html#ch_zkGuarantees
>>>>>>>>
>>>>>>>> So, if your session deletes, and the delete is successfully
>>>>>>>> processed by the quorum, you will not see the path that you
>>>>>>>> have deleted no matter what server your session connects to. I
>>>>>>>> believe in practice that this means that the ZK servers that
>>>>>>>> might be behind your session (say server 2 is lagging behind a
>>>>>>>> few commits) will refuse to allow your session to connect to
>>>>>>>> it, so that you will not see stale data.
>>>>>>>>
>>>>>>>> This means that the example Lokesh gave:
>>>>>>>>
>>>>>>>> "1. Quorum leader has forwarded request to zkserver-2 for
>>>>>>>> "delete /path".
>>>>>>>> 2. If your client connects to "zkserver-2" after step 1 is
>>>>>>>> executed (get /path). Then your "/path" will not be available.
>>>>>>>> 3. If your client connects to "zkserver-2" before step1 is
>>>>>>>> executed (get /path) then your "/path" would be available and
>>>>>>>> after some time your path would not be available (after
>>>>>>>> zkserver-2 is synched with the leader)"
>>>>>>>>
>>>>>>>> Cannot happen, so long as you are in the same session.
>>>>>>>>
>>>>>>>> C
>>>>>>>>
>>>>>>>> On Tue, Aug 4, 2015 at 6:49 AM, Lokesh Shrivastava
>>>>>>>> <lokesh.shrivastava@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I think it depends on whether your request reaches zkserver-1
>>>>>>>>> and whether it is able to send the request to the quorum
>>>>>>>>> leader. Considering that the "delete /path" request has
>>>>>>>>> reached the quorum leader, then the following may happen:
>>>>>>>>>
>>>>>>>>> 1. Quorum leader has forwarded request to zkserver-2 for
>>>>>>>>> "delete /path".
>>>>>>>>> 2. If your client connects to "zkserver-2" after step 1 is
>>>>>>>>> executed (get /path). Then your "/path" will not be
>>>>>>>>> available.
>>>>>>>>> 3. If your client connects to "zkserver-2" before step1 is
>>>>>>>>> executed (get /path) then your "/path" would be available and
>>>>>>>>> after some time your path would not be available (after
>>>>>>>>> zkserver-2 is synched with the leader)
>>>>>>>>>
>>>>>>>>> Others can correct me if this is not how it works.
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>> Lokesh
>>>>>>>>>
>>>>>>>>> On 4 August 2015 at 12:09, liangdong01@baidu.com
>>>>>>>>> <liangdong01@baidu.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>> I'm thinking about a program design with libzookeeper; here
>>>>>>>>>> are my doubts:
>>>>>>>>>>
>>>>>>>>>> 1) First, I connect to zkserver-1, and there exists the path
>>>>>>>>>> "/path".
>>>>>>>>>> 2) I send "delete /path". The request may or may not reach
>>>>>>>>>> zkserver-1 (I don't know), I don't know whether it took
>>>>>>>>>> effect, and then I lose the connection before the response
>>>>>>>>>> returns.
>>>>>>>>>> 3) I reconnect the same session to zkserver-2 and send
>>>>>>>>>> "get /path".
>>>>>>>>>>
>>>>>>>>>> Which of these can the "get /path" possibly return:
>>>>>>>>>> 1, "not exists"
>>>>>>>>>> 2, "exists" and "always exists"
>>>>>>>>>> 3, "exists" and "not exists" afterwards
>>>>>>>>>>
>>>>>>>>>> My biggest problem is whether 3) will occur or not, thanks!
>>>>>>>>>>
>>>>>>>>>> liangdong01@baidu.com
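
Flavio's advice at the top of the thread (retry the delete after
reconnecting, then act on whatever response comes back) can be sketched
in C, the language of libzookeeper. This is a minimal illustration, not
code from the thread. The enum names and the function
after_retried_delete are hypothetical stand-ins so the sketch compiles
without the client library, and the mapping from result to action is one
reasonable reading of the advice rather than a documented recipe.

```c
/* Sketch only: stand-in result codes, not the constants from zookeeper.h. */
typedef enum {
    RC_OK,              /* the retried delete succeeded */
    RC_NO_NODE,         /* the server reports that the znode does not exist */
    RC_CONNECTION_LOSS, /* the connection dropped again before any response */
    RC_OTHER            /* any other failure */
} delete_rc;

/* What the client does next (hypothetical names). */
typedef enum {
    STEP_DONE,                  /* "/path" is gone, nothing more to do */
    STEP_RETRY_AFTER_RECONNECT, /* outcome still unknown, try again later */
    STEP_REPORT_ERROR           /* surface the failure to the caller */
} next_step;

/* Decide the next step from the response to the retried delete. */
next_step after_retried_delete(delete_rc rc)
{
    switch (rc) {
    case RC_OK:
        /* The retry itself removed the znode. */
        return STEP_DONE;
    case RC_NO_NODE:
        /* The znode is already gone, for example because the first
           delete did go through. Either way the goal is met. */
        return STEP_DONE;
    case RC_CONNECTION_LOSS:
        /* Same uncertainty as before: reconnect and retry once more. */
        return STEP_RETRY_AFTER_RECONNECT;
    default:
        return STEP_REPORT_ERROR;
    }
}
```

In a real client the input would presumably be derived from the return
code of the delete call (for example zoo_delete in the synchronous C
API, with codes such as ZOK, ZNONODE and ZCONNECTIONLOSS). Treating "no
node" as success assumes the caller only cares that "/path" no longer
exists, not which request removed it.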