Reply-To: user@zookeeper.apache.org
Date: Thu, 16 Jul 2015 08:02:06 -0500
From: Jordan Zimmerman <jordan@jordanzimmerman.com>
To: Ivan Kelly <ivank@apache.org>, user@zookeeper.apache.org
Cc: "zookeeper-user@hadoop.apache.org" <zookeeper-user@hadoop.apache.org>
Message-ID: <etPan.55a7ab4e.3ac12af2.15a@Jordans-MacBook-Pro.local>
In-Reply-To: 
 <CAJdLeK3E8Vc_00HifFj11TgJd-qfxJkbX+tMVU_0Tg5rHpL4Fw@mail.gmail.com>
References: <1436982861611-7581277.post@n2.nabble.com>
 <CABWqe2YKsbBtUb93FEdd23ffOJWQc+wbbwhZbVEeME9b9z-=Rg@mail.gmail.com>
 <1436984221201-7581279.post@n2.nabble.com>
 <etPan.55a6a432.3ca70d1c.15a@Jordans-MacBook-Pro.local>
 <CABWqe2acet9LYR8W4=bqJCwxmo+ezQdcOkr7BLRCuybOXZUKyw@mail.gmail.com>
 <CABWqe2ZEFzx8O3i4zEEBXFECX73UT-7prGVv-TzaZTJAfpnYTQ@mail.gmail.com>
 <1436986588198-7581284.post@n2.nabble.com>
 <etPan.55a6ae21.254e917.15a@Jordans-MacBook-Pro.local>
 <1436987312991-7581287.post@n2.nabble.com>
 <etPan.55a6b0aa.66f6db1d.15a@Jordans-MacBook-Pro.local>
 <1436987748561-7581293.post@n2.nabble.com>
 <CAJdLeK0GGUM30OnDAT5OoN9fFT3=ux6kDnA_JU8MJDAs6Vn9+g@mail.gmail.com>
 <etPan.55a6bdbd.21af473b.15a@Jordans-MacBook-Pro.local>
 <CANcXBFMQUgheWXTzSoHLUgn_+WqC7f3i2KKTObczbnpT4X6ZEg@mail.gmail.com>
 <etPan.55a6c1dd.2aa6331d.15a@Jordans-MacBook-Pro.local>
 <CANcXBFP1kyU6mtFDgwahhmiH299BGt85v0uFCj1Zz7QPVLFpOw@mail.gmail.com>
 <etPan.55a6cd0e.298e9216.15a@Jordans-MacBook-Pro.local>
 <CANcXBFMfsi4+Ay7=CX4W+m23a4fHrwTE8+cnT29uGwO0A-P5Ng@mail.gmail.com>
 <CAJdLeK2ZLHJ-MKHHndvEhYKszaprXS8DqrpAwB_0w5Ex53jGzA@mail.gmail.com>
 <etPan.55a797aa.1bddefb8.15a@Jordans-MacBook-Pro.local>
 <CAJdLeK3E8Vc_00HifFj11TgJd-qfxJkbX+tMVU_0Tg5rHpL4Fw@mail.gmail.com>
Subject: Re: locking/leader election and dealing with session loss
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="55a7ab4e_18106528_15a"

--55a7ab4e_18106528_15a
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

Of course there are a myriad theoretical possibilities. But I don’t
believe any of what you’ve mentioned will happen in production. For any
reasonable case, you can be guaranteed that no two processes will
consider themselves lock holders at the same instant in time.

-Jordan


On July 16, 2015 at 7:58:06 AM, Ivan Kelly (ivank@apache.org) wrote:

On Thu, Jul 16, 2015 at 1:38 PM Jordan Zimmerman <jordan@jordanzimmerman.com>
wrote:

> Are you really seeing 30s gc pauses in production? If so, then of course
> this could happen. However, if your application can tolerate a 30s pause
> (which is hard to believe) then your session timeout is too low. The point
> of the session timeout is to have enough coverage. So, if your app has 30
> seconds allowable pauses your session timeout would have to be much longer.
>
GC is just an example. There are other ways the same scenario could happen.
The machine could swap out the process due to load. Someone could do
something stupid in the zookeeper event thread and the session expired
event is delayed. The state update could have hit the ip stack during a
network partition, and the process then got wedged. The state update packet
could have hit the network and been routed via the moon. The clock could
break.

If you are relying on a timer on the zk client to maintain a guarantee,
then you really aren't giving any guarantee because the zk client doesn't
have control over all the things that could go wrong.

-Ivan
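
The disagreement above can be sketched as a toy simulation. Everything
in it (FakeClock, LockService, simulate, the 10s timeout) is a
hypothetical stand-in chosen for illustration and is not the ZooKeeper
client API. It only models one stalled lock holder whose local belief
is not updated while the service expires its session.

```python
class FakeClock:
    """Deterministic clock so the scenario can be replayed exactly."""

    def __init__(self):
        self.now = 0.0

    def advance(self, seconds):
        self.now += seconds


class LockService:
    """Toy coordination service (assumption: NOT the ZooKeeper API)."""

    def __init__(self, clock, session_timeout):
        self.clock = clock
        self.session_timeout = session_timeout
        self.holder = None
        self.last_heartbeat = 0.0

    def _expire_if_stale(self):
        # The service drops the holder once its session has timed out.
        stale = self.clock.now - self.last_heartbeat > self.session_timeout
        if self.holder is not None and stale:
            self.holder = None

    def try_acquire(self, client):
        self._expire_if_stale()
        if self.holder is None:
            self.holder = client
            self.last_heartbeat = self.clock.now
            return True
        return False


def simulate(pause, session_timeout=10.0):
    """Return the clients that believe they hold the lock after the pause."""
    clock = FakeClock()
    service = LockService(clock, session_timeout)
    beliefs = {"A": False, "B": False}

    beliefs["A"] = service.try_acquire("A")  # t=0: A takes the lock
    # A stalls (GC, swap, wedged thread): none of its code runs, so its
    # local belief cannot be corrected during the pause.
    clock.advance(pause)
    beliefs["B"] = service.try_acquire("B")  # B tries once A may have expired
    # A resumes here and acts on the belief it formed before the pause.
    return sorted(name for name, held in beliefs.items() if held)


if __name__ == "__main__":
    print(simulate(pause=5.0))   # pause shorter than the session timeout
    print(simulate(pause=30.0))  # pause longer than the session timeout
```

With a pause shorter than the session timeout only A believes it holds
the lock. With a longer pause both A and B do, which is the case Ivan
argues a client-side timer cannot rule out and Jordan argues a
sufficiently long session timeout makes unrealistic in production.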

--55a7ab4e_18106528_15a--