Mailing-List: contact user-help@curator.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@curator.apache.org
MIME-Version: 1.0
Reply-To: cammckenzie@apache.org
In-Reply-To: 
 <CAHA5whqTDU=2_kNARZdeQ8ze6TFRONTS0qda1gXdxex6YzVbXw@mail.gmail.com>
References: 
 <CAHA5whrOrDg8s431TXUEiYHqR20AW4Vi3+tRLwESGqCLspFTXQ@mail.gmail.com>
	<CANykduYHK-EbsPveNPvRLegKrTa1+-Xs-cAjQVQhQF2roiW2AQ@mail.gmail.com>
	<CAHA5whqTDU=2_kNARZdeQ8ze6TFRONTS0qda1gXdxex6YzVbXw@mail.gmail.com>
Date: Thu, 19 Nov 2015 10:23:35 +1100
Message-ID: 
 <CANykduY2jAAYbQ4Si59EY2avVYbdnd4u-ngoB5HDqu6hAsGKyw@mail.gmail.com>
Subject: Re: Query on SESSION_LOST (3.0.0)
From: Cameron McKenzie <cammckenzie@apache.org>
To: Vikrant Singh <vikrant.subscribe@gmail.com>
Cc: user@curator.apache.org
Content-Type: multipart/alternative; boundary=94eb2c0b73166b480f0524d8eeb6

--94eb2c0b73166b480f0524d8eeb6
Content-Type: text/plain; charset=UTF-8

Not necessarily false alarms. The LOST event just didn't necessarily mean
session loss, only that Curator had given up trying to reconnect.

With 3.0.0 the LOST event will occur in two cases. If a connection is
available, it occurs when ZooKeeper explicitly tells Curator that the
session has expired. If no connection to ZooKeeper is available, Curator
publishes a LOST event when it thinks that the session has been lost, based
on a timer and the session timeout negotiated with ZooKeeper.
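
To make the timer-based case concrete, here is a rough sketch of the idea in
plain Java. It is not Curator's actual code: the class SessionLossTimer, its
method names and the State enum are hypothetical names made up for
illustration, and the clock is passed in as a parameter just to keep the
example easy to follow. It only assumes that the negotiated session timeout
is known and that the client notices when the connection drops.

```java
/**
 * Hypothetical sketch (not Curator's implementation) of deciding when a
 * disconnected client should assume its ZooKeeper session has expired.
 */
final class SessionLossTimer {
    enum State { CONNECTED, SUSPENDED, LOST }

    private final long sessionTimeoutMs; // negotiated session timeout, assumed known
    private long disconnectedAtMs = -1;  // -1 while connected
    private State state = State.CONNECTED;

    SessionLossTimer(long sessionTimeoutMs) {
        if (sessionTimeoutMs <= 0) {
            throw new IllegalArgumentException("sessionTimeoutMs must be positive");
        }
        this.sessionTimeoutMs = sessionTimeoutMs;
    }

    /** Connection dropped: start the clock. The session may still be alive on the server. */
    State onDisconnected(long nowMs) {
        if (state == State.CONNECTED) {
            disconnectedAtMs = nowMs;
            state = State.SUSPENDED;
        }
        return state;
    }

    /** Case 1: a connection is present and the server says the session has expired. */
    State onSessionExpired() {
        state = State.LOST;
        return state;
    }

    /** Case 2: no connection; call periodically and compare elapsed time to the timeout. */
    State tick(long nowMs) {
        if (state == State.SUSPENDED && nowMs - disconnectedAtMs >= sessionTimeoutMs) {
            state = State.LOST;
        }
        return state;
    }

    /** Connection is back. After LOST this would be a new session, so ephemeral nodes must be recreated. */
    State onReconnected() {
        disconnectedAtMs = -1;
        state = State.CONNECTED;
        return state;
    }

    public static void main(String[] args) {
        SessionLossTimer timer = new SessionLossTimer(30_000); // 30 s, example value only
        System.out.println(timer.onDisconnected(0)); // connection dropped at t = 0
        System.out.println(timer.tick(10_000));      // 10 s without a connection
        System.out.println(timer.tick(30_000));      // a full session timeout without a connection
    }
}
```

In this model a dropped connection on its own only moves the client to
SUSPENDED. LOST is reached either through an explicit expiry notice or once
a full session timeout has passed without a connection, which is the point
at which ephemeral state on the server should be assumed gone.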


On Thu, Nov 19, 2015 at 10:13 AM, Vikrant Singh <vikrant.subscribe@gmail.com> wrote:

> Thanks a lot for the reply. So if I am understanding it correctly, there
> were false alarms (or mistaken connection-lost events). With 3.0.0,
> CONNECTION_LOST events will happen only when the session is truly lost.
>
> On Wed, Nov 18, 2015 at 1:16 PM, Cameron McKenzie <cammckenzie@apache.org>
> wrote:
>
>> Hey Vikrant,
>> The issue was that the LOST event was being published by Curator when it
>> gave up trying to reconnect to ZooKeeper after connection loss, whereas
>> most people were interpreting it to mean that the session was lost.
>>
>> So, the change in Curator 3.0.0 is that the LOST event will be published
>> either when the session has expired and Curator is explicitly told this by
>> ZooKeeper (implying that a connection is present), or when Curator has been
>> disconnected from ZooKeeper for long enough for the session to have expired
>> on the server (this will occur when no connection to ZooKeeper is present).
>>
>> So, I'm not sure how it will help your case. It is just a more reliable
>> way of knowing that the session is gone and that all related ephemeral
>> state on the ZooKeeper server is gone too.
>>
>> Note that it's also possible to tell Curator to use the legacy way of
>> interpreting the LOST event.
>> cheers
>>
>>
>> On Thu, Nov 19, 2015 at 8:09 AM, Vikrant Singh <
>> vikrant.subscribe@gmail.com> wrote:
>>
>>> Hello All,
>>> I need some guidance on understanding a fix made in the latest
>>> release, 3.0.0. I am talking about the following fix:
>>> https://issues.apache.org/jira/browse/CURATOR-247
>>>
>>> In my project we create some ephemeral nodes and monitor a cluster
>>> through a tree cache. The framework for the tree cache and the ephemeral
>>> nodes is created using ExponentialBackoffRetry with a retry interval of
>>> 1 sec and a retry count of 29 (which is MAX_RETRIES_LIMIT). We kill the
>>> process the moment we get a TreeCacheEvent.Type.CONNECTION_LOST event.
>>>
>>> As a process restart is really expensive, I want to understand how I can
>>> benefit from this fix.
>>>
>>> Please help me understand what the issue is and how it may affect
>>> a setup like ours. We are not on 3.0.0 yet.
>>>
>>> Thanks,
>>> Vikrant
>>>
>>>
>>>
>>
>


--94eb2c0b73166b480f0524d8eeb6--