From: chao chu <chuchao333@gmail.com>
Date: Mon, 10 Mar 2014 23:41:04 +0800
To: user@curator.apache.org
Subject: Re: Leader Latch recovery after suspended state

Hi JZ,

Sorry for being dense here, but my point is this: suppose the leader latch
does NOT call setLeadership(false) on receiving SUSPENDED; then what you
mentioned below won't happen, right?

>>> Further, if zk2/zk3 are communicating then they are in Quorum.
>>> Therefore, latch2 will declare that it is leader

This way we can avoid the unnecessary leader switch caused by the transient
latch1 <--> zk1 connection loss (e.g., zk1 somehow rebooting), as long as
latch1 can re-connect to the zk ensemble in time.
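Put differently, I'm imagining state handling along the lines of the sketch
below. This is illustrative only, not the actual LeaderLatch internals:
ourPath and hasLeadership stand in for the latch's private state, and the
checkExists() call plays the role of its checkLeadership.

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.state.ConnectionState;
    import org.apache.curator.framework.state.ConnectionStateListener;

    // Sketch: do NOT drop leadership on SUSPENDED; on RECONNECTED, keep it
    // only if our ephemeral znode survived (i.e. the session never expired
    // while we were disconnected).
    public class LenientLatchListener implements ConnectionStateListener
    {
        private final String ourPath;           // znode created by this latch
        private volatile boolean hasLeadership; // stand-in for the latch's internal flag

        public LenientLatchListener(String ourPath, boolean isCurrentLeader)
        {
            this.ourPath = ourPath;
            this.hasLeadership = isCurrentLeader;
        }

        @Override
        public void stateChanged(CuratorFramework client, ConnectionState newState)
        {
            switch ( newState )
            {
                case SUSPENDED:
                    // proposed: don't give up leadership yet; the ephemeral
                    // znode survives as long as the session hasn't expired
                    break;

                case RECONNECTED:
                    try
                    {
                        // keep leadership only if our znode is still there,
                        // i.e. we reconnected within the session timeout
                        hasLeadership = hasLeadership && (client.checkExists().forPath(ourPath) != null);
                    }
                    catch ( Exception e )
                    {
                        hasLeadership = false;  // can't verify - assume the worst
                    }
                    break;

                case LOST:
                    hasLeadership = false;      // session expired: the znode is gone
                    break;

                default:
                    break;
            }
        }

        public boolean hasLeadership()
        {
            return hasLeadership;
        }
    }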
On Mon, Mar 10, 2014 at 11:28 PM, Jordan Zimmerman <jordan@jordanzimmerman.com> wrote:

> In step 3, during the time that zk1 is partitioned, latch1 cannot know
> whether it is leader or not. Further, if zk2/zk3 are communicating then
> they are in Quorum. Therefore, latch2 will declare that it is leader. So,
> latch1 MUST go into a non-leader state. It doesn't matter if latch1
> reconnects before the session expires; the issue is how it manages its
> state while it is partitioned. To your point, latch1 could check whether
> it is still leader when it reconnects, but I'm not sure that there's a
> huge win here. Any clients of the latch will have to have handled the
> leadership loss during the partition.
>
> -JZ
>
> From: chao chu <chuchao333@gmail.com>
> Reply: user@curator.apache.org
> Date: March 10, 2014 at 10:21:29 AM
> To: user@curator.apache.org
> Subject: Re: Leader Latch recovery after suspended state
>
> Thanks for your reply first.
>
> >>> Technically, this is already the case. When a RECONNECTED is
> >>> received, LeaderLatch will attempt to regain leadership. The problem
> >>> is that when there is a network partition there is no way to
> >>> guarantee that you are still the leader. If there is Quorum in
> >>> another segment of the cluster a new leader might be elected there.
>
> I don't quite understand what you meant here, but let me explain the case
> I mentioned in detail:
>
> 1. let's say there is a 3-server zk ensemble {zk1, zk2, zk3} and two
>    participants in the leader election, {latch1, latch2}
> 2. latch1 is the current leader and is connected to zk1
> 3. the connection latch1 <--> zk1 breaks somehow
> 4. but soon (within the session timeout), latch1 re-connects (maybe to zk2)
>
> I guess the problem is that LeaderLatch calls setLeadership(false)
> (implying that there must be a leader change) as soon as it sees a
> SUSPENDED state (actually a ZK DISCONNECTED event).
>
> In this case, ideally (just my personal thinking), since latch1
> re-connected in time, its znode will still be there and no other
> participant (latch2 in this example) will observe any events because of
> the disconnect. Once re-connected, if the latch detects that its znode is
> still there, checkLeadership should still mark it as leader, and there
> would be no leader change (no 'isLeader' or 'notLeader' callbacks) during
> the whole process. Thus we could avoid the unnecessary leader switch
> (which, as I mentioned, can be very expensive in most cases).
>
> Does this make any sense to you? Thanks.
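(To make steps 1-4 above concrete, here is a rough, untested harness along
the lines of what I'm imagining -- the connection string and latch path are
placeholders for a real 3-server ensemble:

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.CuratorFrameworkFactory;
    import org.apache.curator.framework.recipes.leader.LeaderLatch;
    import org.apache.curator.framework.recipes.leader.LeaderLatchListener;
    import org.apache.curator.retry.ExponentialBackoffRetry;

    // Starts latch1 and latch2 against the ensemble and logs every
    // isLeader/notLeader callback. Bouncing the server the current leader
    // is attached to (step 3) shows whether a leader switch happens.
    public class LatchScenario
    {
        public static void main(String[] args) throws Exception
        {
            String ensemble = "zk1:2181,zk2:2181,zk3:2181";  // placeholder hosts
            for ( final String id : new String[]{ "latch1", "latch2" } )
            {
                CuratorFramework client = CuratorFrameworkFactory.newClient(
                    ensemble, new ExponentialBackoffRetry(1000, 3));
                client.start();

                LeaderLatch latch = new LeaderLatch(client, "/test/leader-latch", id);
                latch.addListener(new LeaderLatchListener()
                {
                    @Override
                    public void isLeader()
                    {
                        System.out.println(id + ": isLeader");
                    }

                    @Override
                    public void notLeader()
                    {
                        System.out.println(id + ": notLeader");
                    }
                });
                latch.start();
            }
            Thread.sleep(Long.MAX_VALUE);  // restart zk1 now and watch the callbacks
        }
    }

If I read the current code correctly, latch1 prints notLeader as soon as it
is SUSPENDED, even when it reconnects well within the session timeout.)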
> On Mon, Mar 10, 2014 at 10:52 PM, Jordan Zimmerman <jordan@jordanzimmerman.com> wrote:
>
>> Please provide an implementation/fix and submit a pull request on
>> GitHub.
>>
>> I also have a related question about not only re-using the znode: imho,
>> it would be great if LeaderLatch could survive a temporary
>> ConnectionLossException (i.e., one due to a transient network issue).
>>
>> Technically, this is already the case. When a RECONNECTED is received,
>> LeaderLatch will attempt to regain leadership. The problem is that when
>> there is a network partition there is no way to guarantee that you are
>> still the leader. If there is Quorum in another segment of the cluster a
>> new leader might be elected there.
>>
>> -JZ
>>
>> From: chao chu <chuchao333@gmail.com>
>> Reply: user@curator.apache.org
>> Date: March 10, 2014 at 9:39:50 AM
>> To: user@curator.incubator.apache.org
>> Subject: Re: Leader Latch recovery after suspended state
>>
>> Hi,
>>
>> Just want to see if there is any progress on this?
>>
>> I also have a related question about not only re-using the znode: imho,
>> it would be great if LeaderLatch could survive a temporary
>> ConnectionLossException (i.e., one due to a transient network issue).
>>
>> I guess in most cases the context switch due to leader re-election is
>> quite expensive; we might not want to pay for it just because of some
>> transient issue. If the current leader can re-connect within the session
>> timeout, it should still hold the leadership, and no leader change should
>> happen in between. The rationale is similar to the difference between
>> ConnectionLossException (which is recoverable) and SessionExpiredException
>> (which is not).
>>
>> What are your thoughts on this? Thanks a lot!
>>
>> Regards,
>>
>> On Wed, Aug 21, 2013 at 2:05 AM, Jordan Zimmerman <jordan@jordanzimmerman.com> wrote:
>>
>>> Yes, I was suggesting how to patch Curator.
>>>
>>> On Aug 20, 2013, at 10:59 AM, Calvin Jia <jia.calvin@gmail.com> wrote:
>>>
>>> So currently this is not supported in the Curator library, but the
>>> Curator library (specifically LeaderLatch's reset method) is the
>>> correct/logical place to add this feature if I want it?
>>>
>>> On Tue, Aug 20, 2013 at 10:34 AM, Jordan Zimmerman <jordan@jordanzimmerman.com> wrote:
>>>
>>>> On reset() it could check to see if its node still exists. It would
>>>> make the code a lot more complicated, though.
>>>>
>>>> -JZ
>>>>
>>>> On Aug 20, 2013, at 10:25 AM, Calvin Jia <jia.calvin@gmail.com> wrote:
>>>>
>>>> A leader latch enters the suspended state after failing to receive a
>>>> response from the first ZK machine it heartbeats to (this takes
>>>> two-thirds of the timeout). For the last third, it tries to contact
>>>> another ZK machine. If that succeeds, it enters the RECONNECTED state.
>>>>
>>>> However, on reconnect, even though the original node it created in ZK
>>>> is still there, it will create another ephemeral-sequential node (the
>>>> reset method is called). This means it will relinquish leadership if
>>>> there is another machine with a latch on the same path.
>>>>
>>>> Is there any way to reconnect and reuse the original ZK node?
>>>>
>>>> Thanks!
>>
>> --
>> ChuChao
>
> --
> ChuChao

--
ChuChao
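P.S. For the record, here is a rough sketch of the reset()-time check Jordan
suggested in the August thread. This is hypothetical code, not the real
LeaderLatch internals: ourPath and createNode() stand in for the latch's
private field and its protected ephemeral-sequential create.

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.zookeeper.CreateMode;

    // Sketch: before creating a new ephemeral-sequential node, check
    // whether the node from before the disconnect is still present and,
    // if so, reuse it instead of relinquishing leadership.
    public class ReusableReset
    {
        private final CuratorFramework client;
        private volatile String ourPath;  // znode created before the disconnect, if any

        public ReusableReset(CuratorFramework client)
        {
            this.client = client;
        }

        public void reset() throws Exception
        {
            if ( ourPath != null && client.checkExists().forPath(ourPath) != null )
            {
                return;  // our old znode survived the disconnect: keep it
            }
            ourPath = createNode();
        }

        private String createNode() throws Exception
        {
            // placeholder path; the real latch's create is equivalent in spirit
            return client.create()
                .creatingParentsIfNeeded()
                .withProtection()
                .withMode(CreateMode.EPHEMERAL_SEQUENTIAL)
                .forPath("/test/leader-latch/latch-");
        }
    }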