curator-user mailing list archives

From chao chu <chuchao...@gmail.com>
Subject Re: Leader Latch recovery after suspended state
Date Mon, 10 Mar 2014 14:43:39 GMT
Sorry for the spam; actually adding user@curator.apache.org this time.


On Mon, Mar 10, 2014 at 10:42 PM, chao chu <chuchao333@gmail.com> wrote:

> + user@curator.apache.org
>
> The original mail thread was last updated back in the incubator days :)
>
>
> On Mon, Mar 10, 2014 at 10:38 PM, chao chu <chuchao333@gmail.com> wrote:
>
>> Hi,
>>
>> Just wanted to check whether there has been any progress on this.
>>
>> I also have a related question: beyond re-using the znode, IMHO it would
>> be great if LeaderLatch could survive a temporary
>> ConnectionLossException (e.g., due to a transient network issue).
>>
>> I guess that in most cases the context switch caused by a leader
>> re-election is quite expensive, so we might not want to trigger it just
>> because of a transient issue. If the current leader can reconnect within
>> the session timeout, it should keep the leadership, and no leader change
>> should happen in between. The rationale is similar to the difference between
>> ConnectionLossException (which is recoverable) and SessionExpiredException
>> (which is not recoverable).
>>
>> What are your thoughts on this? Thanks a lot!
>>
>> Regards,
>>
>>
>> On Wed, Aug 21, 2013 at 2:05 AM, Jordan Zimmerman <
>> jordan@jordanzimmerman.com> wrote:
>>
>>> Yes, I was suggesting how to patch Curator.
>>>
>>> On Aug 20, 2013, at 10:59 AM, Calvin Jia <jia.calvin@gmail.com> wrote:
>>>
>>> So this is currently not supported in the Curator library, but the Curator
>>> library (specifically LeaderLatch's reset method) is the correct/logical
>>> place to add this feature if I want it?
>>>
>>>
>>> On Tue, Aug 20, 2013 at 10:34 AM, Jordan Zimmerman <
>>> jordan@jordanzimmerman.com> wrote:
>>>
>>>> On reset() it could check to see if its node still exists. It would
>>>> make the code a lot more complicated though.
>>>>
>>>> -JZ
>>>>
>>>> On Aug 20, 2013, at 10:25 AM, Calvin Jia <jia.calvin@gmail.com> wrote:
>>>>
>>>> A leader latch enters the suspended state after failing to receive a
>>>> response from the first ZK machine it heartbeats to (this takes two thirds
>>>> of the timeout). For the last third, it tries to contact another ZK
>>>> machine. If that succeeds, it enters the reconnected state.
>>>>
>>>> However, on reconnect, despite the fact that the original node it created
>>>> in ZK is still there, it will create another ephemeral-sequential node (the
>>>> reset method is called). This means it will relinquish leadership if there
>>>> is another machine with a latch on the same path.
>>>>
>>>> Is there any way to reconnect and reuse the original ZK node?
>>>>
>>>> Thanks!
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>> --
>> ChuChao
>>
>
>
>
> --
> ChuChao
>
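
To make the idea in the quoted thread concrete, here is a rough, self-contained sketch of the behaviour being asked for: on reconnect, check whether the latch's own node still exists and only create a new one if it is gone. This is not Curator code and not how LeaderLatch is implemented. FakeLatchPath, ReusingLatch, LinkState and expireSession are hypothetical names made up for illustration, and the "ZooKeeper" here is just an in-memory map.

```java
import java.util.Optional;
import java.util.TreeMap;

// Local stand-in for the connection states discussed in the thread
// (hypothetical; this is not Curator's own enum).
enum LinkState { SUSPENDED, RECONNECTED, LOST }

// Hypothetical in-memory model of the latch path:
// sequence number -> id of the participant that created the node.
final class FakeLatchPath {
    private final TreeMap<Integer, String> nodes = new TreeMap<>();
    private int nextSeq = 0;

    // Models creating an ephemeral-sequential node.
    int createSequential(String participantId) {
        int seq = nextSeq++;
        nodes.put(seq, participantId);
        return seq;
    }

    boolean exists(int seq) {
        return nodes.containsKey(seq);
    }

    // Models session expiry: the server drops the participant's ephemeral nodes.
    void expireSession(String participantId) {
        nodes.values().removeIf(participantId::equals);
    }

    // The participant owning the lowest sequence number is the leader.
    Optional<String> leader() {
        return nodes.isEmpty()
                ? Optional.empty()
                : Optional.of(nodes.firstEntry().getValue());
    }
}

// Hypothetical latch that reuses its node after a transient disconnect.
final class ReusingLatch {
    private final FakeLatchPath path;
    private final String id;
    private int seq;
    private boolean connected = true;

    ReusingLatch(FakeLatchPath path, String id) {
        this.path = path;
        this.id = id;
        this.seq = path.createSequential(id);
    }

    void stateChanged(LinkState state) {
        switch (state) {
            case SUSPENDED:
            case LOST:
                // While disconnected we cannot be sure we still lead.
                connected = false;
                break;
            case RECONNECTED:
                connected = true;
                // Only create a new node if the old one is really gone.
                if (!path.exists(seq)) {
                    seq = path.createSequential(id);
                }
                break;
        }
    }

    boolean hasLeadership() {
        return connected && path.leader().map(id::equals).orElse(false);
    }
}
```

With two latches "a" and "b" on the same path, a SUSPENDED followed by RECONNECTED leaves "a" as leader because its node is reused. If the session expires in between (modelled by expireSession("a")), "a" gets a new, higher sequence number on reconnect and "b" takes over. As noted in the thread, doing this for real inside reset() would make the code a lot more complicated.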



-- 
ChuChao
