curator-user mailing list archives

From Eric Tschetter <eched...@gmail.com>
Subject Re: Real world usage of the recipe LeaderLatch
Date Tue, 02 Jul 2013 18:01:46 GMT
Yeah, when that happens on LeaderLatch, you lose leadership.  We use the
listener mechanism to take and lose leadership, so if there is a prolonged
lack of a connection, then the node just won't be the leader.

--Eric
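
For illustration, the approach discussed in this thread (taking and losing leadership through listeners, and treating SUSPENDED and LOST the same way) might look roughly like the sketch below. It is not Druid's or Curator's code. `ConnState` is a local stand-in that only mirrors the connection state names mentioned here, and `LeadershipGuard` with its methods is a hypothetical helper. In a real application these methods would presumably be called from Curator's latch and connection-state listener callbacks.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Local stand-in for the connection states named in this thread.
// Defined here (an assumption) so the sketch compiles without Curator.
enum ConnState { CONNECTED, SUSPENDED, RECONNECTED, LOST }

// Hypothetical helper: tracks whether this node may run leader-only work.
final class LeadershipGuard {
    private final AtomicBoolean leader = new AtomicBoolean(false);
    private final Runnable startLeaderWork;
    private final Runnable stopLeaderWork;

    LeadershipGuard(Runnable startLeaderWork, Runnable stopLeaderWork) {
        this.startLeaderWork = startLeaderWork;
        this.stopLeaderWork = stopLeaderWork;
    }

    // Call when the latch reports that this node has become the leader.
    void becomeLeader() {
        if (leader.compareAndSet(false, true)) {
            startLeaderWork.run();
        }
    }

    // Call when the latch reports that leadership was lost.
    // Idempotent: the stop hook runs at most once per leadership term.
    void stopBeingLeader() {
        if (leader.compareAndSet(true, false)) {
            stopLeaderWork.run();
        }
    }

    // SUSPENDED and LOST are handled identically: leave the critical section.
    void onConnectionState(ConnState state) {
        switch (state) {
            case SUSPENDED:
            case LOST:
                stopBeingLeader();
                break;
            default:
                // CONNECTED / RECONNECTED: do not resume on our own;
                // wait until the latch elects this node again.
                break;
        }
    }

    boolean isLeader() {
        return leader.get();
    }
}
```

With this shape, a prolonged outage simply leaves the node as a non-leader until `becomeLeader()` is called again after a new election.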


On Tue, Jul 2, 2013 at 9:37 AM, Jordan Zimmerman <jordan@jordanzimmerman.com> wrote:

> My general recommendation is to handle SUSPENDED and LOST in the same way.
> In the case of LeaderLatch, your code should exit whatever critical section
> it executes while it is the leader.
>
> -JZ
>
> On Jul 2, 2013, at 12:29 PM, chao chu <chuchao333@gmail.com> wrote:
>
> Sorry for the confusion. I knew that LeaderLatch will retry when it gets a
> connection 'LOST' event and that one of the participants should be elected
> as the new leader on reconnect. What I meant to ask is *what if it stays
> in 'LOST'* for quite a long time (for example, because the zk ensemble
> becomes unavailable)? I understand that how to handle this situation is
> very application specific; I was just trying to learn how you react to
> this in your code (to see if there are any ideas we can borrow).
>
> Not sure if I explained this clearly enough; thanks a lot for your reply
> though.
>
>
> On Tue, Jul 2, 2013 at 6:07 AM, Eric Tschetter <echeddar@gmail.com> wrote:
>
>> My understanding is that LeaderLatch already handles those cases for
>> you.  The unit tests in TestLeaderLatch definitely have something that
>> tries to test the LOST case.  If there's a case that is not handled, it'd
>> probably be best if you could provide a unit test that shows what's not
>> handled to help shape the conversation.
>>
>> --Eric
>>
>>
>> On Fri, Jun 28, 2013 at 8:03 AM, chao chu <chuchao333@gmail.com> wrote:
>>
>>> Hi Eric,
>>>
>>> Thanks for sharing. Looking at your code, it's not very clear to me
>>> how you handle the 'SUSPENDED' or 'LOST' events of LeaderLatch.
>>> Could you please shed some light here? Thanks
>>>
>>>
>>> On Wed, Jun 26, 2013 at 11:52 PM, Eric Tschetter <echeddar@gmail.com> wrote:
>>>
>>>> ChuChao,
>>>>
>>>> We use it in the Druid project (http://www.github.com/metamx/druid/)
>>>>
>>>> You can see its use in the class com.metamx.druid.master.DruidMaster
>>>>
>>>> The class has a bunch of other stuff in it as well that is not specific
>>>> to the LeaderLatch, but you can just ignore that and see how it handles the
>>>> latch.
>>>>
>>>> --Eric
>>>>
>>>>
>>>> On Wednesday, June 26, 2013, chao chu wrote:
>>>>
>>>>> Thanks a lot for your reply. Could you please name a few open source
>>>>> projects that use LeaderLatch, if you are aware of any? I'd like to
>>>>> take a look at the code.
>>>>>
>>>>> Btw, what about the issues reported in the links I mentioned? Are they
>>>>> actual bugs, or was the recipe just used in an unexpected way?
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Jun 26, 2013 at 7:29 AM, Jordan Zimmerman <
>>>>> jordan@jordanzimmerman.com> wrote:
>>>>>
>>>>>> Curator is being used at major companies (e.g. Netflix, eBay, etc.).
>>>>>> Bugs are quickly fixed when reported. In particular, LeaderLatch is
>>>>>> widely used.
>>>>>> -JZ
>>>>>>
>>>>>>
>>>>>> On Jun 25, 2013, at 11:03 AM, chao chu <chuchao333@gmail.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have been trying to use LeaderLatch to implement leader election
>>>>>> in my project and have written some scripts to simulate situations
>>>>>> where the zk ensemble becomes unstable due to network problems. It
>>>>>> has worked well and as expected so far.
>>>>>>
>>>>>> However, digging into both the zookeeper-users and curator-users
>>>>>> mailing lists, I found some reported bugs/edge cases, like
>>>>>> LeaderLatch bug causing extra znodes appearing in Zookeeper<https://groups.google.com/forum/?fromgroups#!searchin/curator-users/LeaderLatch/curator-users/to8ViZp6p-E/xYbKbzqkZQYJ>
>>>>>> and multiple participants thought they are leader<https://listserv.netflix.com/pipermail/curator-users/2012-October/000201.html>,
>>>>>> which made me worry about the reliability of this recipe.
>>>>>>
>>>>>> So, my question is: are there any real-world projects using this
>>>>>> recipe that have proven its quality, and are there any other
>>>>>> known edge cases or open issues?
>>>>>>
>>>>>>
>>>>>> Thanks & Regards,
>>>>>>
>>>>>> --
>>>>>> ChuChao
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> ChuChao
>>>>>
>>>>
>>>
>>>
>>> --
>>> ChuChao
>>>
>>
>>
>
>
> --
> ChuChao
>
>
>
