lucene-solr-user mailing list archives

From: svante karlsson <s...@csi.se>
Subject: Re: Help needed to understand zookeeper in solrcloud
Date: Thu, 05 Mar 2015 08:45:15 GMT
The network will "only" split if you get errors on your network hardware
(or fiddle with iptables). Let's say you placed your zookeepers in separate
racks and someone pulls the network cable between them - that leaves you
with 5 working servers that can't all reach each other. This is the "split
brain" scenario.
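
To make the majority rule concrete, here is a minimal sketch of the quorum
check (plain Java, just an illustration - not ZooKeeper's actual code):

    class QuorumSketch {
        // A partition side may keep serving only if it can reach strictly
        // more than half of the configured ensemble.
        static boolean hasQuorum(int reachable, int ensembleSize) {
            return reachable > ensembleSize / 2;
        }

        public static void main(String[] args) {
            // 5-node ensemble split 3/2 by a pulled cable:
            System.out.println(hasQuorum(3, 5)); // true  -> this side serves
            System.out.println(hasQuorum(2, 5)); // false -> this side stops
        }
    }

Since only one side can ever hold a strict majority, at most one side keeps
serving - that is exactly what rules out split brain.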

> Are they guaranteed to split 4/0
Yes. A node failure will not partition the network.

> any odd number - it could be 21 even
Since all writes are synchronous you don't want to use too large a number of
zookeepers since that would slow down the cluster. Use a reasonable number
to reach your SLA (3 or 5 are common choices).
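
As a back-of-the-envelope (my arithmetic, following the same majority rule):
an ensemble of n servers needs n / 2 + 1 acks per write and survives
(n - 1) / 2 failures:

    class EnsembleSizing {
        public static void main(String[] args) {
            for (int n : new int[] {3, 5, 21}) {
                int quorum = n / 2 + 1;       // acks needed per write
                int tolerated = (n - 1) / 2;  // failures survivable
                System.out.println(n + " zookeepers: " + quorum
                        + " acks per write, tolerates " + tolerated + " failures");
            }
        }
    }

21 zookeepers would survive 10 failures, but every write would wait on 11
acks - which is why 3 or 5 is the usual choice.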

> and from a single failure you drop to an even number - then there is the
> danger of NOT getting quorum.
No, see above.

BUT, if you first lose some of your nodes due to a network partition (say a
3/2 split, leaving the 3-side in quorum) and then lose another node on the
majority side due to node failure - then you are out of quorum.
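
Tracing that with the hasQuorum sketch from above (hypothetical sequence):

    hasQuorum(5, 5); // true:  all 5 healthy
    hasQuorum(3, 5); // true:  3/2 partition, the 3-side keeps serving
    hasQuorum(2, 5); // false: one node on the 3-side dies -> out of quorum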


/svante



2015-03-05 9:29 GMT+01:00 Julian Perry <jules@limitless.co.uk>:

>
> I start out with 5 zk's.  All good.
>
> One zk fails - I'm left with four.  Are they guaranteed
> to split 4/0 or 3/1 - because if they split 2/2 I'm screwed,
> right?
>
> Surely to start with 5 zk's (or in fact any odd number - it
> could be 21 even), and from a single failure you drop to an
> even number - then there is the danger of NOT getting quorum.
>
> So ... I can only assume that there is a mechanism in place
> inside zk to guarantee this cannot happen, right?
>
> --
> Cheers
> Jules.
>
>
>
> On 05/03/2015 06:47, svante karlsson wrote:
>
>> Yes, as long as it is three (the majority of 5) or more.
>>
>> This is why there is no point in having a 4 node cluster. It would also
>> require 3 nodes for majority, giving it the fault tolerance of a 3 node
>> cluster while being slower and more expensive.
>>
>>
>>
>> 2015-03-05 7:41 GMT+01:00 Aman Tandon <amantandon.10@gmail.com>:
>>
>>> Thanks svante.
>>>
>>> What if in a cluster of 5 zookeepers only 1 zookeeper goes down - can a
>>> zookeeper election still occur with 4 (an even number of) zookeepers
>>> alive?
>>>
>>> With Regards
>>> Aman Tandon
>>>
>>> On Tue, Mar 3, 2015 at 6:35 PM, svante karlsson <saka@csi.se> wrote:
>>>
>>>> synchronous update of state and a requirement of more than half the
>>>> zookeepers alive (and in sync) makes it impossible to have a "split
>>>> brain" situation, i.e. when you partition a network and get, let's
>>>> say, 3 alive on one side and 2 on the other.
>>>>
>>>> In this case the 2-node side stops serving requests since it's not in
>>>> the majority.
>>>>
>>>> 2015-03-03 13:15 GMT+01:00 Aman Tandon <amantandon.10@gmail.com>:
>>>>
>>>>> But how do they handle a failure?
>>>>>
>>>>> With Regards
>>>>> Aman Tandon
>>>>>
>>>>> On Tue, Mar 3, 2015 at 5:17 PM, O. Klein <klein@octoweb.nl> wrote:
>>>>>
>>>>>> Zookeeper requires a majority of servers to be available. For
>>>>>> example: with five machines ZooKeeper can handle the failure of two
>>>>>> machines. That's why odd numbers are recommended.
>>>>>
