I have a blog post on this topic:
http://whilefalse.blogspot.com/2012/12/building-global-highly-available.html
I think you will find it helpful.
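The short answer is: the scheme you have proposed will make ZK unavailable
whenever you do maintenance on the data center with 4 quorum members. An
ensemble of 7 needs a majority of 4 to serve requests, and the 3 servers
left in the other data center cannot form one.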
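
To make the arithmetic concrete, here is a minimal Python sketch. It assumes
the default majority quorum (no hierarchical or weighted quorum
configuration), and the function names and the DC1/DC2 labels are only
illustrative; nothing here is a ZooKeeper API.

```python
def quorum_size(ensemble_size):
    """Smallest strict majority of an ensemble (default majority quorum assumed)."""
    return ensemble_size // 2 + 1


def survives_dc_outage(servers_per_dc, down_dc):
    """True if the servers outside `down_dc` can still form a majority."""
    total = sum(servers_per_dc.values())
    remaining = total - servers_per_dc[down_dc]
    return remaining >= quorum_size(total)


# Hypothetical layouts taken from the question: 3 + 4 and 2 + 3.
layouts = {
    "3+4": {"DC1": 3, "DC2": 4},
    "2+3": {"DC1": 2, "DC2": 3},
}

for name, layout in layouts.items():
    for dc in layout:
        status = "available" if survives_dc_outage(layout, dc) else "unavailable"
        print(f"{name}: {dc} down -> ZK {status}")
```

In both layouts the ensemble stays up only while the smaller data center is
the one under maintenance; taking down the larger one always removes the
majority.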
Best,
C
On Tue, Oct 21, 2014 at 3:03 PM, Denis Samoilov <samoilov@gmail.com> wrote:
> hi,
>
> Could you please help me understand the following setup: we have two
> datacenters and want to set up a ZK cluster that uses servers (ZK servers,
> not clients) in both: e.g. 3 ZK servers in DC1 and 4 ZK servers in DC2. We
> sometimes do maintenance in one DC or the other, so ZK will completely lose
> its replicas in one of the DCs for several hours. E.g. if DC2 is under
> maintenance, ZK will have only 3 out of 7 nodes, and these 3 nodes are
> supposed to receive writes.
>
> The questions:
> 1) Is it OK for ZK to have such a setup?
> 2) Will ZK catch up after losing 4 servers and getting them back after some
> time? (this will be a majority actually :) )
> 3) What is the right number of nodes? Is 5 sufficient (2 + 3)?
>
> Latency between DCs is pretty low (DCs are close to each other).
>
>
> Thank you for any advice.
>
> -Denis
>