zookeeper-user mailing list archives

From Dima Gutzeit <dima.gutz...@mailvision.com>
Subject Re: Distributed ZooKeeper cluster design
Date Tue, 13 Dec 2011 16:09:48 GMT
Ted and Camille,

Thanks for a very detailed response.

At the moment I have option A implemented in production, and what I see
is that the ZK clients in A and B have "slow" performance (even for reads),
and I can't really blame the network since it does not look like a real

I wonder if doing option 2 will improve the ZK client performance/speed ...

As for my use case, it's around 50/50 reads and writes.

As for fallback, of course in A and B I would want to define C as a backup,
but I am not sure how it can be done, since as I understand it, if I supply
several addresses in the connection string the client will use one at random.

About Ted's suggestion to consider having several clusters and to have a
special process to mirror, is it something available as part of ZooKeeper ?

I also read about observers (are they available in 3.3.3?) and they seem to
be a good option in my case, which brings me to the question of how to
configure explicit fallback instead of random client selection. I want to
tell the ZK client in B to use the local B instance (an observer) and, if
it fails, to contact ANY server in C (from a list of several).
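
The ordering described above (local instance first, then any server in C)
can be sketched at the application level. This is only a sketch, not a
ZooKeeper client feature: `connect_with_fallback` and the `connect` callable
are hypothetical names, and `connect` stands in for whatever function opens
a session against a single address and raises OSError when that fails.

```python
import random
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def connect_with_fallback(
    local: Sequence[str],
    fallback: Sequence[str],
    connect: Callable[[str], T],
    rng: Optional[random.Random] = None,
) -> T:
    """Try the local servers in order, then the fallback servers in random order.

    `connect` is a hypothetical callable: it takes one "host:port" string and
    either returns a session object or raises OSError.
    """
    rng = rng or random.Random()
    shuffled = list(fallback)
    rng.shuffle(shuffled)  # spread fallback load across the region C servers

    last_error = None
    for address in list(local) + shuffled:
        try:
            return connect(address)
        except OSError as exc:
            last_error = exc  # remember the failure and try the next address
    raise ConnectionError("no ZooKeeper server reachable") from last_error
```

Note that this covers only the initial connection. If the local instance
fails later, the application would have to detect the lost session and run
the same selection again.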

Thanks in advance.

Dima Gutzeit.

On Tue, Dec 13, 2011 at 5:44 PM, Camille Fournier <camille@apache.org> wrote:

> Ted is of course right, but to speculate:
> The idea you had with 3 in C, one in A and one in B isn't bad, given
> some caveats.
> With 3 in C, as long as they are all available, quorum should live in
> C and you shouldn't have much slowdown from the remote servers in A
> and B. However, if you point your A servers only to the A zookeeper,
> you have a failover risk where your A servers will have no ZK if the
> server in region A goes down (same with B, of course). If you have a
> lot of servers in the outer regions, this could be a risk. You are
> also giving up any kind of load balancing for the A and B region ZKs,
> which may not be important but is good to know.
> Another thing to be aware of is that the A and B region ZKs will have
> slower write response time due to the WAN cost, and they will tend to
> lag behind the majority cluster a bit. This shouldn't cause
> correctness issues but could impact client performance in those
> regions.
> Honestly, if you're doing a read-mostly workload in the A and B
> regions, I doubt this is a bad design. It's pretty easy to test ZK
> setups using Pat's zksmoketest utility, so you might try setting up
> the sample cluster and running some of the smoketests on it.
> (https://github.com/phunt/zk-smoketest/blob/master/zk-smoketest.py).
> You could maybe also add observers in the outer regions to improve
> client load balancing.
> C
> On Tue, Dec 13, 2011 at 9:05 AM, Ted Dunning <ted.dunning@gmail.com>
> wrote:
> > Which option is preferred really depends on your needs.
> >
> > Those needs are likely to vary in read/write ratios, resistance to network
> > and so on.  You should also consider the possibility of observers in the
> > remote locations.  You might also consider separate ZK clusters in each
> > location with a special process to send mirrors of changes to these other
> > locations.
> >
> > A complete and detailed answer really isn't possible without knowing the
> > details of your application.  I generally don't like distributing a ZK
> > cluster across distant hosts because it makes everything slower and more
> > delicate, but I have heard of examples where that is exactly the right
> > answer.
> >
> > On Tue, Dec 13, 2011 at 4:29 AM, Dima Gutzeit
> > <dima.gutzeit@mailvision.com> wrote:
> >
> >> Dear list members,
> >>
> >> I have a question related to "suggested" way of working with ZooKeeper
> >> cluster from different geographical locations.
> >>
> >> Let's assume a service spans several regions, A, B and C, where C is
> >> defined as an element that the service cannot live without and A and B
> >> are not critical.
> >>
> >> Option one:
> >>
> >> Having one cluster of several ZooKeeper nodes in one location (C) and
> >> accessing that from other locations A,B,C.
> >>
> >> Option two:
> >>
> >> Having the ZooKeeper cluster span all regions, i.e. 3 nodes in C, one in
> >> A and one in B. This way the clients residing in A and B will access the
> >> local ZooKeeper.
> >>
> >> Which option is preferred and which will work faster from client
> >> perspective ?
> >>
> >> Thanks in advance.
> >>
> >> Regards,
> >> Dima Gutzeit
> >>
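
Camille's point that quorum "should live in C" with the 3 + 1 + 1 layout
comes down to majority arithmetic. It can be checked with a small sketch.
The function names and the region layout dictionary are illustrative only,
and the sketch assumes every listed server is a voting member (observers do
not vote, so they would be left out of the counts).

```python
from typing import Dict, Set


def majority(voters: int) -> int:
    """Smallest number of voting servers that forms a strict majority."""
    return voters // 2 + 1


def has_quorum(voters_per_region: Dict[str, int], regions_up: Set[str]) -> bool:
    """True if the reachable regions together hold a majority of the voters."""
    total = sum(voters_per_region.values())
    alive = sum(n for region, n in voters_per_region.items() if region in regions_up)
    return alive >= majority(total)


# Option two from the thread: 3 voters in C, one in A, one in B.
layout = {"A": 1, "B": 1, "C": 3}

c_alone = has_quorum(layout, {"C"})            # 3 of 5 voters reachable
a_and_b_only = has_quorum(layout, {"A", "B"})  # 2 of 5 voters reachable
```

Under these assumptions `c_alone` is True and `a_and_b_only` is False: the
ensemble stays writable as long as all three C servers are up, and losing C
stops it regardless of A and B.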
