zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Camille Fournier <cami...@apache.org>
Subject Re: Distributed ZooKeeper cluster design
Date Thu, 15 Dec 2011 01:41:23 GMT
I ran ZK in prod over the WAN and it was fine. If you have a modest read
and write load, I would bet you'lll have no problem. I don't have numbers
in front of me but we tested clients and got totally acceptable
throughputs. It's really hard to generalize because it depends on the link
between your clusters, etc, so as always I recommend running some basic
load testing against the proposed cluster and seeing how it performs.

C

On Wed, Dec 14, 2011 at 6:04 PM, Yongsheng Wu <yongsheng.wu@gmail.com>wrote:

> We have a user case that makes sense to have distributed zookeeper across
> regions. That is to use zookeeper to enforce uniqueness across regions, and
> we want to be able to tolerate any single region failure. In our case, both
> reads and writes are modest.
>
> I wonder if anyone on this list actually have zookeeper cluster set up
> across WAN on production, and I'd be very interested in knowing what kind
> of performance you are getting from the cluster, and what
> reliability/availability are like.
>
> Thanks a lot if someone can share information as such.
>
> Yongsheng
>
> On Tue, Dec 13, 2011 at 6:50 PM, Dima Gutzeit
> <dima.gutzeit@mailvision.com>wrote:
>
> > To summarize the discussion:
> >
> > As I understand the "preferred" approach will be having a quorum in C and
> > at least two observers in other regions (for HA).
> >
> > Each node's clients must talk to its local ZK servers.
> >
> > This approach will decrease the traffic between the regions and increase
> > the speed of ZK clients accessing local nodes.
> >
> > Thanks so much for the suggestions.
> >
> > Regards,
> > Dima Gutzeit.
> >
> > On Wed, Dec 14, 2011 at 2:36 AM, Ted Dunning <ted.dunning@gmail.com>
> > wrote:
> >
> > > I am happy to agree with everyone that the mirroring isn't a great idea
> > for
> > > most things even if that makes me look like I disagree with myself.
> > >
> > > I do think that mirroring could be made to happen in a reliable way,
> but
> > it
> > > isn't going to be a viable substitute for direct access to the cluster.
> >  By
> > > reliable, I think that you could get a reliable picture of what was in
> > the
> > > master cluster at some time in the past.  Occasionally the mirror would
> > be
> > > further behind than other times and it might be necessary for the
> mirror
> > to
> > > be updated much faster than real-time.  In my vision, the mirror would
> be
> > > read-only since anything else leads to madness in the strict
> consistency
> > > model that ZK maintains.
> > >
> > > On Tue, Dec 13, 2011 at 2:57 PM, Benjamin Reed <breed@apache.org>
> wrote:
> > >
> > > > i agree with camille that mirror breaks a lot of the basic guarantees
> > > > that you use from zookeeper. with that caveat in mind, there is a
> > > > patch that enables mirroring: ZOOKEEPER-892.
> > > >
> > > > ben
> > > >
> > > > On Tue, Dec 13, 2011 at 8:24 AM, Camille Fournier <
> camille@apache.org>
> > > > wrote:
> > > > > I have to strongly disagree with ted on the mirroring idea... I
> think
> > > it
> > > > is
> > > > > likely to be really error-prone and kind of defeats the purpose of
> ZK
> > > in
> > > > my
> > > > > mind. It depends on what you're mirroring but if you're trying to
> > keep
> > > > all
> > > > > the data coherent you can't sensibly do that in two clusters, so
> > unless
> > > > the
> > > > > mirror is for a really small subset of the data I would stay far
> far
> > > away
> > > > > from that.
> > > > >
> > > > > Observers are available in 3.3.3, yes.
> > > > > Unfortunately, we don't have configurable connection logic in ZK
> > client
> > > > (at
> > > > > least java) right now. We have the ability to add it pretty easily,
> > but
> > > > it
> > > > > hasn't been put in yet.
> > > > >
> > > > > You're seeing slow performance for a setup that has all ZK servers
> in
> > > > > region C for clients only in regions A and B and you can't blame
> the
> > > > > network? That's literally the only thing you could blame, unless
> > > clients
> > > > in
> > > > > region C were also seeing slow performance or they have some other
> > > > problem
> > > > > in they way they are implemented that makes them different from
> > clients
> > > > > running in region C.
> > > > >
> > > > > C
> > > > >
> > > > > On Tue, Dec 13, 2011 at 11:09 AM, Dima Gutzeit
> > > > > <dima.gutzeit@mailvision.com>wrote:
> > > > >
> > > > >> Ted and Camille,
> > > > >>
> > > > >> Thanks for a very details response.
> > > > >>
> > > > >> At the moment I have an option A implemented in production and
> what
> > I
> > > > see
> > > > >> is that ZK client in A and B have a "slow" performance (even
> reads)
> > > and
> > > > I
> > > > >> can't really blame the network since it does not look like a
real
> > > > >> bottleneck.
> > > > >>
> > > > >> I wonder if doing option 2 will improve the ZK client
> > > performance/speed
> > > > ...
> > > > >>
> > > > >> As for my use case, its around 50/50 reads and writes.
> > > > >>
> > > > >> As for fallback, ofcourse in A and B I would want to define C
as a
> > > > backup,
> > > > >> not sure how it can be done since as I understand if I supply
> > several
> > > > >> addresses in the connection string the client will use one,
> > randomly.
> > > > >>
> > > > >> About Ted's suggestion to consider having several clusters and
to
> > > have a
> > > > >> special process to mirror, is it something available as part
of
> > > > ZooKeeper ?
> > > > >>
> > > > >> I also read about observers (is it available in 3.3.3 ?) and
it
> > seems
> > > > to be
> > > > >> a good option is my case, which brings me to the question of
how
> to
> > > > >> configure explicit fallback instead of random client selection
?
> If
> > I
> > > > want
> > > > >> to tell ZK client in B to use the local B instance (observer)
and
> if
> > > it
> > > > >> fails then contact ANY server in the C (with a list of several).
> > > > >>
> > > > >> Thanks in advance.
> > > > >>
> > > > >> Regards,
> > > > >> Dima Gutzeit.
> > > > >>
> > > > >>
> > > > >>
> > > > >> On Tue, Dec 13, 2011 at 5:44 PM, Camille Fournier <
> > camille@apache.org
> > > > >> >wrote:
> > > > >>
> > > > >> > Ted is of course right, but to speculate:
> > > > >> >
> > > > >> > The idea you had with 3 in C, one in A and one in B isn't
bad,
> > given
> > > > >> > some caveats.
> > > > >> >
> > > > >> > With 3 in C, as long as they are all available, quorum should
> live
> > > in
> > > > >> > C and you shouldn't have much slowdown from the remote servers
> in
> > A
> > > > >> > and B. However, if you point your A servers only to the
A
> > zookeeper,
> > > > >> > you have a failover risk where your A servers will have
no ZK if
> > the
> > > > >> > sever in region A goes down (same with B, of course). If
you
> have
> > a
> > > > >> > lot of servers in the outer regions, this could be a risk.
You
> are
> > > > >> > also giving up any kind of load balancing for the A and
B region
> > > ZKs,
> > > > >> > which may not be important but is good to know.
> > > > >> >
> > > > >> > Another thing to be aware of is that the A and B region
ZKs will
> > > have
> > > > >> > slower write response time due to the WAN cost, and they
will
> tend
> > > to
> > > > >> > lag behind the majority cluster a bit. This shouldn't cause
> > > > >> > correctness issues but could impact client performance in
those
> > > > >> > regions.
> > > > >> >
> > > > >> > Honestly, if you're doing a read-mostly workload in the
A and B
> > > > >> > regions, I doubt this is a bad design. It's pretty easy
to test
> ZK
> > > > >> > setups using Pat's zksmoketest utility, so you might try
setting
> > up
> > > > >> > the sample cluster and running some of the smoketests on
it.
> > > > >> > (
> > https://github.com/phunt/zk-smoketest/blob/master/zk-smoketest.py
> > > ).
> > > > >> > You could maybe also add observers in the outer regions
to
> improve
> > > > >> > client load balancing.
> > > > >> >
> > > > >> > C
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > On Tue, Dec 13, 2011 at 9:05 AM, Ted Dunning <
> > ted.dunning@gmail.com
> > > >
> > > > >> > wrote:
> > > > >> > > Which option is preferred really depends on your needs.
> > > > >> > >
> > > > >> > > Those needs are likely to vary in read/write ratios,
> resistance
> > to
> > > > >> > network
> > > > >> > > and so on.  You should also consider the possibility
of
> > observers
> > > in
> > > > >> the
> > > > >> > > remote locations.  You might also consider separate
ZK
> clusters
> > in
> > > > each
> > > > >> > > location with a special process to send mirrors of
changes to
> > > these
> > > > >> other
> > > > >> > > locations.
> > > > >> > >
> > > > >> > > A complete and detailed answer really isn't possible
without
> > > knowing
> > > > >> the
> > > > >> > > details of your application.  I generally don't like
> > distributing
> > > a
> > > > ZK
> > > > >> > > cluster across distant hosts because it makes everything
> slower
> > > and
> > > > >> more
> > > > >> > > delicate, but I have heard of examples where that is
exactly
> the
> > > > right
> > > > >> > > answer.
> > > > >> > >
> > > > >> > > On Tue, Dec 13, 2011 at 4:29 AM, Dima Gutzeit
> > > > >> > > <dima.gutzeit@mailvision.com>wrote:
> > > > >> > >
> > > > >> > >> Dear list members,
> > > > >> > >>
> > > > >> > >> I have a question related to "suggested" way of
working with
> > > > ZooKeeper
> > > > >> > >> cluster from different geographical locations.
> > > > >> > >>
> > > > >> > >> Lets assume a service span across several regions,
A, B and
> C,
> > > > while C
> > > > >> > is
> > > > >> > >> defined as an element that the service can not
live without
> > and A
> > > > and
> > > > >> B
> > > > >> > are
> > > > >> > >> not critical.
> > > > >> > >>
> > > > >> > >> Option one:
> > > > >> > >>
> > > > >> > >> Having one cluster of several ZooKeeper nodes in
one location
> > (C)
> > > > and
> > > > >> > >> accessing that from other locations A,B,C.
> > > > >> > >>
> > > > >> > >> Option two:
> > > > >> > >>
> > > > >> > >> Having ZooKeeper cluster span across all regions,
i.e. 3
> nodes
> > in
> > > > C,
> > > > >> > one in
> > > > >> > >> A and one in B. This way the clients resides in
A,B will
> access
> > > the
> > > > >> > local
> > > > >> > >> ZooKeeper.
> > > > >> > >>
> > > > >> > >> Which option is preferred and which will work faster
from
> > client
> > > > >> > >> perspective ?
> > > > >> > >>
> > > > >> > >> Thanks in advance.
> > > > >> > >>
> > > > >> > >> Regards,
> > > > >> > >> Dima Gutzeit
> > > > >> > >>
> > > > >> >
> > > > >>
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message