zookeeper-user mailing list archives

From "Fournier, Camille F. [Tech]" <Camille.Fourn...@gs.com>
Subject RE: Importance of latency in a global deployment
Date Wed, 04 May 2011 21:11:13 GMT
If you had a very heavy read load with a light write load, I think you should be able to do this
with observers as the regional tier. I solved the problem of needing a global service a different
way (through API concepts) because of concerns around WAN traffic. Didn't do much server config
tuning as a result. I thought a long time about whether I could use the hierarchical quorum
to get around this but for my particular use case it wasn't useful. It would be interesting
to see the limitations of a truly "global" ZK cluster deployment; my business is too sensitive
to outage for me to be the one to roll those dice. Also I suspect it depends a lot on how
good the pipes you have between your global datacenters are. 
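
To make the point about observers and inter-datacenter links concrete, here is a minimal sketch of how write latency might be estimated. It assumes a simplified model in which the leader commits once a majority of voting servers (itself included) have acknowledged, and it ignores disk sync time and the extra hop when a client is connected to a non-leader. The server names and round-trip times are hypothetical, not measurements.

```python
def commit_latency_ms(leader, voters, rtt_from_leader_ms):
    """Estimate how long the leader waits for acks from a majority of voters.

    leader: name of the leading server (it acks its own proposal at no network cost).
    voters: names of all voting servers, leader included; observers are left out
            because they do not vote.
    rtt_from_leader_ms: round-trip time in ms from the leader to each other voter.
    """
    quorum = len(voters) // 2 + 1
    ack_times = sorted(
        0.0 if name == leader else rtt_from_leader_ms[name] for name in voters
    )
    # The commit can proceed once the quorum-th fastest ack has arrived.
    return ack_times[quorum - 1]


# Hypothetical global ensemble: one voter per continent.
global_write = commit_latency_ms("us", ["us", "eu", "cn"], {"eu": 90.0, "cn": 200.0})

# Hypothetical regional ensemble: three voters in one region; servers in other
# regions would join as observers and so do not appear in the voter list.
regional_write = commit_latency_ms("us1", ["us1", "us2", "us3"], {"us2": 1.0, "us3": 1.0})

print(global_write, regional_write)
```

Under this model the global ensemble pays a cross-region round trip on every write, while the regional ensemble with remote observers does not; observers still have to forward client writes to the leader, so writes issued from a remote region remain slow.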

-----Original Message-----
From: Oliver Wulff [mailto:ownaish@gmail.com] 
Sent: Wednesday, May 04, 2011 3:44 PM
To: user@zookeeper.apache.org
Subject: Re: Importance of latency in a global deployment

Looking forward to the feedback from Camille...

Maybe a crazy idea, but couldn't we implement something similar to DNS? We
have one top level cluster (at least three servers) and then a child cluster
for each geographical region. The Zookeeper client communicates with the
local cluster only.


2011/5/4 Patrick Hunt <phunt@apache.org>

> Camille did you tune any of the server configuration parameters? I
> think this would be interesting/useful for ppl.
> You are correct about write latency and issues wrt a client's server
> selection. This jira introduces the idea of allowing addl connection
> strategies
> https://issues.apache.org/jira/browse/ZOOKEEPER-781
> which for this case might be interesting - the client would attempt to
> connect to the "closest available server", fail over to a far server
> if necessary, but then keep checking for closer servers to become
> available over time (say the server recovers). Today you would fail
> over to another (potentially far) server, but never reconnect back to
> the closer server.
> Patrick
> On Wed, May 4, 2011 at 9:55 AM, Fournier, Camille F. [Tech]
> <Camille.Fournier@gs.com> wrote:
> > Global clusters will affect writes greatly, and may also affect your
> > client reads in an indirect manner.
> > Writes, having to traverse from one region to another for purposes of
> > voting, will be slowed down considerably by the ping time between regions.
> > If you did a three node deployment in the manner you mentioned, your
> > clients may also suffer. Usually you would want to have a list of all
> > available cluster members for your client to connect to, so if one is down
> > or goes down the client can fail over to a running node. However, given that
> > your client will have regional affinity for at most one of your servers, if
> > you use the standard zk client connection logic your client either may be
> > connected to a far region (slowing down all responses due to latency) or the
> > client would have no failover node available should their close region node
> > fail. If you choose to have clients able to connect to any node you may also
> > have WAN traffic considerations.
> >
> > Some of the client side issues may be alleviated by using observers.
> >
> > I've got deployments across regions to handle data center failure, but in
> > all cases the off-region member is not available for client connections, and
> > is kept from acting as leader to prevent slowness on writes.
> >
> > C
> >
> > ----- Original Message -----
> > From: Oliver Wulff <ownaish@gmail.com>
> > To: user@zookeeper.apache.org <user@zookeeper.apache.org>
> > Sent: Wed May 04 12:37:43 2011
> > Subject: Importance of latency in a global deployment
> >
> > Hi there
> >
> > I'm quite new to the zookeeper project and got a question regarding
> > robustness of the failover functionality in a global deployment.
> >
> > Are there any preconditions on how close the zookeeper servers must be to
> > each other from a geographical distance point of view?
> > The reason is that the servers have to monitor and sync with each other
> > in realtime and the latency might play an important role if for instance one
> > server is in the US, one in Europe and one in China.
> >
> > Thanks
> > Oli
> >
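
The "closest available server" strategy described in the thread (connect to the nearest server, fail over to a far one, then move back once a nearer one recovers) could be sketched roughly as follows. This is an illustration only: it is not the ZOOKEEPER-781 patch and not part of the ZooKeeper client API. The class name, the `is_reachable` probe, and the server names are all hypothetical.

```python
class ClosestFirstSelector:
    """Prefer the nearest reachable server, and return to it once it recovers."""

    def __init__(self, servers_by_distance, is_reachable):
        self.servers = list(servers_by_distance)  # ordered closest first
        self.is_reachable = is_reachable          # hypothetical probe: name -> bool
        self.current = None

    def choose(self):
        """Select the closest reachable server, or None if every server is down."""
        self.current = next(
            (s for s in self.servers if self.is_reachable(s)), None
        )
        return self.current

    def recheck(self):
        """Re-evaluate periodically; return True if the selection changed."""
        previous = self.current
        return self.choose() != previous


# Usage with a fake health table standing in for real connectivity checks.
up = {"local-dc": True, "far-dc": True}
selector = ClosestFirstSelector(["local-dc", "far-dc"], lambda s: up[s])

first = selector.choose()      # nearest server is up
up["local-dc"] = False
failover = selector.choose()   # nearest is down, fall back to the far server
up["local-dc"] = True
moved = selector.recheck()     # nearest recovered, so the selection changes back
```

A real client would also have to decide how often to recheck and whether moving a live session back to the nearer server is worth the reconnect, which this sketch leaves out.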
