zookeeper-user mailing list archives

From "Fournier, Camille F." <Camille.Fourn...@gs.com>
Subject RE: zookeeper cluster spanning datacenters
Date Thu, 22 Sep 2011 15:03:59 GMT
We spread our ZKs across 3 data centers and in fact, these data centers are split across global
regions (2 or 4 in one region, one in a remote region). To keep throughput up (and note that
the throughput you have to worry about is only write throughput), we always ensure that the
leader is in one of the "local" data centers.
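
To illustrate why leader placement matters for writes, here is a rough sketch in Python. It assumes a deliberately simplified model: a write commits once the leader has acks from a majority of the ensemble (counting itself), and the only cost is network round-trip time. The server names, site labels, and RTT figures are all hypothetical, and disk sync and processing time are ignored.

```python
# Hypothetical 5-server ensemble: four servers in two nearby data centers,
# one server in a remote region. Names and sites are made up.
SITE = {"a1": "dcA", "a2": "dcA", "b1": "dcB", "b2": "dcB", "r1": "remote"}


def rtt_ms(x, y):
    """Assumed round-trip times in milliseconds (illustrative only)."""
    if SITE[x] == SITE[y]:
        return 0.5   # same data center
    if "remote" in (SITE[x], SITE[y]):
        return 80.0  # cross-region link
    return 2.0       # different data centers, same region


def quorum_ack_latency_ms(leader):
    """Estimate how long the leader waits to collect a majority of acks."""
    servers = list(SITE)
    quorum = len(servers) // 2 + 1
    # The leader counts as one ack, so it needs (quorum - 1) follower acks.
    needed = quorum - 1
    follower_rtts = sorted(rtt_ms(leader, s) for s in servers if s != leader)
    # The write waits for the slowest of the fastest `needed` followers.
    return follower_rtts[needed - 1] if needed > 0 else 0.0


print("leader in local DC :", quorum_ack_latency_ms("a1"), "ms")
print("leader in remote DC:", quorum_ack_latency_ms("r1"), "ms")
```

Under these assumed numbers, a local leader only waits on nearby followers, while a remote leader pays the cross-region round trip on every write.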

If you have a very write-heavy and write time sensitive load, this might affect your performance.
It won't affect reads at all because reads are serviced from the memory of the zk you connect
to. For a mostly read-intensive load, splitting across data centers is unlikely to cause you problems.

There is one exception: Monitoring. Even across data centers in the same region, we sometimes
see zk dashboard unable to properly monitor the leader of a heavily-utilized cluster. This
is due to the way the 4lw connections are managed, and something I'm trying to fix. 
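
For readers unfamiliar with the 4lw (four-letter word) commands, here is a minimal Python sketch of how a monitoring client might ask a server for its role. It sends the `srvr` command over a plain TCP connection to the client port and looks for the `Mode:` line in the reply. The helper names, the host name in the usage comment, and the timeout are assumptions for illustration. This is not how zk dashboard itself is implemented.

```python
import socket


def four_letter_word(host, port, cmd, timeout=5.0):
    """Send a four-letter command (e.g. "srvr") and return the text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            # The server closes the connection once it has sent its reply.
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(srvr_output):
    """Extract the value of the "Mode:" line (e.g. leader, follower)."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


# Usage against a live server (hypothetical host name):
#   reply = four_letter_word("zk1.example.com", 2181, "srvr")
#   print(parse_mode(reply))
```

A short timeout matters here: a heavily loaded leader may be slow to answer, and the monitor should report that rather than hang.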

If you have the machines to test, I would recommend running zk-smoketest (https://github.com/phunt/zk-smoketest)
on the proposed config.


-----Original Message-----
From: Damu R [mailto:damu.devnull@gmail.com] 
Sent: Thursday, September 22, 2011 10:50 AM
To: user@zookeeper.apache.org
Subject: zookeeper cluster spanning datacenters

I would like to know the downsides of having a zookeeper cluster that spans
multiple datacenters. The requirement is that a datacenter failure should not
bring down the zookeeper cluster. From my understanding, it is not possible
to have a hot/cold kind of cluster setup. So we are thinking of
putting zk servers in 3 colos (1+1+1 or 2+2+3). One of the major drawbacks I
can think of is the throughput of the system being affected by latency. The
system does not require high throughput and can accept some latency. How
much effect will the latency have on the throughput of the system? What are
the other downsides of spreading the cluster across datacenters?
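
The layouts mentioned above can be sanity-checked with a small Python sketch. It assumes plain majority quorums (more than half of all servers must be up) and treats a datacenter failure as losing every server in that colo at once. The function name is made up for illustration.

```python
def survives_any_single_dc_loss(layout):
    """layout: number of servers per datacenter, e.g. (2, 2, 3).

    Returns True if a majority of the ensemble is still up after
    losing any one datacenter in its entirety.
    """
    total = sum(layout)
    quorum = total // 2 + 1  # strict majority of all servers
    return all(total - n >= quorum for n in layout)


for layout in [(1, 1, 1), (2, 2, 3), (2, 1), (4, 1)]:
    print(layout, survives_any_single_dc_loss(layout))
```

Under these assumptions both 1+1+1 and 2+2+3 keep a quorum after any single colo is lost, whereas a two-colo split cannot, because losing the larger side always removes the majority.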

