zookeeper-user mailing list archives

From Jean-Pierre Koenig <jean-pierre.koe...@memonews.com>
Subject Re: Zookeeper and multiple data centers
Date Mon, 09 Jul 2012 12:14:44 GMT
Hi Michael,

As you said, ZK does not require high-end server hardware. Neither the
number of clients nor the size of your ZK quorum is a problem.

But you should beware of large payloads. ZK is not designed to handle
huge amounts of data, and 512MB is far more than huge. Because every
write request must be acknowledged by a majority of the quorum, your
data is shipped through the entire ZK cluster. I strongly recommend a
payload of no more than 1024 KB. The other point you should consider
here is (network) latency. I guess your ZK clients (your 50) will see a
lot of SessionExpired or ConnectionLoss exceptions, depending on the
connectivity between your DCs.
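
A minimal sketch of the payload guard suggested above, in Python. The
1024 KB ceiling is the figure recommended in this message; the helper
name check_znode_payload and the sample config item are hypothetical,
and no ZooKeeper client is involved. ZooKeeper's own default
jute.maxbuffer limit is just under 1 MB, so staying well below the
ceiling is the safer choice.

```python
import json

# Ceiling recommended in this message: 1024 KB per payload.
MAX_PAYLOAD_BYTES = 1024 * 1024


def check_znode_payload(item: dict, limit: int = MAX_PAYLOAD_BYTES) -> bytes:
    """Serialize a config item to JSON and reject it if it exceeds the limit.

    Hypothetical helper: call it before handing the bytes to whatever
    ZooKeeper client library you use.
    """
    data = json.dumps(item, separators=(",", ":")).encode("utf-8")
    if len(data) > limit:
        raise ValueError(f"payload is {len(data)} bytes, limit is {limit} bytes")
    return data


# A small config item passes the check and comes back as bytes.
small = check_znode_payload({"feature.enabled": True, "timeout_ms": 500})
```

An oversized item raises ValueError before any write is attempted,
which is cheaper than discovering the problem as a failed request.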

Regards, JP

On Mon, Jul 9, 2012 at 2:01 PM, Michael Morello
<michael.morello@gmail.com> wrote:
> Hi all,
> I work on a project and I would be happy to have your thoughts about our
> requirements and how Zookeeper meets them.
> The facts :
> * We need to share configuration items between 10 data centers.
> Configuration must be synchronized between data centers (actually we can
> tolerate a few seconds of inconsistency)
> * Configuration items will be serialized in JSON and together they can fit
> into 256MB of heap
> * R/W ratio is 90% read and 10% write and client number should be low (50
> to 100 in each data center)
> * A client running in a DC can freely communicate with a host in another DC
> * Latency between data centers is 20 to 60 ms
> * Only 1 host (machine) per data center might be dedicated to a Zookeeper
> process: the machines are big IBM AIX boxes, and only one is dedicated to
> this project in each DC
> * Project must survive a data center crash
> Since configuration items are small, they must be synchronized, and we
> need a fail-over mechanism, Zookeeper appears to be a good candidate, but
> I'm not sure how to deploy it, mainly because we can start only one
> Zookeeper process in each data center.
> My idea is to deploy 1 follower in only 5 DCs. This way there are 5
> followers all over the country and we can lose 2 DCs. Of course all the
> clients in all the data centers must know where the 5 Zookeeper servers are.
> Do you see any downside to doing this?
> I know that Zookeeper has been designed to run on a LAN and on "commodity
> hardware" but regarding the R/W ratio and the latency do you think that it
> is a good idea to deploy it this way ?
> Thanks for your comments
> Best regards,
> Michael
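
To make the quorum arithmetic behind the 5-server proposal concrete,
here is a small Python sketch. It assumes every server is a voting
member (no observers); the function names are illustrative only.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the voting servers."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can be lost while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


# Compare a few ensemble sizes, including the 5-server proposal.
for n in (3, 5, 10):
    print(n, quorum_size(n), tolerated_failures(n))
```

With 5 servers the majority is 3, so the ensemble keeps working with 2
data centers down, which matches the proposal. Each write still has to
reach 3 servers, so it waits on inter-DC round trips.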

Jean-Pierre Koenig
Head of Technology

MeMo News AG
Sonnenstr. 4
CH-8280 Kreuzlingen

Tel: +41 71 508 24 86
Fax: +41 71 671 20 26
E-Mail: jean-pierre.koenig@memonews.com

