zookeeper-user mailing list archives

From Guy Laden <LA...@il.ibm.com>
Subject Zookeeper on VM's in public cloud
Date Wed, 29 Apr 2015 12:29:21 GMT
Hi All,

I wanted to get some feedback about running ZooKeeper on VM's within 
public clouds.
If you have experience with this could you share please?
What issues have you run into? Were you able to overcome them?
At the end of the day, were you able to get this to work reliably?

Some of the issues we know we need to worry about:

1. Making sure replicas are in different 'availability zones'.
Without this your VM's might even be running on the same physical machine.
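
To make the concern concrete, here is a minimal sketch that checks whether an ensemble keeps a majority quorum if any single availability zone is lost. The function name, the server ids and the zone names are all hypothetical; it only assumes the default majority-quorum rule.

```python
from collections import Counter


def survives_any_zone_loss(placement):
    """placement maps a server id to the zone it runs in (hypothetical names)."""
    total = len(placement)
    quorum = total // 2 + 1  # majority quorum
    per_zone = Counter(placement.values())
    # Losing a zone removes every replica in it; the rest must still be a quorum.
    return all(total - lost >= quorum for lost in per_zone.values())


if __name__ == "__main__":
    spread = {1: "zone-a", 2: "zone-b", 3: "zone-c"}
    stacked = {1: "zone-a", 2: "zone-a", 3: "zone-b"}
    print(survives_any_zone_loss(spread))   # True
    print(survives_any_zone_loss(stacked))  # False
```

With three servers, two of them in one zone means losing that zone loses the quorum, so one replica per zone is the layout to aim for.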

2. Lack of fixed IP
I believe typically in clouds every VM is allocated a new IP so if you're 
e.g. upgrading a cluster, 
you can't keep the existing IP's for the new VM's. Our solution is to use 
our cloud provider's support
for getting a set of fixed IP's which can be dynamically bound to 
whichever VM's we want. (aka "portable ip"
on SoftLayer, I believe there is similar support on other providers).

It's probably the case that dynamic reconfig opens up new options, but it 
will be a while before this is 
supported in a stable version. We prefer to use a stable ZooKeeper, unless 
there is feedback that the 
pros of using the more recent ZK versions outweigh the cons.
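
As an illustration of the fixed-IP approach, here is a small sketch that renders the static `server.N` lines of zoo.cfg from a list of portable IPs. The helper name and the addresses (taken from the 192.0.2.0/24 documentation range) are made up; 2888 and 3888 are just the conventional peer and election ports.

```python
def server_lines(fixed_ips, peer_port=2888, election_port=3888):
    """Render zoo.cfg server entries, one per fixed IP, numbered from 1."""
    return [
        f"server.{server_id}={ip}:{peer_port}:{election_port}"
        for server_id, ip in enumerate(fixed_ips, start=1)
    ]


if __name__ == "__main__":
    # Hypothetical portable IPs; rebinding them to new VMs leaves this list unchanged.
    for line in server_lines(["192.0.2.11", "192.0.2.12", "192.0.2.13"]):
        print(line)
```

Since the entries stay the same across an upgrade, the replacement VM only needs the same fixed IP bound to it and a myid file holding the matching server number.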

3. Isolation from other VM's on the same physical machine. This seems especially 
important for getting decent performance from the log disk.
It can be partially dealt with by placing the log on a non-local disk with 
guaranteed IOPS, as
is supported by some providers.

4. Write caching of disk I/O.
Making sure there are no layers which cache disk writes so they do not 
really reach the disk even though they have been acknowledged.
Perhaps it's not that big of an issue given the provider might have backup 
power? What are your thoughts here?
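
One rough way to probe this is to time fsync calls on the volume that would hold the transaction log. The sketch below is only a heuristic under my own assumptions (the function name, sample count and block size are arbitrary): implausibly low latencies hint that some layer may be acknowledging writes from a cache, but they prove nothing either way.

```python
import os
import tempfile
import time


def fsync_latencies(directory, samples=20, block=b"x" * 4096):
    """Append small blocks to a temp file in `directory`, timing each fsync (seconds)."""
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(samples):
            os.write(fd, block)
            start = time.perf_counter()
            os.fsync(fd)  # ask the OS to push the data to stable storage
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


if __name__ == "__main__":
    # Point this at the log disk's mount point; the temp dir is only a stand-in.
    results = sorted(fsync_latencies(tempfile.gettempdir()))
    print(f"median fsync latency: {results[len(results) // 2] * 1000:.3f} ms")
```

The same numbers are also a quick check on the isolation point above, since a noisy neighbour tends to show up as occasional very slow fsyncs.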

5. Clock-related issues on VM's. It seems people have seen VM clocks 
skipping ahead or even going backwards, which caused
e.g. ZooKeeper session disconnection.
We're not entirely clear what exactly we need to do to avoid this. Any 
help or pointers are appreciated.
Might be less of an issue in the more recent ZK versions but, again, these 
are not yet stable.
cf. https://issues.apache.org/jira/browse/ZOOKEEPER-1616
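
To at least observe the problem, one could compare the wall clock against a monotonic clock and log any disagreement. This is a minimal sketch with hypothetical names and an arbitrary threshold, not something ZooKeeper provides; on some hypervisors the monotonic clock may itself behave oddly across a VM pause, so treat the output as a hint.

```python
import time


def clock_drift(prev_wall, prev_mono, wall, mono):
    """Seconds the wall clock moved beyond what the monotonic clock measured."""
    return (wall - prev_wall) - (mono - prev_mono)


def watch(interval=1.0, threshold=0.5, iterations=5):
    """Sample both clocks and collect drifts larger than `threshold` seconds."""
    jumps = []
    prev_wall, prev_mono = time.time(), time.monotonic()
    for _ in range(iterations):
        time.sleep(interval)
        wall, mono = time.time(), time.monotonic()
        drift = clock_drift(prev_wall, prev_mono, wall, mono)
        if abs(drift) > threshold:
            jumps.append(drift)  # positive: skipped ahead, negative: went backwards
        prev_wall, prev_mono = wall, mono
    return jumps


if __name__ == "__main__":
    print(watch(interval=0.2, iterations=3))
```

A positive drift means the wall clock skipped ahead, a negative one means it went backwards.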

Any additional issues to look out for?

