zookeeper-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Repair cluster on EC2
Date Mon, 11 Apr 2011 15:14:17 GMT
On Mon, Apr 11, 2011 at 4:43 AM, Andrei Savu <savu.andrei@gmail.com> wrote:

> Is it possible to repair a ZooKeeper cluster on EC2 by using the
> following algorithm with no downtime and data loss?
>

Yes.  Been there.  Done that.  Works like a champ.

> 1. start a cluster with >3 nodes
> 2. if one node fails start a new machine and record the new IP
> 3. rebuild the configuration file by replacing the IP of the node that
> failed with the IP attached to the new machine
> 4. do a rolling restart and replace all configuration files
>
> Am I missing something? Could this process be executed by a script?
>

Sounds right and you should be able to do it with a script.

Use caution, of course.
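
As a rough illustration of step 3, here is a minimal Python sketch that swaps the failed node's IP in the `server.N` lines of a `zoo.cfg`-style file. The function name `replace_server_ip` and the sample addresses are made up for this example, and it assumes the usual `server.N=host:port:port` line format. It only rewrites the text: copying the file out, giving the new machine the failed node's `myid`, and the rolling restart in step 4 are left out.

```python
import re


def replace_server_ip(config_text: str, failed_ip: str, new_ip: str) -> str:
    """Return config_text with failed_ip replaced by new_ip in server.N lines.

    Hypothetical helper for illustration; other settings are left untouched.
    """
    # Match only "server.<id>=<failed_ip>" followed by ":ports" or end of line,
    # so that 10.0.0.2 does not accidentally match 10.0.0.20.
    pattern = re.compile(r"^(server\.\d+=)" + re.escape(failed_ip) + r"(:.*)?$")
    out = []
    replaced = 0
    for line in config_text.splitlines():
        match = pattern.match(line.strip())
        if match:
            line = match.group(1) + new_ip + (match.group(2) or "")
            replaced += 1
        out.append(line)
    # Refuse to continue if the failed node is missing or listed twice.
    if replaced != 1:
        raise ValueError(
            f"expected exactly one server entry for {failed_ip}, found {replaced}"
        )
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = (
        "tickTime=2000\n"
        "server.1=10.0.0.1:2888:3888\n"
        "server.2=10.0.0.2:2888:3888\n"
        "server.3=10.0.0.3:2888:3888\n"
    )
    print(replace_server_ip(sample, "10.0.0.2", "10.0.0.9"))
```

Raising an error when the count is not exactly one is a deliberate choice here: a script that silently pushes an unchanged or doubly edited file to every node is the kind of thing the caution above is about.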


> I'm also thinking about extending the client library in order to make
> it EC2 aware (it should be able to automatically discover ZK nodes).
>

This way lies danger!

The problem is that if cluster membership becomes very flexible, then you run
the risk of diluting the guarantees that ZK provides based on the quorum
requirements.
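
To make the quorum concern concrete, here is a small Python sketch of the arithmetic. It is a simplified model, not ZooKeeper code: it assumes plain majority quorums, and the helper names `quorum_size` and `can_split` are invented for this example. The idea is that if servers disagree about how large the ensemble is, two disjoint groups may each believe they hold a majority.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble of the given size."""
    return ensemble_size // 2 + 1


def can_split(view_a: int, view_b: int, total_servers: int) -> bool:
    """True if two groups, each using its own idea of the ensemble size,
    could both reach a quorum without sharing any server."""
    return quorum_size(view_a) + quorum_size(view_b) <= total_servers


if __name__ == "__main__":
    # Everyone agrees the ensemble has 5 members: two disjoint quorums
    # would need 3 + 3 = 6 servers, so they cannot both exist.
    print(can_split(5, 5, total_servers=5))
    # Some servers still think the ensemble has 3 members while others
    # think it has 5: quorums of 2 and 3 fit side by side in 5 servers.
    print(can_split(3, 5, total_servers=5))
```

With a fixed, agreed-upon member list any two majorities must overlap in at least one server; the sketch shows how that overlap can disappear once views of the membership diverge.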
