hadoop-zookeeper-dev mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: Dynamic adding/removing ZK servers on client
Date Mon, 03 May 2010 16:34:34 GMT

On 05/03/2010 07:03 AM, Dave Wright wrote:
> I've got a situation where I essentially need dynamic cluster
> membership, which has been talked about in ZOOKEEPER-107 but doesn't
> look like it's going to happen any time soon.

Could you provide some insight into why you need this? Just so we have
additional background, I'm interested to know the use case.

> For now, I'm planning on working around this by having a simple
> coordinator service on the server nodes that will re-write the configs
> and bounce the servers when membership changes. Clients may get
> an error or two and need to reconnect, but that should be handled by
> the normal error logic.

Are you expecting all of the servers to change each time, or just
incremental changes (add/remove a single server, vs. say moving the
entire cluster from 3 hosts a/b/c to x/y/z)?

> On the client side, I'd really like to dynamically update the server
> list w/o having to re-create the entire Zookeeper object. Looking at
> the code, it seems like it would be pretty trivial to add
> "RemoveServer()/AddServer()" functions for Zookeeper that calls down
> to ClientCnxn, where they are just maintained in a list. Of course if
> the server being removed is the one currently connected, we'd need to
> disconnect, but a simple call to disconnect() seems like it would
> resolve that and trigger the automatic re-connection logic.

You would hook this (add/remove) into JMX? That seems like a good option 
to provide.
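
As a rough illustration of the add/remove idea discussed above, here is a
minimal sketch of a mutable server list. The class name DynamicServerList
and its methods are hypothetical and are not part of the ZooKeeper client
API; the sketch only shows the bookkeeping, with the caller expected to
disconnect when the currently connected server is removed.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not an existing ZooKeeper class.
class DynamicServerList {
    private final List<InetSocketAddress> servers = new ArrayList<>();
    private InetSocketAddress current; // server chosen by the last next() call, or null
    private int nextIndex = 0;

    DynamicServerList(List<InetSocketAddress> initial) {
        servers.addAll(initial);
    }

    // Adds a server; returns false if it was already in the list.
    synchronized boolean addServer(InetSocketAddress addr) {
        if (servers.contains(addr)) {
            return false;
        }
        servers.add(addr);
        return true;
    }

    // Removes a server. Returns true only when the removed server is the one
    // currently in use, which tells the caller to disconnect and reconnect.
    synchronized boolean removeServer(InetSocketAddress addr) {
        if (!servers.remove(addr)) {
            return false;
        }
        if (addr.equals(current)) {
            current = null;
            return true;
        }
        return false;
    }

    // Picks the next server to try, round-robin over the current list.
    synchronized InetSocketAddress next() {
        if (servers.isEmpty()) {
            throw new IllegalStateException("no servers configured");
        }
        nextIndex = nextIndex % servers.size();
        current = servers.get(nextIndex);
        nextIndex++;
        return current;
    }
}
```

The same two methods could be exposed through a JMX MBean so an operator
can change the list at runtime; that wiring is left out here.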

Any chance you could use DNS for this? i.e. change the mapping for the
hostname from a -> x IP? Since server a will go down anyway, this
would cause the client to reconnect to b/c (eventually, when the DNS TTL
expires, the client could also connect to x).
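
To make the DNS idea concrete, here is a minimal sketch of a helper that
keeps only hostnames and resolves them again on each reconnect attempt, so
a changed mapping is picked up. The class ReResolvingHosts and its method
resolveNow are hypothetical names, not ZooKeeper client code. Note that
the JVM also caches lookups (see the networkaddress.cache.ttl security
property), so the effective delay depends on that setting as well as the
DNS TTL.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not an existing ZooKeeper class.
class ReResolvingHosts {
    // Hostnames are stored unresolved so no IP address is pinned at startup.
    private final List<InetSocketAddress> unresolved = new ArrayList<>();

    // Parses "host1:port1,host2:port2". The port defaults to 2181 when omitted.
    // Limitation of this sketch: IPv6 literals are not handled.
    ReResolvingHosts(String connectString) {
        for (String hostPort : connectString.split(",")) {
            String[] parts = hostPort.trim().split(":");
            int port = parts.length > 1 ? Integer.parseInt(parts[1]) : 2181;
            unresolved.add(InetSocketAddress.createUnresolved(parts[0], port));
        }
    }

    // Call on every reconnect attempt: performs a fresh lookup per hostname.
    List<InetSocketAddress> resolveNow() {
        List<InetSocketAddress> resolved = new ArrayList<>();
        for (InetSocketAddress host : unresolved) {
            try {
                for (InetAddress ip : InetAddress.getAllByName(host.getHostString())) {
                    resolved.add(new InetSocketAddress(ip, host.getPort()));
                }
            } catch (UnknownHostException e) {
                // Host has no mapping right now; skip it and try the others.
            }
        }
        return resolved;
    }
}
```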

If this is an option be sure to see (a bit of work to do):

You might also look at this patch, we never committed it but it might be 
interesting to you:

The benefit is that you'd only have one place to make the change, esp 
given that clients might be down/unreachable when this change occurs. 
Clients would have to poll this service whenever they get disconnected 
from the ensemble. One drawback of this approach is that the HTTP service now
becomes a potential SPOF. (although I guess you could always fall back 
to something, or potentially have a list of HTTP hosts to do the lookup, 

> Does anyone see an issue with that approach?
> Were I to create the patch, do you think it would be interesting
> enough to merge? It seems like that functionality will eventually be
> needed for whatever full dynamic server support is eventually
> implemented.

It does sound interesting, however once we add something like this it's
hard to change, given that we try very hard to maintain backward
compatibility. If you did the testing and were able to verify it, I don't
see why we couldn't add it, as it's "optional" in the sense that it
would only be called in the use case you describe. I would feel more
confident if we had more concrete detail on how we intend to do 107 (a
basic functional/design doc that at least reviews all the issues), and
how this would fit in. But I don't see why that should necessarily be a
blocker (although others might feel differently).

(fyi it's good to discuss this sort of thing on zookeeper-dev, please 
move responses to that list)

Sounds like a useful project, I'm interested to hear what others think
about it. Regards,

