hadoop-zookeeper-dev mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: Dynamic adding/removing ZK servers on client
Date Mon, 03 May 2010 16:34:34 GMT

On 05/03/2010 07:03 AM, Dave Wright wrote:
> I've got a situation where I essentially need dynamic cluster
> membership, which has been talked about in ZOOKEEPER-107 but doesn't
> look like it's going to happen any time soon.
>

Could you provide some insight into why you need this? I'm interested to
know the use case, just so we have additional background.

> For now, I'm planning on working around this by having a simple
> coordinator service on the server nodes that will re-write the configs
> and bounce the servers when membership changes. Clients may get
> an error or two and need to reconnect, but that should be handled by
> the normal error logic.
>

Are you expecting all of the servers to change each time, or just
incremental changes (add/remove a single server, vs. say moving the entire
cluster from 3 hosts a/b/c to x/y/z)?

> On the client side, I'd really like to dynamically update the server
> list w/o having to re-create the entire Zookeeper object. Looking at
> the code, it seems like it would be pretty trivial to add
> "RemoveServer()/AddServer()" functions for Zookeeper that calls down
> to ClientCnxn, where they are just maintained in a list. Of course if
> the server being removed is the one currently connected, we'd need to
> disconnect, but a simple call to disconnect() seems like it would
> resolve that and trigger the automatic re-connection logic.
>

You would hook this (add/remove) into JMX? That seems like a good option 
to provide.
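
As a rough illustration of the add/remove idea discussed above, here is a minimal sketch. ServerList and all of its methods are hypothetical names made up for this example; they are not part of the ZooKeeper client API, and the real ClientCnxn internals may look quite different. The sketch only shows the bookkeeping: keep a mutable list, and tell the caller when the removed server is the one currently connected so it can disconnect and let the reconnect logic pick another.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for a client's server list (not a ZooKeeper class).
class ServerList {
    private final List<InetSocketAddress> servers = new ArrayList<>();
    private InetSocketAddress current; // server we are connected to, or null

    // Adds a server; returns false if it was already in the list.
    synchronized boolean addServer(InetSocketAddress addr) {
        if (servers.contains(addr)) {
            return false;
        }
        return servers.add(addr);
    }

    // Removes a server; returns true when the caller must disconnect
    // because the removed server is the one currently connected.
    synchronized boolean removeServer(InetSocketAddress addr) {
        servers.remove(addr);
        if (addr.equals(current)) {
            current = null;
            return true;
        }
        return false;
    }

    // Records which server the client is currently connected to.
    synchronized void markConnected(InetSocketAddress addr) {
        current = addr;
    }

    // Returns a copy so callers can iterate without holding the lock.
    synchronized List<InetSocketAddress> snapshot() {
        return new ArrayList<>(servers);
    }
}
```

A JMX bean (or any other management hook) could then simply delegate to addServer/removeServer and call the client's disconnect path when removeServer returns true.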

Any chance you could use DNS for this? i.e., change the mapping for the
hostname a so that it points to x's IP? Since server a will go down anyway,
this would cause the client to reconnect to b/c (and eventually, once the
DNS TTL expires, the client could also connect to x).
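
To make the DNS idea concrete, here is a minimal sketch under some assumptions: DnsServerResolver is a hypothetical helper (not ZooKeeper code) that the client would call on every reconnect attempt instead of resolving once at startup. It uses only standard JDK calls. Note that the JVM keeps its own DNS cache, controlled by the networkaddress.cache.ttl security property, so the DNS TTL alone may not be enough.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.security.Security;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: re-resolve a hostname each time we reconnect.
class DnsServerResolver {

    // Resolves every address currently mapped to the host.
    static List<InetSocketAddress> resolve(String host, int port)
            throws UnknownHostException {
        List<InetSocketAddress> result = new ArrayList<>();
        for (InetAddress ip : InetAddress.getAllByName(host)) {
            result.add(new InetSocketAddress(ip, port));
        }
        return result;
    }

    // Caps the JVM's positive DNS cache lifetime. This should be called
    // early, before the first lookup, or it may have no effect.
    static void limitDnsCache(int seconds) {
        Security.setProperty("networkaddress.cache.ttl",
                Integer.toString(seconds));
    }
}
```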

If this is an option, be sure to see these issues (there is still a bit of work to do):
https://issues.apache.org/jira/browse/ZOOKEEPER-328
https://issues.apache.org/jira/browse/ZOOKEEPER-338

You might also look at this patch. We never committed it, but it might be
interesting to you:
https://issues.apache.org/jira/browse/ZOOKEEPER-146

The benefit of that approach, where clients look up the server list from
an HTTP service, is that you'd only have one place to make the change,
especially given that clients might be down/unreachable when the change
occurs. Clients would have to poll this service whenever they get
disconnected from the ensemble. One drawback is that the HTTP service now
becomes a potential SPOF (although I guess you could always fall back to
something, or potentially have a list of HTTP hosts to do the lookup,
etc...).
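
A minimal sketch of the lookup-with-fallback idea, under stated assumptions: ServerListLookup, its constructor arguments, and the "host:port,host:port" response format are all hypothetical and are not taken from the ZOOKEEPER-146 patch. The actual HTTP call is left to a caller-supplied function so the fallback logic stays visible. Trying several lookup URLs in turn, then a static list, is one way to soften the SPOF concern.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical lookup: ask each URL for the server list, fall back if all fail.
class ServerListLookup {
    private final List<String> lookupUrls;
    // Assumed contract: returns a connect string such as "a:2181,b:2181",
    // or throws a RuntimeException if the URL cannot be reached.
    private final Function<String, String> fetcher;
    private final String fallbackConnectString;

    ServerListLookup(List<String> lookupUrls,
                     Function<String, String> fetcher,
                     String fallbackConnectString) {
        this.lookupUrls = lookupUrls;
        this.fetcher = fetcher;
        this.fallbackConnectString = fallbackConnectString;
    }

    // Intended to be called whenever the client is disconnected.
    String currentConnectString() {
        for (String url : lookupUrls) {
            try {
                String result = fetcher.apply(url);
                if (result != null && !result.trim().isEmpty()) {
                    return result.trim();
                }
            } catch (RuntimeException e) {
                // This lookup host is unavailable; try the next one.
            }
        }
        // Every lookup failed: use the statically configured list.
        return fallbackConnectString;
    }
}
```

A real fetcher could be built on java.net.HttpURLConnection, wrapping I/O errors in a RuntimeException.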

> Does anyone see an issue with that approach?
> Were I to create the patch, do you think it would be interesting
> enough to merge? It seems like that functionality will eventually be
> needed for whatever full dynamic server support is eventually
> implemented.

It does sound interesting; however, once we add something like this it's
hard to change, given that we try very hard to maintain backward
compatibility. If you did the testing and were able to verify it, I don't
see why we couldn't add it, as it's "optional" in the sense that it
would only be called in the use case you describe. I would feel more
confident if we had more concrete detail on how we intend to do 107 (a
basic functional/design doc that at least reviews all the issues), and
how this would fit in. But I don't see that this should necessarily be a
blocker (although others might feel differently).

(fyi it's good to discuss this sort of thing on zookeeper-dev, please 
move responses to that list)

Sounds like a useful project, I'm interested to hear what others think
about it. Regards,

Patrick
