On 05/03/2010 07:03 AM, Dave Wright wrote:
> I've got a situation where I essentially need dynamic cluster
> membership, which has been talked about in ZOOKEEPER-107 but doesn't
> look like it's going to happen any time soon.
>
Could you provide some insight into why you need this? I'm interested in
the use case, just so we have additional background.
> For now, I'm planning on working around this by having a simple
> coordinator service on the server nodes that will re-write the configs
> and bounce the servers when membership changes. Clients may get
> an error or two and need to reconnect, but that should be handled by
> the normal error logic.
>
Are you expecting all of the servers to change each time, or just
incremental changes (add/remove a single server, vs. say moving the entire
cluster from 3 hosts a/b/c to x/y/z)?
> On the client side, I'd really like to dynamically update the server
> list w/o having to re-create the entire Zookeeper object. Looking at
> the code, it seems like it would be pretty trivial to add
> "RemoveServer()/AddServer()" functions for Zookeeper that call down
> to ClientCnxn, where they are just maintained in a list. Of course if
> the server being removed is the one currently connected, we'd need to
> disconnect, but a simple call to disconnect() seems like it would
> resolve that and trigger the automatic re-connection logic.
>
You would hook this (add/remove) into JMX? That seems like a good option
to provide.
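For reference, the client-side idea quoted above can be sketched roughly
as follows. This is only a sketch under assumptions: DynamicServerList and
all of its methods are hypothetical names, not part of the ZooKeeper
client, and disconnect() here just clears the current server where a real
client would close its socket and let the reconnect logic run.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; none of these names exist in the ZooKeeper client.
class DynamicServerList {
    private final List<InetSocketAddress> servers = new ArrayList<>();
    private InetSocketAddress current; // server we are "connected" to, or null
    private int disconnects = 0;       // forced disconnects, kept for illustration

    // Add a server unless it is already listed.
    synchronized void addServer(InetSocketAddress addr) {
        if (!servers.contains(addr)) {
            servers.add(addr);
        }
    }

    // Remove a server; if it is the one in use, drop the connection so the
    // normal reconnect path picks another. Returns false if it was not listed.
    synchronized boolean removeServer(InetSocketAddress addr) {
        boolean removed = servers.remove(addr);
        if (removed && addr.equals(current)) {
            disconnect();
        }
        return removed;
    }

    // Stand-in for the reconnect logic: round-robin over the current list.
    synchronized InetSocketAddress connectToNext() {
        if (servers.isEmpty()) {
            current = null;
            return null;
        }
        int idx = (current == null) ? 0 : (servers.indexOf(current) + 1) % servers.size();
        current = servers.get(idx);
        return current;
    }

    // Assumption: a real client would close its socket here.
    private void disconnect() {
        current = null;
        disconnects++;
    }

    synchronized InetSocketAddress current() { return current; }
    synchronized int size() { return servers.size(); }
    synchronized int disconnectCount() { return disconnects; }
}
```

Removing a server that is not in use leaves the connection alone; only
removing the current one forces a reconnect.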
Any chance you could use DNS for this? i.e., change the mapping for
hostname a to x's IP? Since server a will go down anyway, this would
cause the client to reconnect to b/c (and eventually, once the DNS TTL
expires, the client could also connect to x).
If this is an option, be sure to see these (there is a bit of work to do):
https://issues.apache.org/jira/browse/ZOOKEEPER-328
https://issues.apache.org/jira/browse/ZOOKEEPER-338
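To illustrate what the DNS option needs from the client side: the
hostname has to be resolved again on each reconnect attempt rather than
once at startup. A minimal sketch, assuming a hypothetical
ReResolvingEndpoint class (not ZooKeeper code) with a pluggable resolver
that defaults to InetAddress.getAllByName:

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical sketch only; this class is not part of the ZooKeeper client.
class ReResolvingEndpoint {
    // Pluggable lookup so the behaviour can be exercised without real DNS.
    interface Resolver {
        InetAddress[] resolve(String host) throws UnknownHostException;
    }

    private final String host;
    private final int port;
    private final Resolver resolver;

    ReResolvingEndpoint(String host, int port) {
        this(host, port, InetAddress::getAllByName);
    }

    ReResolvingEndpoint(String host, int port, Resolver resolver) {
        this.host = host;
        this.port = port;
        this.resolver = resolver;
    }

    // Call this on every (re)connect attempt. Keeping the hostname and
    // resolving it late is what lets a changed a -> x mapping take effect.
    InetSocketAddress nextAddress() throws UnknownHostException {
        InetAddress[] addrs = resolver.resolve(host);
        if (addrs.length == 0) {
            throw new UnknownHostException(host);
        }
        return new InetSocketAddress(addrs[0], port);
    }
}
```

Keep in mind that the JVM caches successful lookups as well (see the
networkaddress.cache.ttl security property), so the DNS TTL is not the
only delay before a client sees the new address.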
You might also look at this patch; we never committed it, but it might be
interesting to you:
https://issues.apache.org/jira/browse/ZOOKEEPER-146
The benefit of that approach is that you'd only have one place to make the
change, esp. given that clients might be down/unreachable when the change
occurs. Clients would have to poll this service whenever they get
disconnected from the ensemble. One drawback is that the HTTP service now
becomes a potential SPOF (although I guess you could always fall back to
something, or potentially have a list of HTTP hosts to do the lookup,
etc.).
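The fallback idea could look something like the sketch below. Everything
in it is an assumption for illustration: ServerListLookup is a
hypothetical class, the fetcher is an injected function (URL in, response
body out, empty on failure) rather than a real HTTP call, and the body is
assumed to be a comma-separated host:port list like a normal connect
string. It is not a description of what the ZOOKEEPER-146 patch does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch only; not taken from any ZooKeeper patch.
class ServerListLookup {
    private final List<String> lookupUrls;
    private final Function<String, Optional<String>> fetcher;
    private List<String> lastKnown; // used when every lookup host fails

    ServerListLookup(List<String> lookupUrls,
                     List<String> initialServers,
                     Function<String, Optional<String>> fetcher) {
        this.lookupUrls = new ArrayList<>(lookupUrls);
        this.lastKnown = new ArrayList<>(initialServers);
        this.fetcher = fetcher;
    }

    // Call after a disconnect: try each lookup host in order and fall back
    // to the last list that worked if none of them answers usefully.
    List<String> currentServers() {
        for (String url : lookupUrls) {
            Optional<String> body = fetcher.apply(url);
            if (body.isPresent()) {
                List<String> parsed = parse(body.get());
                if (!parsed.isEmpty()) {
                    lastKnown = parsed;
                    return parsed;
                }
            }
        }
        return lastKnown;
    }

    // Assumed format: "host1:2181,host2:2181,host3:2181".
    static List<String> parse(String connectString) {
        List<String> out = new ArrayList<>();
        for (String part : connectString.split(",")) {
            String s = part.trim();
            if (!s.isEmpty()) {
                out.add(s);
            }
        }
        return out;
    }
}
```

With more than one lookup URL the service is no longer a single point of
failure, and the last known list covers the case where all of them are
down.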
> Does anyone see an issue with that approach?
> Were I to create the patch, do you think it would be interesting
> enough to merge? It seems like that functionality will eventually be
> needed for whatever full dynamic server support is eventually
> implemented.
It does sound interesting; however, once we add something like this it's
hard to change, given that we try very hard to maintain backward
compatibility. If you did the testing and were able to verify it, I don't
see why we couldn't add it, as it's "optional" in the sense that it
would only be called in the use case you describe. I would feel more
confident if we had more concrete detail on how we intend to do 107 (a
basic functional/design doc that at least reviews all the issues), and
how this would fit in. But I don't see that this should necessarily be a
blocker (although others might feel differently).
(fyi it's good to discuss this sort of thing on zookeeper-dev, please
move responses to that list)
Sounds like a useful project; I'm interested to hear what others think
about it. Regards,
Patrick