The FAQ lists round-robin DNS as the recommended way to find a node to connect to:
http://wiki.apache.org/cassandra/FAQ#node_clients_connect_to

As you say, your clients need to retry anyway. I have them hold a connection for a while (on the scale of minutes), then hit the DNS again and acquire a new connection. This lets them pick up new nodes and (I think, over time) helps keep connections balanced around the cluster.
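
A minimal sketch of that recycling pattern in Python, not taken from any particular client library. The class name, the `connect` callable and the 300 second default are assumptions for illustration; `connect(ip, port)` stands in for whatever your client uses to open a connection, and the returned object is assumed to have a `close()` method.

```python
import random
import socket
import time


def resolve_all(hostname, port):
    """Return the distinct IP addresses the DNS name currently resolves to."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


class RecyclingConnection:
    """Holds one connection and replaces it once it is older than max_age seconds."""

    def __init__(self, hostname, port, connect, max_age=300.0,
                 resolve=resolve_all, clock=time.monotonic):
        self.hostname = hostname
        self.port = port
        self.connect = connect    # hypothetical: callable(ip, port) -> connection
        self.max_age = max_age    # assumed default of 5 minutes
        self.resolve = resolve
        self.clock = clock
        self.conn = None
        self.opened_at = None

    def get(self):
        """Return the current connection, recycling it first if it is too old."""
        now = self.clock()
        if self.conn is None or now - self.opened_at >= self.max_age:
            if self.conn is not None:
                self.conn.close()
            # Hit the DNS again so that newly added nodes become candidates.
            ip = random.choice(self.resolve(self.hostname, self.port))
            self.conn = self.connect(ip, self.port)
            self.opened_at = now
        return self.conn
```

Callers ask for `get()` before each request instead of keeping their own reference, so the switch to a fresh node happens between requests rather than in the middle of one.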

If a node goes down for a short time, it should not have too much of an effect on the clients. If you are taking a node out of the cluster, you will need to update the DNS to remove it.

Aaron


On 10 Aug, 2010, at 08:51 AM, Carsten Krebs <carsten.krebs@gmx.net> wrote:


On 08.08.2010, at 14:47 aaron morton wrote:
>
> What sort of client side load balancing were you thinking of? I just use round robin DNS to distribute clients around the cluster, and have them recycle their connections every so often.
>
I was thinking of using this method to give the client the ability to "learn" which nodes are part of the cluster, and of using that information to automatically adapt the set of nodes the client uses when a node is added to or removed from the cluster.

Why do you prefer round robin DNS for load balancing?
One advantage I see is that the client does not have to take care of the node set, and especially not of managing it. The reason I was thinking about client side load balancing was to avoid having to write additional tools that monitor all nodes in the cluster and change the DNS entry if any node fails, and do so as fast as possible to keep the clients from trying to use a dead node. But while writing this, I no longer think that is a good argument. It just comes down to some sort of retry logic, which the client needs anyway.
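
For illustration, the kind of retry logic meant here could look roughly like the Python sketch below: resolve every address behind the round-robin name and try them in random order until one accepts a connection. The function names, the 2 second timeout and the injectable `resolve`/`connect` parameters are assumptions made for this example, not part of any Cassandra client.

```python
import random
import socket


def resolve_all(hostname, port):
    """Return the distinct IP addresses the DNS name currently resolves to."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def connect_any(hostname, port, timeout=2.0,
                resolve=resolve_all, connect=socket.create_connection):
    """Try each address behind the DNS name until one accepts a connection."""
    addresses = list(resolve(hostname, port))
    random.shuffle(addresses)  # spread clients across the nodes
    last_error = None
    for ip in addresses:
        try:
            return connect((ip, port), timeout)
        except OSError as exc:
            # Dead or unreachable node: remember the error, try the next one.
            last_error = exc
    raise ConnectionError(
        f"no node behind {hostname} accepted a connection"
    ) from last_error
```

With this in place, a dead node that is still listed in the DNS only costs the client one timeout before it moves on to the next address.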

Carsten