cassandra-client-dev mailing list archives

From Aaron Morton <>
Subject Re: Detecting cluster membership
Date Wed, 01 Sep 2010 02:18:55 GMT
I agree. 

Round robin DNS is just the right approach when doing things like draining a node or turning
a box off.

Am moving to use describe_ring and either a seed list or DNS. My concern with using a seed
list in the client is the chance that the seeds are down. I have 2 seeds in a 4 node cluster,
so with both seeds down the cluster should still have quorum.

So may go with regular round robin DNS when the client needs to refresh its list of nodes, e.g.
when starting up.
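
As a rough illustration of the startup step, a client could ask DNS for every
address behind the round robin name instead of just the first one. This is only
a sketch: resolve_all is a hypothetical helper, the injectable resolver argument
exists purely so it can be exercised without a network, and 9160 is assumed to
be the Thrift port in use.

```python
import socket


def resolve_all(hostname, port=9160, resolver=socket.getaddrinfo):
    """Return every distinct address the DNS name resolves to, in order."""
    seen = []
    # type=SOCK_STREAM avoids one entry per socket type for the same address.
    for _family, _type, _proto, _canon, sockaddr in resolver(
        hostname, port, type=socket.SOCK_STREAM
    ):
        address = sockaddr[0]  # sockaddr is (host, port, ...) for IPv4 and IPv6
        if address not in seen:
            seen.append(address)
    return seen
```

The client can then try each returned address in turn until one accepts a
connection, and call describe_ring() on that node.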

The client would hold a list of the up nodes returned from describe_ring(), which it shuffles
and then round robins through. The list would be refreshed periodically.
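
A minimal sketch of that node list, not the actual client code. NodeList,
fetch_nodes, refresh_secs and clock are all hypothetical names; fetch_nodes
stands in for whatever wraps describe_ring() and returns a list of host strings.

```python
import random
import time


class NodeList:
    """Holds the nodes reported as up, shuffled, and hands them out round robin."""

    def __init__(self, fetch_nodes, refresh_secs=60.0, clock=time.monotonic):
        # fetch_nodes: hypothetical callable returning a list of host strings,
        # e.g. a wrapper around describe_ring().
        self._fetch_nodes = fetch_nodes
        self._refresh_secs = refresh_secs
        self._clock = clock
        self._nodes = []
        self._index = 0
        self._last_refresh = None

    def _refresh(self):
        nodes = list(self._fetch_nodes())
        if nodes:
            # Shuffle so that many clients do not all walk the ring in the
            # same order. An empty result keeps the previous list.
            random.shuffle(nodes)
            self._nodes = nodes
            self._index = 0
        self._last_refresh = self._clock()

    def next_node(self):
        now = self._clock()
        if self._last_refresh is None or now - self._last_refresh >= self._refresh_secs:
            self._refresh()
        if not self._nodes:
            raise RuntimeError("no nodes available")
        node = self._nodes[self._index % len(self._nodes)]
        self._index += 1
        return node
```

Errors raised by fetch_nodes are left to propagate here; a real client would
probably want to catch them and fall back to the seed list or DNS.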

I also have the client periodically obtain a new connection to avoid the connections getting
clumped in one area of the ring. 
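
One way to sketch the periodic reconnect, under assumptions: pick_node and
connect are hypothetical callables (for example a round robin node picker and a
function that opens a Thrift connection to a host), and the connection object
is assumed to have a close() method. The max_age_secs value is arbitrary.

```python
import time


class RecyclingConnection:
    """Hands out one connection and swaps it for a fresh one once it gets old."""

    def __init__(self, pick_node, connect, max_age_secs=300.0, clock=time.monotonic):
        self._pick_node = pick_node        # hypothetical: returns the next host
        self._connect = connect            # hypothetical: host -> connection
        self._max_age_secs = max_age_secs
        self._clock = clock
        self._conn = None
        self._opened_at = None

    def get(self):
        now = self._clock()
        if self._conn is None or now - self._opened_at >= self._max_age_secs:
            self._replace(now)
        return self._conn

    def _replace(self, now):
        old = self._conn
        # Open the new connection first so a failure leaves the old one usable.
        self._conn = self._connect(self._pick_node())
        self._opened_at = now
        if old is not None:
            old.close()
```

Because each replacement goes through pick_node, a long-lived client slowly
moves around the ring instead of staying on the node it first connected to.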

(I'm working on an in-house Python client that I hope to make public).


On 01 Sep, 2010, at 02:04 PM, Dan Washusen <> wrote:

Pelops provides a connection pooling impl that's using (or attempting to
use) the second approach, but to be honest it needs a significant amount of
testing before I'd be willing to go into production with it...

IMO, the connection pooling/node failure/etc logic is by far the most
complex part of a client library. It would be excellent if we could avoid
re-inventing the wheel when attempting to create a solution to solve it.


On Wed, Sep 1, 2010 at 11:35 AM, Aaron Morton <> wrote:

> When I first started writing code against the thrift API the FAQ
> recommended using a round robin DNS to select nodes.
> The other
> day Ben said something like "well behaved clients use describe_ring to keep
> track of running nodes".
> So am wondering what approach people are taking to detecting cluster
> membership.
> 1. Round Robin
> 2. List seeds in app config, connect to a seed, use describe_ring.
> 3. Round robin and describe_ring
> One issue I've found with round robin is that if the machine is powered
> off it can take a while for the network to work out there is no ARP for the
> IP. This may just be a result of the network here, have not looked into it
> too far.
> cheers
> Aaron
