cassandra-client-dev mailing list archives

From Aaron Morton <aa...@thelastpickle.com>
Subject Re: Detecting cluster membership
Date Wed, 01 Sep 2010 03:09:20 GMT
Looks like I've either got the wrong idea here or something is wrong; I'm guessing the former.

describe_ring() does not seem to return the same level of info as nodetool ring. It's including
nodes that are down, e.g. I ran this on a node...

nodetool -h localhost -p 8080 ring
Address         Status State   Load            Token
                                               170141183460469231731687303715884105728
192.168.34.26   Up     Normal  35.07 GB        42535295865117307932921825928971026432
192.168.34.27   Up     Normal  34.63 GB        85070591730234615865843651857942052864
192.168.34.28   Up     Normal  34.89 GB        127605887595351923798765477786913079296
192.168.34.29   Down   Normal  36.55 GB        170141183460469231731687303715884105728

I also ran describe_ring against the same node

[TokenRangeValue(start_token='170141183460469231731687303715884105728',
                 end_token='42535295865117307932921825928971026432',
                 endpoints=['192.168.34.28', '192.168.34.26', '192.168.34.27']),
 TokenRangeValue(start_token='85070591730234615865843651857942052864',
                 end_token='127605887595351923798765477786913079296',
                 endpoints=['192.168.34.28', '192.168.34.29', '192.168.34.26']),
 TokenRangeValue(start_token='127605887595351923798765477786913079296',
                 end_token='170141183460469231731687303715884105728',
                 endpoints=['192.168.34.29', '192.168.34.26', '192.168.34.27']),
 TokenRangeValue(start_token='42535295865117307932921825928971026432',
                 end_token='85070591730234615865843651857942052864',
                 endpoints=['192.168.34.28', '192.168.34.29', '192.168.34.27'])]
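
To make the mismatch concrete, here is a minimal Python sketch that flattens the describe_ring output above into a set of endpoints and compares it with the status column from nodetool ring. The TokenRange namedtuple is only a stand-in for the Thrift struct, and ring_endpoints and nodetool_status are hypothetical names used for illustration, not part of any client library.

```python
from collections import namedtuple

# Stand-in for the Thrift struct shown above (assumption: only these fields matter).
TokenRange = namedtuple("TokenRange", "start_token end_token endpoints")

T1 = "42535295865117307932921825928971026432"
T2 = "85070591730234615865843651857942052864"
T3 = "127605887595351923798765477786913079296"
T4 = "170141183460469231731687303715884105728"

# The four ranges exactly as returned by describe_ring in this message.
ranges = [
    TokenRange(T4, T1, ["192.168.34.28", "192.168.34.26", "192.168.34.27"]),
    TokenRange(T2, T3, ["192.168.34.28", "192.168.34.29", "192.168.34.26"]),
    TokenRange(T3, T4, ["192.168.34.29", "192.168.34.26", "192.168.34.27"]),
    TokenRange(T1, T2, ["192.168.34.28", "192.168.34.29", "192.168.34.27"]),
]

# Status column copied by hand from the nodetool ring output above.
nodetool_status = {
    "192.168.34.26": "Up",
    "192.168.34.27": "Up",
    "192.168.34.28": "Up",
    "192.168.34.29": "Down",
}


def ring_endpoints(token_ranges):
    """Return the sorted, de-duplicated endpoints named in any token range."""
    return sorted({ep for tr in token_ranges for ep in tr.endpoints})


listed = ring_endpoints(ranges)
# Endpoints that describe_ring still lists although nodetool reports them as not Up.
down_but_listed = [ep for ep in listed if nodetool_status.get(ep) != "Up"]
print(down_but_listed)
```

With the data from this message, the only endpoint that falls into down_but_listed is 192.168.34.29, so a client that treats the describe_ring endpoints as "up" would still try to connect to it.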

Need to do some more investigation.

Aaron


On 01 Sep, 2010,at 02:37 PM, Dave Viner <daveviner@pobox.com> wrote:

I know this was recently bashed on the user list, but I still like the idea
of using haproxy (or other load balancer).

I think it would be very useful to have a simple haproxy configuration which
used a list of IPs to start, then dynamically updated itself using
describe_ring() to keep a relatively accurate up-to-the-minute list of which
nodes are available.

I assume this is roughly how the connection pooling classes would attempt
it.
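
A rough Python sketch of that idea, under stated assumptions: NodePool, fetch_ring, and refresh_interval are hypothetical names, and fetch_ring stands in for whatever code connects to a host, calls describe_ring(), and returns the endpoint addresses. The shuffle and round robin follow the client behaviour described further down the thread. This is an illustration, not how any existing pooling class works.

```python
import random
import time


class NodePool:
    """Start from a seed list, then keep the node list fresh via describe_ring."""

    def __init__(self, seeds, fetch_ring, refresh_interval=60.0, clock=time.monotonic):
        self._seeds = list(seeds)
        # Hypothetical callable: host -> list of endpoint addresses
        # (for example a thin wrapper around a Thrift describe_ring call).
        self._fetch_ring = fetch_ring
        self._interval = refresh_interval
        self._clock = clock
        self._nodes = list(seeds)
        self._last_refresh = None
        self._index = 0

    def refresh(self):
        """Ask known nodes, then seeds, for the ring; keep the first answer."""
        for host in self._nodes + self._seeds:
            try:
                endpoints = self._fetch_ring(host)
            except OSError:
                continue  # host unreachable, try the next one
            if endpoints:
                nodes = sorted(set(endpoints))
                random.shuffle(nodes)  # avoid every client using the same order
                self._nodes = nodes
                self._index = 0
                break
        # Record the attempt even if it failed, so we do not retry on every call.
        self._last_refresh = self._clock()

    def next_node(self):
        """Return the next node in round-robin order, refreshing when stale."""
        now = self._clock()
        if self._last_refresh is None or now - self._last_refresh >= self._interval:
            self.refresh()
        node = self._nodes[self._index % len(self._nodes)]
        self._index += 1
        return node
```

Usage would be along the lines of `pool = NodePool(["10.0.0.1", "10.0.0.2"], fetch_ring=my_describe_ring)` followed by `pool.next_node()` before each new connection (the addresses and my_describe_ring are placeholders). Since describe_ring can list nodes that are down, the caller still has to handle a failed connection and move on to the next node.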

Dave Viner


On Tue, Aug 31, 2010 at 7:22 PM, Aaron Morton <aaron@thelastpickle.com> wrote:

> Oops, should be
>
> Round robin DNS is just the *wrong* approach, when doing things like
> draining a node or turning a box off.
>
>
> Aaron
>
> On 01 Sep, 2010,at 02:18 PM, Aaron Morton <aaron@thelastpickle.com> wrote:
>
> I agree.
>
> Round robin DNS is just the right approach, when doing things like draining
> a node or turning a box off.
>
> Am moving to use describe_ring and either a seed list or DNS. My concern
> with using a seed list in the client is the chance that the seeds are down.
> I have 2 seeds in a 4 node cluster, so with both seeds down the cluster
> should still have quorum.
>
> So may go with regular round robin DNS when the client needs to refresh its
> list of nodes, e.g. when starting up.
>
> Have the client hold a list of the up nodes returned from describe_ring(),
> that it shuffles and then round robins. The list would be refreshed
> periodically.
>
> I also have the client periodically obtain a new connection to avoid the
> connections getting clumped in one area of the ring.
>
> (I'm working on an in house Python client that I hope to make public).
>
> Aaron
>
> On 01 Sep, 2010,at 02:04 PM, Dan Washusen <dan@reactive.org> wrote:
>
> Pelops provides a connection pooling impl that uses (or attempts to
> use) the second approach, but to be honest it needs a significant amount of
> testing before I'd be willing to go into production with it...
>
> IMO, the connection pooling/node failure/etc logic is by far the most
> complex part of a client library. It would be excellent if we could avoid
> re-inventing the wheel when attempting to create a solution to solve it.
>
> Cheers,
> Dan
>
> On Wed, Sep 1, 2010 at 11:35 AM, Aaron Morton <aaron@thelastpickle.com> wrote:
>
> > When I first started writing code against the thrift API the FAQ
> > recommended using a round robin DNS to select nodes
> > http://wiki.apache.org/cassandra/FAQ#node_clients_connect_to
> >
> > The other day Ben said something like "well behaved clients use
> > describe_ring to keep track of running nodes".
> > http://www.mail-archive.com/user@cassandra.apache.org/msg05588.html
> >
> > So am wondering what approach people are taking to detecting cluster
> > membership.
> >
> > 1. Round Robin
> >
> > 2. List seeds in app config, connect to a seed, use describe_ring.
> >
> > 3. Round robin and describe_ring
> >
> > One issue I've found with round robin is that if the machine is powered
> > off it can take a while for the network to work out there is no ARP for
> > the IP. This may just be a result of the network here; I have not looked
> > into it too far.
> >
> > cheers
> > Aaron
> >
>
>
