cassandra-user mailing list archives

From Ryan King <r...@twitter.com>
Subject Re: Propose new ConsistencyLevel.ALL_AVAIL for reads
Date Thu, 16 Jun 2011 20:37:33 GMT
On Thu, Jun 16, 2011 at 1:05 PM, AJ <aj@dude.podzone.net> wrote:
> On 6/16/2011 10:58 AM, Dan Hendry wrote:
>>
>> I think this would add a lot of complexity behind the scenes and be
>> conceptually confusing, particularly for new users.
>
> I'm not so sure about this.  Cass is already somewhat sophisticated and I
> don't see how this could trip up anyone who can already grasp the basics.
>  The only thing I am adding to the CL concept is the concept of available
> replication nodes, versus total replication nodes.  But, don't forget; a
> competitor to Cass is probably in the works this very minute so constant
> improvement is a good thing.

There are already many competitors.

>> The Cassandra consistency model is pretty elegant and this type of
>> approach breaks that elegance in many ways. It would also only really be
>> useful when the value has a high probability of being updated between a node
>> going down and the value being read.
>
> I'm not sure what you mean.  A node can be down for days during which time
> the value can be updated.  The intention is to use the nodes available even
> if they fall below the RF.  If there is only 1 node available for accepting
> a replica, that should be enough given the conditions I stated and updated
> below.

If this is your constraint, then you should just use CL.ONE.

>> Perhaps the simpler approach which is fairly trivial and does not require
>> any Cassandra change is to simply downgrade your read from ALL to QUORUM
>> when you get an unavailable exception for this particular read.
>
> It's not so trivial, esp since you would have to build that into your client
> at many levels.  I think it would be more appropriate (if this idea
> survives) to put it into Cass.
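For reference, the client-side downgrade being discussed can live in a single wrapper rather than at many levels of the client. This is a minimal sketch, assuming a hypothetical read(key, level) callable and a hypothetical UnavailableError; neither is a real Cassandra client API, and the level names are plain strings chosen for illustration.

```python
class UnavailableError(Exception):
    """Hypothetical stand-in for a client's 'not enough replicas' error."""


def read_with_downgrade(read, key, levels=("ALL", "QUORUM")):
    """Try each consistency level in order until one read succeeds.

    `read` is an assumed callable taking (key, level). Returns a tuple
    (level_used, value) so the caller can tell a downgrade happened.
    """
    if not levels:
        raise ValueError("at least one consistency level is required")
    last_error = None
    for level in levels:
        try:
            return level, read(key, level)
        except UnavailableError as exc:
            # Not enough replicas for this level; fall through to the next.
            last_error = exc
    # Every level was unavailable: surface the last failure.
    raise last_error
```

Returning the level that was actually used keeps the downgrade visible to the one critical operation that cares about it.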
>>
>> I think the general answer for 'maximum consistency' is QUORUM
>> reads/writes. Based on the fact you are using CL=ALL for reads I assume you
>> are using CL=ONE for writes: this itself strikes me as a bad idea if you
>> require 'maximum consistency for one critical operation'.
>>
> Very true.  Specifying quorum for BOTH reads/writes provides the 100%
> consistency because of the overlapping of the availability numbers.  But,
> only if the # of available nodes is not < RF.

No, it will work as long as the number of available nodes is >= RF/2 + 1
(integer division), i.e. as long as a quorum of replicas is up.
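The arithmetic can be checked with a small sketch. The function names are illustrative only, not Cassandra APIs; the overlap test is the usual rule that a read of R replicas and a write of W replicas must share at least one replica when R + W > RF.

```python
def quorum(rf: int) -> int:
    """Quorum size for a replication factor: RF/2 + 1 with integer division."""
    return rf // 2 + 1


def quorum_available(rf: int, live_replicas: int) -> bool:
    """QUORUM operations can proceed while at least a quorum of replicas is up."""
    return live_replicas >= quorum(rf)


def reads_overlap_writes(rf: int, read_replicas: int, write_replicas: int) -> bool:
    """True when every read set must intersect every write set (R + W > RF)."""
    return read_replicas + write_replicas > rf
```

With RF = 3 the quorum is 2, so QUORUM reads and writes keep working, and keep overlapping, with one replica down.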

> Upon further reflection, this idea can be used for any consistency level.
>  The general thrust of my argument is:  If a particular value can be
> overwritten by one process regardless of its prior value, then that implies
> that the value in the down node is no longer up-to-date and can be
> disregarded.  Just work with the nodes that are available.
>
> Actually, now that I think about it...
>
> ALL_AVAIL guarantees 100% consistency iff the latest timestamp of the value
> is greater than the latest unavailability time of all unavailable replica
> nodes for that
> value's row key.  Unavailable is defined as a node's Cass process that is
> not reachable from ANY node in the cluster in the same data center.  If the
> node in question is available to at least one node, then the read should
> fail as there is a possibility that the value could have been updated some
> other way.

Node A can't reliably and consistently know whether node B and node C
can communicate.

> After looking at the code, it doesn't look like it will be difficult.
>  Instead of skipping the request for values from the nodes when CL nodes
> aren't available, it would have to go ahead and request the values from the
> available nodes as usual and then look at the timestamps which it does
> anyways and compare it to the latest unavailability time of the relevant
> replica nodes.  The code that keeps track of what nodes are down simply
> records the time it went down.  But, I've only been looking at the code for
> a few days so I'm not claiming to know everything by any stretch.
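To make the proposed read path concrete, here is a minimal sketch of the check described above. It is not Cassandra code: Response, InconsistentReadError, all_avail_read, and the down_since mapping are all hypothetical names, and the sketch simply assumes that each replica's down-time is known accurately and is comparable with write timestamps, which is the assumption questioned earlier in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    replica: str    # name of the live replica that answered
    value: str      # column value it returned
    timestamp: int  # write timestamp of that value


class InconsistentReadError(Exception):
    """Hypothetical error: a down replica might hold a newer value."""


def all_avail_read(responses, down_since):
    """Resolve a read using only the replicas that answered.

    `down_since` maps each unavailable replica to the (assumed accurate)
    time it was marked down. The newest value is accepted only if it was
    written after every unavailable replica went down.
    """
    if not responses:
        raise InconsistentReadError("no replica answered")
    newest = max(responses, key=lambda r: r.timestamp)
    if down_since and newest.timestamp <= max(down_since.values()):
        raise InconsistentReadError("value is older than a replica outage")
    return newest.value
```

The sketch fails the read rather than guessing whenever the newest value predates an outage, since a down replica could then hold a later write.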

-ryan
