cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ryan King <r...@twitter.com>
Subject Re: Propose new ConsistencyLevel.ALL_AVAIL for reads
Date Thu, 16 Jun 2011 23:50:36 GMT
On Thu, Jun 16, 2011 at 2:12 PM, AJ <aj@dude.podzone.net> wrote:
> On 6/16/2011 2:37 PM, Ryan King wrote:
>>
>> On Thu, Jun 16, 2011 at 1:05 PM, AJ<aj@dude.podzone.net>  wrote:
>
>> <snip>
>>>>
>>>> The Cassandra consistency model is pretty elegant and this type of
>>>> approach breaks that elegance in many ways. It would also only really be
>>>> useful when the value has a high probability of being updated between a
>>>> node
>>>> going down and the value being read.
>>>
>>> I'm not sure what you mean.  A node can be down for days during which
>>> time
>>> the value can be updated.  The intention is to use the nodes available
>>> even
>>> if they fall below the RF.  If there is only 1 node available for
>>> accepting
>>> a replica, that should be enough given the conditions I stated and
>>> updated
>>> below.
>>
>> If this is your constraint, then you should just use CL.ONE.
>>
> My constraint is a CL = "All Available".  So, CL.ONE will not work.

That's a solution, not a requirement. What's your requirement?

>>>>
>>>> Perhaps the simpler approach which is fairly trivial and does not
>>>> require
>>>> any Cassandra change is to simply downgrade your read from ALL to QUORUM
>>>> when you get an unavailable exception for this particular read.
>>>
>>> It's not so trivial, esp since you would have to build that into your
>>> client
>>> at many levels.  I think it would be more appropriate (if this idea
>>> survives) to put it into Cass.
>>>>
>>>> I think the general answerer for 'maximum consistency' is QUORUM
>>>> reads/writes. Based on the fact you are using CL=ALL for reads I assume
>>>> you
>>>> are using CL=ONE for writes: this itself strikes me as a bad idea if you
>>>> require 'maximum consistency for one critical operation'.
>>>>
>>> Very true.  Specifying quorum for BOTH reads/writes provides the 100%
>>> consistency because of the overlapping of the availability numbers.  But,
>>> only if the # of available nodes is not<  RF.
>>
>> No, it will work as long as the available nodes is>= RF/2 + 1
>
> Yes, that's what I meant.  Sorry for any confusion.  Restated: But, only if
> the # of available nodes is not < RF/2 + 1.
>>>
>>> Upon further reflection, this idea can be used for any consistency level.
>>>  The general thrust of my argument is:  If a particular value can be
>>> overwritten by one process regardless of it's prior value, then that
>>> implies
>>> that the value in the down node is no longer up-to-date and can be
>>> disregarded.  Just work with the nodes that are available.
>>>
>>> Actually, now that I think about it...
>>>
>>> ALL_AVAIL guarantees 100% consistency iff the latest timestamp of the
>>> value
>>>>
>>>> latest unavailability time of all unavailable replica nodes for that
>>>
>>> value's row key.  Unavailable is defined as a node's Cass process that is
>>> not reachable from ANY node in the cluster in the same data center.  If
>>> the
>>> node in question is available to at least one node, then the read should
>>> fail as there is a possibility that the value could have been updated
>>> some
>>> other way.
>>
>> Node A can't reliably and consistently know  whether node B and node C
>> can communicate.
>
> Well, theoretically, of course; that's the nature of distributed systems.
>  But, Cass does indeed make that determination when it counts the number
> available replica nodes before it decides if enough replica nodes are
> available.  But, this is obvious to you I'm sure so maybe I don't understand
> your statement.

Consider this scenario: given nodes, A, B and C and A thinks C is down
but B thinks C is up. What do you do? Remember, A doesn't know that B
thinks C is up, it only knows its own state.

-ryan

Mime
View raw message