incubator-cassandra-user mailing list archives

From Anthony John <chirayit...@gmail.com>
Subject Re: New Chain for : Does Cassandra use vector clocks
Date Thu, 24 Feb 2011 17:33:24 GMT
Completely understand!

All that I am quibbling over is whether a CL of Quorum guarantees
consistency or not. That is what the documentation says, right? If, for a
read at CL Quorum, the returned result depends on which node responds first,
or on other more convoluted conditions, then a Quorum read/write is not
consistent, by any definition.

I can still use Cassandra, and will use it, luv it!!! But let us not make
this statement in the Wiki architecture section:

-------------------------------------------------------------

More specifically: R = read replica count, W = write replica count,
N = replication factor, Q = *QUORUM* (Q = N / 2 + 1)

   - If W + R > N, you will have consistency
      - W=1, R=N
      - W=N, R=1
      - W=Q, R=Q where Q = N / 2 + 1

Cassandra provides consistency when R + W > N (read replica count + write
replica count > replication factor).

----------------------------------------------------
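
As a rough illustration of the arithmetic in the quoted Wiki text, here is a
minimal Python sketch. The function names `quorum` and `overlaps` are made up
for this note and are not part of any Cassandra API. It only checks the
counting argument (a write set of W replicas and a read set of R replicas must
share at least one node when W + R > N); it assumes the write actually reached
W replicas, which is exactly the case under dispute in this thread.

```python
def quorum(n: int) -> int:
    """Quorum size for replication factor n, i.e. Q = N / 2 + 1 with integer division."""
    return n // 2 + 1


def overlaps(w: int, r: int, n: int) -> bool:
    """True when any W replicas and any R replicas out of N must share at least one node."""
    return w + r > n


n = 3
q = quorum(n)
cases = [
    ("W=1, R=N", 1, n),
    ("W=N, R=1", n, 1),
    ("W=Q, R=Q", q, q),
    ("W=1, R=1", 1, 1),  # not in the Wiki list; included as a counterexample
]
for label, w, r in cases:
    print(f"{label}: W+R={w + r}, N={n}, overlap guaranteed: {overlaps(w, r, n)}")
```

The first three combinations satisfy the inequality for N=3; the last one does
not, so a read at R=1 may miss a write made at W=1.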



On Thu, Feb 24, 2011 at 11:22 AM, Sylvain Lebresne <sylvain@datastax.com> wrote:

> On Thu, Feb 24, 2011 at 6:01 PM, Anthony John <chirayithaj@gmail.com> wrote:
>
>> If you are correct and you are probably closer to the code - then CL of
>> Quorum does not guarantee consistency.
>
>
> If the operation succeeds, it does (for one definition of consistency,
> namely that subsequent reads at Quorum are guaranteed to see the new value
> of an update made at Quorum). If it fails, then no, it does not guarantee
> consistency.
>
> It is important to note that the word consistency has multiple meanings. In
> particular, when we are talking of consistency in Cassandra, we are not
> talking of the same definition as the C in ACID (see:
> http://www.allthingsdistributed.com/2007/12/eventually_consistent.html)
>
>>
>> On Thu, Feb 24, 2011 at 10:54 AM, Sylvain Lebresne <sylvain@datastax.com> wrote:
>>
>>> On Thu, Feb 24, 2011 at 5:34 PM, Anthony John <chirayithaj@gmail.com> wrote:
>>>
>>>> >>Time stamps are not used for conflict resolution - unless it is part
>>>> >>of the application logic!!!
>>>>
>>>> >>What is your definition of conflict resolution? Because if you update
>>>> >>the same column twice (which I'll call a conflict), then the timestamps
>>>> >>are used to decide which update wins (which I'll call a resolution).
>>>>
>>>> I understand what you are saying, and yes semantics is very important
>>>> here. And yes we are responding to the immediate questions without covering
>>>> all questions in the thread.
>>>>
>>>> The point being made here is that the timestamp of the column is not
>>>> used by Cassandra to figure out what data to return.
>>>>
>>>
>>> Not quite true.
>>>
>>>
>>>> E.g. - Quorum is 2 nodes - and RF of 3 over N1/2/3
>>>> A Quorum write comes in and adds/updates the timestamp (TS2) of a
>>>> particular data element. It succeeds on N1 - fails on N2/3. So the write
>>>> is returned as failed - right?
>>>> Now Quorum read comes in for exactly the same piece of data that the
>>>> write failed for.
>>>> So N1 has TS2 but both N2/3 have the old TS (say TS1)
>>>> And the read succeeds - Will it return TS1 or TS2.
>>>>
>>>> I submit it will return TS1 - the old TS.
>>>>
>>>
>>> It all depends on which (first 2) nodes respond to the read (since RF=3,
>>> that can be any two of N1/N2/N3). If N1 is one of the two that make up
>>> the quorum, then TS2 will be returned, because Cassandra compares the
>>> timestamps and decides what to return based on them. If N2 and N3 respond,
>>> however, both timestamps will be TS1 and so, after timestamp resolution,
>>> it will still be TS1 that is returned.
>>> So yes, the timestamp is used for conflict resolution.
>>>
>>> In your example, you could get TS1 back because a failed write can leave
>>> your cluster in an inconsistent state. You'd have to retry the quorum
>>> write, and only when it succeeds are you guaranteed that a quorum read
>>> will always return TS2.
>>>
>>> This is because when a write fails, Cassandra doesn't guarantee that the
>>> write did not make it in (there is no revert).
>>>
>>>
>>>>
>>>> Are we on the same page with this interpretation ?
>>>>
>>>> Regards,
>>>>
>>>> -JA
>>>>
>>>> On Thu, Feb 24, 2011 at 10:12 AM, Sylvain Lebresne <
>>>> sylvain@datastax.com> wrote:
>>>>
>>>>> On Thu, Feb 24, 2011 at 4:52 PM, Anthony John <chirayithaj@gmail.com> wrote:
>>>>>
>>>>>> Sylvain,
>>>>>>
>>>>>> Time stamps are not used for conflict resolution - unless it is part
>>>>>> of the application logic!!!
>>>>>>
>>>>>
>>>>> What is your definition of conflict resolution? Because if you update
>>>>> the same column twice (which I'll call a conflict), then the timestamps
>>>>> are used to decide which update wins (which I'll call a resolution).
>>>>>
>>>>>
>>>>>> You can have "lost updates" w/Cassandra. You need to use 3rd-party
>>>>>> products - cages, for example - to get ACID-type consistency.
>>>>>>
>>>>>
>>>>> Then again, you'll have to define what you are calling "lost updates".
>>>>> Provided you use a reasonable consistency level, Cassandra provides
>>>>> fairly strong durability guarantees, so for some definition you don't
>>>>> "lose updates".
>>>>>
>>>>> That being said, I never pretended that Cassandra provided any ACID
>>>>> guarantee. ACID relates to transactions, which Cassandra doesn't
>>>>> support. If we're talking about the guarantees of transactions, then by
>>>>> all means, Cassandra won't provide them. And yes, you can use cages or
>>>>> the like to get transactions. But that was not the point of the thread,
>>>>> was it? The thread is about vector clocks, and that has nothing to do
>>>>> with transactions (vector clocks certainly don't give you transactions).
>>>>>
>>>>> Sorry if I wasn't clear in my mail, but I was only responding to why so
>>>>> far I don't think vector clocks would really provide much for Cassandra.
>>>>>
>>>>> --
>>>>> Sylvain
>>>>>
>>>>>
>>>>>> -JA
>>>>>>
>>>>>>
>>>>>> On Thu, Feb 24, 2011 at 7:41 AM, Sylvain Lebresne <
>>>>>> sylvain@datastax.com> wrote:
>>>>>>
>>>>>>> On Thu, Feb 24, 2011 at 3:22 AM, Anthony John <chirayithaj@gmail.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> Apologies: for some reason my response on the original mail keeps
>>>>>>>> bouncing back, thus this new one!
>>>>>>>> > On the other hand, the same article says:
>>>>>>>> > "For conditional writes to work, the condition must be evaluated
>>>>>>>> > at all update sites before the write can be allowed to succeed."
>>>>>>>> >
>>>>>>>> > This means that when doing such an update, CL=ALL must be used
>>>>>>>>
>>>>>>>> Sorry, but I am confused by that entire thread!
>>>>>>>>
>>>>>>>> Questions:-
>>>>>>>> 1. Does Cassandra implement any kind of data locking - at any
>>>>>>>> granularity whether it be row/colF/Col ?
>>>>>>>>
>>>>>>>
>>>>>>> No locking, no.
>>>>>>>
>>>>>>>
>>>>>>>> 2. If the answer to 1 above is NO! - how does CL ALL prevent
>>>>>>>> conflicts. Concurrent updates on exactly the same piece of data on
>>>>>>>> different nodes can still mess each other up, right ?
>>>>>>>>
>>>>>>>
>>>>>>> Not sure why you are taking CL.ALL specifically. But at any CL,
>>>>>>> updating the same piece of data means the same column value. In that
>>>>>>> case, the resolution rules are the following:
>>>>>>>   - If the updates have different timestamps, keep the one with the
>>>>>>>     higher timestamp. That is, the more recent of two updates wins.
>>>>>>>   - If the timestamps are the same, then it compares the values (byte
>>>>>>>     comparison) and keeps the highest value. This is just to break
>>>>>>>     ties in a consistent manner.
>>>>>>>
>>>>>>> So if you do two truly concurrent updates (that is, from two places
>>>>>>> at the same instant), then you'll end up with one of the updates.
>>>>>>> This is at the column level.
>>>>>>>
>>>>>>> However, if that simple conflict detection/resolution mechanism is
>>>>>>> not good enough for some of your use cases and you need to keep two
>>>>>>> concurrent updates, it is easy enough. Just make sure that the
>>>>>>> updates don't end up in the same column. This is easily achieved by
>>>>>>> appending some unique identifier to the column name, for instance.
>>>>>>> And when reading, do a slice and reconcile whatever you get back with
>>>>>>> whatever logic makes sense. If you do that, congrats, you've roughly
>>>>>>> emulated what vector clocks would do. Btw, no locking or anything
>>>>>>> needed.
>>>>>>>
>>>>>>> In my experience, for most things the timestamp resolution is enough.
>>>>>>> If the same user updates its profile picture twice on your web site
>>>>>>> in the same microsecond, it's usually fine to end up with one of the
>>>>>>> two pictures. In the rare case where you need something more specific,
>>>>>>> using the Cassandra data model usually solves the problem easily. The
>>>>>>> reason for not having vector clocks in Cassandra is that so far, we
>>>>>>> haven't really found many examples where that is not the case.
>>>>>>>
>>>>>>> --
>>>>>>> Sylvain
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
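
The resolution rules and the N1/N2/N3 scenario quoted in this thread can be
sketched in a few lines of Python. This is a toy model, not Cassandra code:
the `Column` type, the `reconcile` and `quorum_read` functions, and the
concrete values of TS1/TS2 are all assumptions made up for illustration. It
applies the two rules as described (higher timestamp wins, ties broken by byte
comparison of the values) to every pair of replicas that could answer a Quorum
read after a write that only landed on N1.

```python
from itertools import combinations
from typing import NamedTuple


class Column(NamedTuple):
    timestamp: int
    value: bytes


def reconcile(a: Column, b: Column) -> Column:
    """Higher timestamp wins; on equal timestamps the larger value (byte comparison) wins."""
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    return a if a.value >= b.value else b


TS1, TS2 = 1, 2  # hypothetical timestamps, with TS2 the newer one

# State after the failed Quorum write: it reached N1 only and was not reverted.
replicas = {
    "N1": Column(TS2, b"new"),
    "N2": Column(TS1, b"old"),
    "N3": Column(TS1, b"old"),
}


def quorum_read(responders):
    """Reconcile the answers of the two replicas that responded first."""
    first, second = (replicas[name] for name in responders)
    return reconcile(first, second)


# Try every pair of replicas that could form the read quorum.
results = {pair: quorum_read(pair) for pair in combinations(sorted(replicas), 2)}
for pair, column in results.items():
    print(pair, "-> timestamp", column.timestamp, "value", column.value)
```

In this model the pairs that include N1 return the TS2 value and the N2/N3
pair returns the TS1 value, which matches the behaviour described above for a
write that failed at Quorum and was not retried.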
