incubator-cassandra-user mailing list archives

From Dave Revell <d...@meebo-inc.com>
Subject Re: New Chain for : Does Cassandra use vector clocks
Date Thu, 24 Feb 2011 16:59:18 GMT
>Time stamps are not used for conflict resolution - unless it is part of the
>application logic!!!

This is false. In fact, the main reason Cassandra keeps timestamps is to do
conflict resolution. If there is a conflict between two replicas, when doing
a read or a repair, then the highest timestamp always wins.

Example: say your replication factor is 5. So if you read at CL ALL, you
will ask 5 replicas for their value. If the value from only one of these
replicas has a timestamp that is newer than all the rest, this is the value
that will be returned to the client. There is no "voting" scheme where the
most common value wins; conflict resolution is based ONLY on the most
recent timestamp.

(irrelevant aside: in the above example, read repair would occur at the end,
after the different values were detected by the coordinating server)
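
A minimal Python sketch of the rule in the CL ALL example above, in case it
helps. This is not Cassandra's actual code: the Column type and the resolve()
function are hypothetical names made up for illustration. The tie-break on
equal timestamps (compare the raw values, keep the highest) follows Sylvain's
description quoted further down.

```python
from collections import namedtuple

# Hypothetical stand-in for one replica's answer; not a Cassandra type.
Column = namedtuple("Column", ["value", "timestamp"])


def resolve(replica_responses):
    """Pick the winning column among the replica responses.

    The highest timestamp wins. If timestamps are equal, the highest
    value (byte comparison) wins, so every node breaks ties the same way.
    How many replicas hold a given value plays no role.
    """
    return max(replica_responses, key=lambda c: (c.timestamp, c.value))


# RF = 5, read at CL ALL: four replicas hold the old value, one holds a newer one.
responses = [
    Column(b"old", 100),
    Column(b"old", 100),
    Column(b"old", 100),
    Column(b"old", 100),
    Column(b"new", 200),
]
winner = resolve(responses)

# Equal timestamps: the tie is broken by comparing the values themselves.
tie_winner = resolve([Column(b"a", 5), Column(b"b", 5)])
```

Here winner is the single "new" column even though four of the five replicas
disagree with it, which is the "no voting" point made above.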

Clients are free to use the timestamps for their own purposes, but clients
must be careful to choose timestamps that make Cassandra do the right thing
during conflict resolution.
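
One way a client might choose timestamps, sketched in Python. The assumptions
here are mine: timestamps are microseconds since the Unix epoch, and the
TimestampSource class is a hypothetical helper, not part of any Cassandra
client library. It only guarantees that timestamps issued by one process keep
increasing; it does nothing about clock skew between different machines.

```python
import threading
import time


class TimestampSource:
    """Hand out strictly increasing microsecond timestamps within one process."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last = 0

    def next(self):
        # Wall-clock time in microseconds since the epoch (assumed convention).
        now = int(time.time() * 1_000_000)
        with self._lock:
            # If the clock has not advanced (or stepped backwards), bump by one
            # so a later write from this client never gets an older timestamp.
            self._last = max(now, self._last + 1)
            return self._last


source = TimestampSource()
first = source.next()
second = source.next()
```

With this, two back-to-back writes from the same client to the same column
get different timestamps, so the later write is the one that wins.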

Best,
Dave

On Thu, Feb 24, 2011 at 8:34 AM, Anthony John <chirayithaj@gmail.com> wrote:

> >>Time stamps are not used for conflict resolution - unless it is part of
>> the application logic!!!
>>
>
> >>What is your definition of conflict resolution ? Because if you update
> twice the same column (which
> >>I'll call a conflict), then the timestamps are used to decide which
> update wins (which I'll call a resolution).
>
> I understand what you are saying, and yes semantics is very important here.
> And yes we are responding to the immediate questions without covering all
> questions in the thread.
>
> The point being made here is that the timestamp of the column is not used
> by Cassandra to figure out what data to return.
>
> E.g. - Quorum is 2 nodes - and RF of 3 over N1/2/3
> A Quorum  Write comes and add/updates the time stamp (TS2) of a particular
> data element. It succeeds on N1 - fails on N2/3. So the write is returned as
> failed - right ?
> Now Quorum read comes in for exactly the same piece of data that the write
> failed for.
> So N1 has TS2 but both N2/3 have the old TS (say TS1)
> And the read succeeds - Will it return TS1 or TS2.
>
> I submit it will return TS1 - the old TS.
>
> Are we on the same page with this interpretation ?
>
> Regards,
>
> -JA
>
> On Thu, Feb 24, 2011 at 10:12 AM, Sylvain Lebresne <sylvain@datastax.com> wrote:
>
>> On Thu, Feb 24, 2011 at 4:52 PM, Anthony John <chirayithaj@gmail.com> wrote:
>>
>>> Sylvain,
>>>
>>> Time stamps are not used for conflict resolution - unless it is part of
>>> the application logic!!!
>>>
>>
>> What is your definition of conflict resolution ? Because if you update
>> twice the same column (which
>> I'll call a conflict), then the timestamps are used to decide which update
>> wins (which I'll call a resolution).
>>
>>
>>> You can have "lost updates" w/Cassandra. You need to use third-party products
>>> - cages, for example - to get ACID-type consistency.
>>>
>>
>> Then again, you'll have to define what you are calling "lost updates".
>> Provided you use a reasonable consistency level, Cassandra provides fairly
>> strong durability guarantees, so for some definitions you don't "lose
>> updates".
>>
>> That being said, I never claimed that Cassandra provides any ACID
>> guarantees. ACID relates to transactions, which Cassandra doesn't support. If
>> we're talking about the guarantees of transactions, then by all means,
>> Cassandra won't provide them. And yes, you can use cages or the like to get
>> transactions. But that was not the point of the thread, was it ? The thread
>> is about vector clocks, and that has nothing to do with transactions (vector
>> clocks certainly don't give you transactions).
>>
>> Sorry if I wasn't clear in my mail, but I was only responding to why so
>> far I don't think vector clocks would really provide much for Cassandra.
>>
>> --
>> Sylvain
>>
>>
>>> -JA
>>>
>>>
>>> On Thu, Feb 24, 2011 at 7:41 AM, Sylvain Lebresne <sylvain@datastax.com> wrote:
>>>
>>>> On Thu, Feb 24, 2011 at 3:22 AM, Anthony John <chirayithaj@gmail.com> wrote:
>>>>
>>>>> Apologies : For some reason my response on the original mail keeps
>>>>> bouncing back, thus this new one!
>>>>> > On the other hand, the same article says:
>>>>> > "For conditional writes to work, the condition must be evaluated at
>>>>> > all update sites before the write can be allowed to succeed."
>>>>> >
>>>>> > This means, that when doing such an update CL=ALL must be used
>>>>>
>>>>> Sorry, but I am confused by that entire thread!
>>>>>
>>>>> Questions:-
>>>>> 1. Does Cassandra implement any kind of data locking - at any
>>>>> granularity whether it be row/colF/Col ?
>>>>>
>>>>
>>>> No locking, no.
>>>>
>>>>
>>>>> 2. If the answer to 1 above is NO! - how does CL ALL prevent conflicts?
>>>>> Concurrent updates on exactly the same piece of data on different nodes can
>>>>> still mess each other up, right ?
>>>>>
>>>>
>>>> Not sure why you are taking CL.ALL specifically. But at any CL, updating
>>>> the same piece of data means updating the same column value. In that case,
>>>> the resolution rules are the following:
>>>>    - If the updates have different timestamps, keep the one with the
>>>> higher timestamp. That is, the more recent of the two updates wins.
>>>>    - If the timestamps are the same, then compare the values (byte
>>>> comparison) and keep the highest value. This is just to break ties in a
>>>> consistent manner.
>>>>
>>>> So if you do two truly concurrent updates (that is, from two places at the
>>>> same instant), then you'll end up with one of the updates. This is at the
>>>> column level.
>>>>
>>>> However, if that simple conflict detection/resolution mechanism is not
>>>> good enough for some of your use cases and you need to keep two concurrent
>>>> updates, it is easy enough. Just make sure that the updates don't end up in
>>>> the same column. This is easily achieved by appending some unique identifier
>>>> to the column name, for instance. And when reading, do a slice and reconcile
>>>> whatever you get back with whatever logic makes sense. If you do that,
>>>> congrats, you've roughly emulated what vector clocks would do. Btw, no
>>>> locking or anything needed.
>>>>
>>>> In my experience, for most things the timestamp resolution is enough. If
>>>> the same user updates their profile picture twice on your web site in the
>>>> same microsecond, it's usually fine to end up with one of the two pictures.
>>>> In the rare case where you need something more specific, using the Cassandra
>>>> data model usually solves the problem easily. The reason for not having
>>>> vector clocks in Cassandra is that so far, we haven't really found many
>>>> examples where that is not the case.
>>>>
>>>> --
>>>> Sylvain
>>>>
>>>>
>>>
>>
>
