cassandra-user mailing list archives

From Sylvain Lebresne <sylv...@datastax.com>
Subject Re: Cassandra ACID
Date Fri, 24 Jun 2011 09:49:54 GMT
On Fri, Jun 24, 2011 at 9:11 AM, Peter Schuller
<peter.schuller@infidyne.com> wrote:
>> Atomicity
>> All individual writes are atomic at the row level.  So, a batch mutate for
>> one specific key will apply updates to all the columns for that one specific
>> row atomically.  If part of the single-key batch update fails, then all of
>> the updates will be reverted since they all pertained to one key/row.
>> Notice, I said 'reverted' not 'rolled back'.  Note: atomicity and isolation
>> are related to the topic of transactions but one does not imply the other.
>> Even though row updates are atomic, they are not isolated from other users'
>> updates or reads.
>> Refs: http://wiki.apache.org/cassandra/FAQ#batch_mutate_atomic
>
> Atomicity is sort of provided, but there's no reversion going on.
> Cassandra does validation of batch mutations prior to their
> application and then tries to apply it. In the absence of bugs in
> Cassandra, it should generally be safe to say that writes are then
> guaranteed to succeed. However I wouldn't necessarily rely on this
> type of atomicity to the same level that I would in e.g. PostgreSQL.
>
> One example of violated atomicity is when you run with periodic commit
> log mode instead of batch wise. If you for example perform a write on
> CL.ONE but the node that took the write got killed (eg SIGKILL) before
> the periodic commit log flush, you will have eaten a write that then
> gets dropped. If someone read the changes that the write entails, the
> application-visible behavior will be that the write will be "undone"
> rather than eventually done.

I will disagree with "atomicity is sort of provided". I think your violation
example is a violation of durability, not atomicity (a.k.a. indivisibility).

We do always provide atomicity of updates in the same batch_mutate call
under a given key. Which means that for a given key, all updates in the batch
will be applied, or none of them. This is *always* true and does not depend
on the commit log (and granted, if the write times out, you won't know which
one it is, but you are still guaranteed that it is either all or none).
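
To make the all-or-nothing claim concrete, here is a toy Python sketch. It is
not Cassandra's code: `validate`, `apply_batch` and the dict-based `store` are
made-up names, and the "validate everything first, then apply" shape is only
an assumption based on Peter's description above.

```python
from typing import Dict

Row = Dict[str, str]


def validate(updates: Row) -> None:
    """Reject the whole batch up front if any column is invalid."""
    for name in updates:
        if not name:
            raise ValueError("empty column name")


def apply_batch(store: Dict[str, Row], key: str, updates: Row) -> None:
    """Apply every column update for `key`, or none of them."""
    validate(updates)  # may raise; nothing has been written at this point
    row = store.setdefault(key, {})
    for name, value in updates.items():
        row[name] = value  # past validation, each step is assumed to succeed


store: Dict[str, Row] = {"user:1": {"name": "alice"}}
apply_batch(store, "user:1", {"email": "a@example.com", "city": "Paris"})

try:
    apply_batch(store, "user:1", {"phone": "123", "": "bad"})
except ValueError:
    pass  # rejected as a whole: "phone" was not written either
```

The second batch is refused before any column is touched, so the row never
ends up with "phone" but without its sibling column.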

That being said, we do not provide isolation, which means in particular that
reads *can* return a state where only part of a batch update seems applied
(and it would clearly be cool to have isolation, and I'm not even saying this
will never happen). But atomicity guarantees that even though you may observe
such a state (and honestly the window during which you can is uber small),
eventually you will observe that all the updates have been applied (or none,
if you're in the business of questioning durability (see below), but never
"part of").
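
The difference between atomicity and isolation can be shown with a small
deterministic Python sketch. Again this is a toy model, not how Cassandra
applies mutations; `apply_batch_steps` is a hypothetical helper that pauses
after each column so a "reader" can look at the row mid-batch.

```python
from typing import Dict, Iterator, List


def apply_batch_steps(row: Dict[str, str], updates: Dict[str, str]) -> Iterator[None]:
    """Apply updates one column at a time, pausing after each column."""
    for name, value in updates.items():
        row[name] = value
        yield  # a concurrent reader could run here


row = {"first": "old", "last": "old"}
observed: List[Dict[str, str]] = []

steps = apply_batch_steps(row, {"first": "new", "last": "new"})
next(steps)                 # only the first column has been applied
observed.append(dict(row))  # no isolation: the reader sees a partial state

for _ in steps:             # let the batch run to completion
    pass
observed.append(dict(row))  # atomicity: eventually every column is applied
```

The first snapshot mixes old and new columns, the second has all of the new
ones. What never happens in this model is the batch stopping halfway for good.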

As for durability, it is true that in periodic commit log mode, durability on a
single node is subject to a small window of time. But true, serious durability
in the real world really only comes from replication, and that's why we use
periodic mode for the commit log by default (and you can always switch to batch
if you so wish). Which is not to say that Peter's statement is technically
wrong, but if what we're doing is assessing Cassandra's durability, I'll argue
that because it does replication well (including across data centers) while
still having strong single-node durability guarantees, it has one of the best
durability stories out there (even with periodic commit log).
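
A rough Python sketch of that argument, under heavy simplification: the `Node`
class, its `buffer`/`disk` lists and the `kill` method are invented for
illustration, and "periodic" versus "batch" here only mimics the idea of
syncing the commit log later versus syncing before the write returns.

```python
from typing import List


class Node:
    """Toy replica: writes land in a buffer and reach 'disk' on sync."""

    def __init__(self, sync_mode: str) -> None:
        self.sync_mode = sync_mode  # "periodic" or "batch"
        self.buffer: List[str] = []
        self.disk: List[str] = []

    def write(self, value: str) -> None:
        self.buffer.append(value)
        if self.sync_mode == "batch":
            self.sync()  # batch mode: sync before the write returns

    def sync(self) -> None:
        self.disk.extend(self.buffer)
        self.buffer.clear()

    def kill(self) -> None:
        self.buffer.clear()  # e.g. SIGKILL: unsynced writes are gone


# One node, periodic mode, killed before the next sync: the write is lost.
single = Node("periodic")
single.write("w1")
single.kill()
lost_on_single_node = "w1" not in single.disk

# One node, batch mode: the write is on disk before the node is killed.
batch_node = Node("batch")
batch_node.write("w1")
batch_node.kill()
kept_in_batch_mode = "w1" in batch_node.disk

# Three replicas, periodic mode: one is killed, the others sync later.
replicas = [Node("periodic") for _ in range(3)]
for replica in replicas:
    replica.write("w1")
replicas[0].kill()
for replica in replicas[1:]:
    replica.sync()
copies_left = sum("w1" in replica.disk for replica in replicas)
```

In this model the single periodic node loses the write, while the replicated
setup still holds two copies after losing a node in the same window.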


--
Sylvain



>> Consistency
>> If you want 100% consistency, use consistency level QUORUM for both reads
>> and writes and EACH_QUORUM in a multi-dc scenario.
>> Refs: http://wiki.apache.org/cassandra/ArchitectureOverview
>
> For the limited definition of consistency it provides, yes. One thing
> to be aware of is that *failed* writes at QUORUM followed by
> *succeeding* reads at QUORUM may have readers see inconsistent results
> across requests (see
> https://issues.apache.org/jira/browse/CASSANDRA-2494 although I still
> think it's a designed-for behavior rather than a bug). And of course
> the usual bits about concurrent updates and updates spanning multiple
> rows.
>
> I'm just a bit hesitant to agree to the term "100% consistency" since
> it sounds very all-encompassing :)
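
The failed-write caveat can be sketched in a few lines of Python. This is a
simplified model with assumptions of its own: three replicas, a write that
reached only one of them before failing at QUORUM, reads that return the cell
with the highest timestamp among the replicas contacted, and no read repair.
The `replicas` list and `quorum_read` are made-up names.

```python
from itertools import combinations

# (timestamp, value) held by each replica. The write with timestamp 2 failed
# at QUORUM but still reached replica 0.
replicas = [(2, "new"), (1, "old"), (1, "old")]


def quorum_read(replica_ids):
    """Return the value with the highest timestamp among the contacted replicas."""
    return max(replicas[i] for i in replica_ids)[1]


# Every possible pair of replicas a QUORUM read (2 of 3) might contact.
results = {ids: quorum_read(ids) for ids in combinations(range(3), 2)}
```

Two of the three possible quorums return "new" and one returns "old", so
successive successful reads can disagree in this model.
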
>
>> Isolation
>> NOTHING is isolated; because there is no transaction support in the first
>> place.  This means that two or more clients can update the same row at the
>> same time.  Their updates of the same or different columns may be
>> interleaved and leave the row in a state that may not make sense depending
>> on your application.  Note: this doesn't mean to say that two updates of the
>> same column will be corrupted, obviously; columns are the smallest atomic
>> unit ('atomic' in the more general thread-safe context).
>> Refs: None that directly address this explicitly and clearly and in one
>> place.
>
> Yes but the relevant lack of isolation is for reads. Due to
> Cassandra's conflict resolution model, given two updates with certain
> timestamps associated with them, the actual timing of the writes will
> not change the eventual result in the data (absent read-before-write
> logic operating on that data concurrently).
>
> The lack of isolation is thus mostly of concern to readers.
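
The point about timestamps can be checked with a small Python sketch. It is a
toy version of timestamp-based resolution, not Cassandra's implementation:
`resolve` is a hypothetical helper, cells are plain `(timestamp, value)`
tuples, and ties are broken by comparing values only because that is what
tuple comparison does here.

```python
from functools import reduce
from itertools import permutations


def resolve(a, b):
    """Keep the cell with the higher timestamp (tuples compare timestamp first)."""
    return max(a, b)


# Three updates to the same column, each carrying its own timestamp.
updates = [(10, "a"), (30, "c"), (20, "b")]

# Apply the updates in every possible arrival order and collect the outcomes.
finals = {reduce(resolve, order) for order in permutations(updates)}
```

Every arrival order ends with the same cell, the one with the highest
timestamp, which is why write timing does not change the eventual result.
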
>
>> Durability
>> Updates are made durable by the use of the commit log.  No worries here.
>
> But be careful about choosing batch commit log sync instead of
> periodic if single-node durability or post-quorum-write durability is
> a concern.
>
> --
> / Peter Schuller
>
