incubator-cassandra-user mailing list archives

From Alex Yiu <bigcontentf...@gmail.com>
Subject Re: Understanding atomicity in Cassandra
Date Tue, 20 Jul 2010 23:03:53 GMT
Hi, Patricio,

It's hard to comment on your original questions without knowing the details
of your domain-specific data model and your data processing expectations.

W.R.T. lumping things into one big row, there is a limitation in the
Cassandra data model. You have CFs (column families) and SCFs (super column
families). That is, you have at most 2 levels of nesting for an atomic value
update, i.e. you cannot lump arbitrarily complex data into a single big row.

Even though the update of one particular row is atomic, you can still run
into concurrent read-write operations that conflict with each other.

For example, suppose one of your column values is a list of values.
The old value is: "a, b, c"
The operation is: you want to add "d" to that list.
The desired new value is: "a, b, c, d"
If another concurrent operation tries to add "e" to the list, you still have
a problem, given the present atomic semantics of row updates in Cassandra:
both clients read "a, b, c", each writes back its own version of the list,
and one of the two additions is lost.
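
To make the conflict concrete, here is a minimal Python sketch. It uses a
plain dict as a stand-in for one row (it is a toy model, not a Cassandra
client API), and the column name "tags" and both helper functions are made
up for illustration. The second half shows one possible workaround that is
not part of the discussion above: store each list element as its own column,
so that two concurrent additions touch different columns.

```python
# In-memory stand-in for one row: column name -> column value.
# This is a toy model for illustration, not a Cassandra client API.

def add_by_rewriting_list(row, snapshot, item):
    """Read-modify-write: overwrite the whole list column from a snapshot."""
    row["tags"] = snapshot + ", " + item

def add_as_own_column(row, item):
    """Store each list element as its own column (name = element)."""
    row[item] = ""

# Case 1: both writers read "a, b, c" before either one writes.
row = {"tags": "a, b, c"}
seen_by_writer_1 = row["tags"]
seen_by_writer_2 = row["tags"]
add_by_rewriting_list(row, seen_by_writer_1, "d")
add_by_rewriting_list(row, seen_by_writer_2, "e")  # overwrites writer 1
lost_update_result = row["tags"]  # "a, b, c, e": the "d" is gone

# Case 2: one column per element; the two additions do not collide.
row2 = {"a": "", "b": "", "c": ""}
add_as_own_column(row2, "d")
add_as_own_column(row2, "e")
column_result = sorted(row2)  # all five elements survive
```

Each individual write in case 1 is atomic, yet the final state is still
wrong, because the conflict is between the read and the later write.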

On the other hand, there are a number of application scenarios where update
operations can safely be treated as idempotent, e.g. bulk loading data from
flat files into Cassandra.

If your main worry is the client process crashing, then regardless of which
ACID properties Cassandra can provide, you still want a way to verify
whether Cassandra has stored the desired state, and/or to log the update
operations already processed in the context of bulk loading. Then you can
decide whether a particular data update needs to be repeated or not. A
full-fledged ACID database ("all or nothing" semantics) can reduce the
complexity of verifying that the data was stored successfully. But it cannot
remove that concern completely. Consider the case where the client process
crashes right at the moment of "dbConn.commit()". You still don't know for
sure whether that update operation has gone through.
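
As a rough sketch of the "verify, then repeat if needed" idea for an
idempotent bulk load: the FakeStore class and the load/resume functions
below are hypothetical and purely in-memory; they only illustrate the
control flow and are not any real Cassandra or JDBC API.

```python
class FakeStore:
    """Hypothetical in-memory key-value store standing in for the database."""

    def __init__(self):
        self.data = {}

    def put(self, key, value):
        # Writing the same key/value twice leaves the same state (idempotent).
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)


def load(store, records, crash_after=None):
    """Write records in order; optionally simulate a crash after N writes."""
    for i, (key, value) in enumerate(records):
        if crash_after is not None and i == crash_after:
            raise RuntimeError("simulated client crash")
        store.put(key, value)


def resume(store, records):
    """Read each record back and rewrite only those not stored as expected."""
    repeated = []
    for key, value in records:
        if store.get(key) != value:
            store.put(key, value)
            repeated.append(key)
    return repeated


records = [("k1", "v1"), ("k2", "v2"), ("k3", "v3")]
store = FakeStore()
try:
    load(store, records, crash_after=2)  # crashes before writing k3
except RuntimeError:
    pass  # the restarted client does not know how far the load got

repeated = resume(store, records)  # only k3 has to be written again
```

Because each put is idempotent, a cruder recovery that simply replays every
record would reach the same final state; the read-back only saves writes.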

Hope this email helps.

Thanks!


Regards,
Alex Yiu



On Tue, Jul 20, 2010 at 2:03 PM, Jonathan Ellis <jbellis@gmail.com> wrote:

> 2010/7/20 Patricio Echagüe <patricioe@gmail.com>:
> > Would it be bad design to store all the data that need to be
> > consistent under one big key?
>
> That really depends how unnatural it is from a query perspective. :)
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>
