incubator-cassandra-user mailing list archives

From Samuel Carrière <>
Subject Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?
Date Mon, 22 Nov 2010 12:08:45 GMT
> Cassandra can work in a consistent way, see some of this discussion and
> the Consistency section here
> If you always read and write with CL.Quorum (or the other way discussed)
> you will have consistency. Even if some of the replicas are temporarily
> inconsistent, or off line or whatever. Your reads will be consistent,
> i.e. every client will get the same value or the read will not work. If
> you want to work at a lower or higher consistency you can.
> Eventually all replicas of a value will become consistent.
> There are a number of reasons why cassandra may not be a good fit, and I
> would guess something else would be a problem before the consistency model.
> Hope that helps.


I like Cassandra a lot and I'm sure it fits many use cases,
but I'm not sure we can say that we have strong consistency,
even if we read and write with CL.Quorum.

Firstly, we can only expect consistency at the column level. Reading
and writing with CL.Quorum gives you, most of the time,
a consistent value for each individual column, but that does not mean it
gives you a consistent view of your data.
(Because Cassandra gives you no isolation and no transactions, your
application has to deal with data inconsistencies.)
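
To make the first point concrete, here is a toy sketch in Python. It is
not Cassandra code: the row is a plain dict, and the column names
(balance_a, balance_b) and the helper functions are made up for
illustration. It only assumes that each column write is applied on its
own and that a reader may run between two writes.

```python
# Toy model (not Cassandra): a row is a dict of columns, and every
# column write is applied independently, with no isolation.
row = {"balance_a": 100, "balance_b": 0}


def transfer_steps(row, amount):
    """Move `amount` between two hypothetical columns, one write at a time.

    Yields after each single-column write so a reader can run in between.
    """
    row["balance_a"] -= amount
    yield "after first write"
    row["balance_b"] += amount
    yield "after second write"


def read_total(row):
    """A reader that expects balance_a + balance_b to stay constant."""
    return row["balance_a"] + row["balance_b"]


observed = []
for step in transfer_steps(row, 30):
    observed.append((step, read_total(row)))

for step, total in observed:
    print(step, "-> total =", total)
```

Each column holds a valid value at every step, but the reader that runs
between the two writes sees a total that never existed from the
application's point of view.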

Secondly, I may be wrong, but I'm not sure consistency at the column
level is guaranteed. Here is an example, with a replication
factor of 3.
Imagine that the current value of col1 is 11. Your application tries
to write "col1 = 12" with CL.Quorum.
Imagine the write arrives at node 1, but the new value is not
transmitted to nodes 2 and 3 because of network failures. So
the write fails (this is the expected behaviour), but node 1 still has
the new value (there is no rollback).

Then, imagine that the network is back to normal, and that another
client asks for the value of col1 with CL.Quorum. Here,
the value of the response is not guaranteed. If the client reads
from node 2 and node 3, the response will be 11, but
if it reads from node 1 and either node 2 or node 3, the response will be 12.
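
The scenario can be sketched as a small Python simulation. This is a toy
model under stated assumptions, not Cassandra's implementation or API:
the node names, the quorum_write and quorum_read helpers, and the integer
timestamps are all hypothetical. It assumes a quorum read returns the
value with the newest timestamp among the replicas it contacts, and it
ignores any repair mechanism.

```python
from itertools import combinations

# Toy model: each replica stores (value, timestamp) for col1.
replicas = {
    "node1": (11, 1),
    "node2": (11, 1),
    "node3": (11, 1),
}
QUORUM = 2  # replication factor 3 -> quorum of 2


def quorum_write(value, timestamp, reachable):
    """Apply the write on the reachable replicas only.

    Returns True only if a quorum of replicas accepted it. Replicas that
    did accept the write keep the new value either way (no rollback).
    """
    for node in reachable:
        replicas[node] = (value, timestamp)
    return len(reachable) >= QUORUM


def quorum_read(nodes):
    """Return the value with the newest timestamp among the contacted replicas."""
    newest = max((replicas[n] for n in nodes), key=lambda vt: vt[1])
    return newest[0]


# The write "col1 = 12" reaches node1 only, so it is reported as failed.
write_ok = quorum_write(12, 2, reachable=["node1"])
print("write succeeded:", write_ok)

# Network is back: try every pair of replicas a quorum read could use.
results = {
    pair: quorum_read(pair)
    for pair in combinations(sorted(replicas), QUORUM)
}
for pair, value in results.items():
    print(pair, "->", value)
```

In this model, the two pairs that include node1 return 12 and the pair
(node2, node3) returns 11, even though the write was reported as failed.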

Am I missing something?

