cassandra-user mailing list archives

From Sean Bridges <>
Subject Re: Consistency model
Date Sat, 16 Apr 2011 15:49:22 GMT
If you are reading and writing at quorum, then what you are seeing
shouldn't happen.  You shouldn't be able to read N+1 until N+1 has
been committed to a quorum of servers.  At this point you should not
be able to read N anymore, since there is no quorum that contains N.

Dan - I think you are right, except that quorum reads should be
consistent even during a quorum write.  You are not guaranteed to read
N+1 until *after* a successful quorum write of N+1, but once you see
N+1, you should never see N again, even if the write failed.
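The overlap argument above can be checked by brute force. This is only a sketch, assuming a majority quorum of floor(RF/2) + 1 replicas; the helper names quorum_size and all_quorums_overlap are made up for illustration and are not part of Cassandra or pycassa.

```python
from itertools import combinations


def quorum_size(replication_factor):
    # Assumed majority quorum: floor(RF / 2) + 1 replicas.
    return replication_factor // 2 + 1


def all_quorums_overlap(replication_factor):
    # Enumerate every possible quorum of replica indices and check that
    # each pair shares at least one replica.
    q = quorum_size(replication_factor)
    quorums = [set(c) for c in combinations(range(replication_factor), q)]
    return all(a & b for a in quorums for b in quorums)


if __name__ == "__main__":
    for rf in (2, 3, 5):
        print(rf, quorum_size(rf), all_quorums_overlap(rf))
```

Because every read quorum shares a replica with every write quorum, a read that follows a successful quorum write has to contact at least one replica holding the new value.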


On Fri, Apr 15, 2011 at 1:29 PM, Dan Hendry <> wrote:
> So Cassandra does not use an atomic commit protocol at the cluster level.
> Strong consistency on a quorum read is only guaranteed *after* a successful
> quorum write. The behaviour you are seeing is possible if you are reading in
> the middle of a write or the write failed (which should be reported to your
> code via an exception).
> Dan
> -----Original Message-----
> From: James Cipar []
> Sent: April-15-11 14:15
> To:
> Subject: Consistency model
> I've been experimenting with the consistency model of Cassandra, and I found
> something that seems a bit unexpected.  In my experiment, I have 2
> processes, a reader and a writer, each accessing a Cassandra cluster with a
> replication factor greater than 1.  In addition, sometimes I generate
> background traffic to simulate a busy cluster by uploading a large data file
> to another table.
> The writer executes a loop where it writes a single row that contains just
> a sequentially increasing sequence number and a timestamp.  In Python this
> looks something like:
>    while time.time() < start_time + duration:
>        target_server = random.sample(servers, 1)[0]
>        target_server = '%s:9160'%target_server
>        row = {'seqnum':str(seqnum), 'timestamp':str(time.time())}
>        seqnum += 1
>        # print 'uploading to server %s, %s'%(target_server, row)
>        pool = pycassa.connect('Keyspace1', [target_server])
>        cf = pycassa.ColumnFamily(pool, 'Standard1')
>        cf.insert('foo', row, write_consistency_level=consistency_level)
>        pool.dispose()
>        if sleeptime > 0.0:
>            time.sleep(sleeptime)
> The reader simply executes a loop reading this row and reporting whenever a
> sequence number is *less* than the previous sequence number.  As expected,
> with consistency_level=ConsistencyLevel.ONE there are many inconsistencies,
> especially with a high replication factor.
> What is unexpected is that I still detect inconsistencies when it is set at
> ConsistencyLevel.QUORUM.  This is unexpected because the documentation seems
> to imply that QUORUM will give consistent results.  With background traffic
> the average difference in timestamps was 0.6s, and the maximum was >3.5s.
> This means that a client sees a version of the row, and can subsequently see
> another version of the row that is 3.5s older than the previous.
> What I imagine is happening is this, but I'd like someone who knows what
> they're talking about to tell me if it's actually the case:
> I think Cassandra is not using an atomic commit protocol to commit to the
> quorum of servers chosen when the write is made.  This means that at some
> point in the middle of the write, some subset of the quorum have seen the
> write, while others have not.  At this time, there is a quorum of servers
> that have not seen the update, so depending on which quorum the client reads
> from, it may or may not see the update.
> Of course, I understand that the client is not *choosing* a bad quorum to
> read from, it is just the first `q` servers to respond, but in this case it
> is effectively random and sometimes a bad quorum is "chosen".
> Does anyone have any other insight into what is going on here?
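
The mid-write scenario hypothesised above can be sketched as a toy model. It assumes 3 replicas, a read quorum of 2, and a write of sequence number 8 that has so far reached only one replica. It deliberately leaves out read repair and every other mechanism a real cluster may apply, so it illustrates the hypothesis rather than Cassandra's actual behaviour. The names quorum_read, replicas and results are hypothetical.

```python
from itertools import combinations


def quorum_read(replicas, contacted):
    # A read returns the newest value among the replicas it contacted.
    # Here the sequence number doubles as the version.
    return max(replicas[i] for i in contacted)


# Assumed state in the middle of a write: seqnum 8 has reached
# replica 0 only, while replicas 1 and 2 still hold seqnum 7.
replicas = [8, 7, 7]

# Value returned by each possible read quorum of size 2.
results = {c: quorum_read(replicas, c) for c in combinations(range(3), 2)}

if __name__ == "__main__":
    for contacted, value in sorted(results.items()):
        print(contacted, value)
```

In this model a read that contacts replica 0 returns the new value and a read that contacts only replicas 1 and 2 returns the old one, so two consecutive reads can appear to go backwards while the write is in flight.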
