cassandra-commits mailing list archives

From "Peter Schuller (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-2494) Quorum reads are not consistent
Date Fri, 22 Apr 2011 08:28:05 GMT


Peter Schuller commented on CASSANDRA-2494:

I don't think anyone is claiming otherwise, unless I'm misunderstanding. The problem is that
while the "if successfully written to quorum, subsequent quorum reads will see it" guarantee
is indeed maintained, it is possible for quorum reads to see data go backwards (on a timeline)
in the event of a *failed* attempted quorum write. This includes the possibility of reads
seeing data that then permanently vanishes, even though you only lost, say, 1 node, which you
designed your cluster to survive (RF >= 3, QUORUM). ("lost 1 node" can be substituted
with "killed 1 node in periodic commit mode".)
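
To make the scenario concrete, here is a minimal toy model in Python. It is not Cassandra code: the replicas are plain dicts, `quorum_read` is a hypothetical helper that reconciles by highest timestamp, and read repair is deliberately left out (the assumption being that the node is lost before the value spreads).

```python
# Toy model: each replica maps key -> (timestamp, value). Not Cassandra code.
def quorum_read(replicas, key):
    """Reconcile a key across the given replicas: highest timestamp wins."""
    versions = [r[key] for r in replicas if key in r]
    return max(versions)[1] if versions else None

x, y, z = {}, {}, {}

# A write that reaches all three replicas (succeeds at QUORUM).
for replica in (x, y, z):
    replica["k"] = (1, "old")

# A QUORUM write that fails: it reaches only x before the coordinator gives up.
x["k"] = (2, "new")

first = quorum_read([x, y], "k")   # a quorum that happens to include x
# x is lost here, before the value spreads (read repair is not modeled).
second = quorum_read([y, z], "k")  # a quorum of the surviving replicas

print(first, second)  # new old
```

The first read returns the value from the failed write and the second returns the older one, which is the "going backwards" described above.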

I still don't think this is a violation of what was promised, but I can see how making the
further guarantee would make for more useful consistency semantics in some cases.

With respect to implicit write: An alternative is to adjust the reconciliation logic, when applied
as part of reads (as opposed to AES, hinted hand-off, writes), to take consistency level into
account and only consider columns whose timestamp is >= the greatest timestamp that has
quorum (off the top of my head I think that should be correct in all cases, but I haven't
thought this through carefully).
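
As a rough illustration of what "the greatest timestamp that has quorum" could mean, here is one possible reading in Python. Everything in it is an assumption: `quorum_confirmed_read` is a hypothetical helper, a timestamp is taken to "have quorum" only when at least a quorum of replicas report exactly that timestamp, and how columns newer than that timestamp should be treated is left open.

```python
from collections import Counter

def quorum_confirmed_read(responses, replication_factor):
    """Return (timestamp, value) for the newest version held by a quorum.

    responses: one (timestamp, value) pair per replica that answered.
    Hypothetical helper; "has quorum" is assumed to mean that at least a
    quorum of replicas reported exactly that timestamp.
    """
    quorum = replication_factor // 2 + 1
    counts = Counter(ts for ts, _ in responses)
    confirmed = [ts for ts, n in counts.items() if n >= quorum]
    if not confirmed:
        return None  # no version is known to be on a quorum
    best = max(confirmed)
    return next((ts, v) for ts, v in responses if ts == best)

# x holds a write that failed at QUORUM; y and z hold the last successful one.
print(quorum_confirmed_read([(2, "new"), (1, "old"), (1, "old")], 3))  # (1, 'old')
```

Note that under this reading the read has to hear from more than a bare quorum of replicas to confirm anything when replicas disagree; with only two differing responses at RF 3 the helper returns `None`.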

> Quorum reads are not consistent
> -------------------------------
>                 Key: CASSANDRA-2494
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sean Bridges
> As discussed in this thread,
> Quorum reads should be consistent.  Assume we have a cluster of 3 nodes (X,Y,Z) and a
replication factor of 3. If a write of N is committed to X, but not Y and Z, then a read from
X should not return N unless the read is committed to at least two nodes.  To ensure this,
a read from X should wait for an ack of the read repair write from either Y or Z before returning.
> Are there system tests for cassandra?  If so, there should be a test similar to the original
post in the email thread.  One thread should write 1,2,3... at consistency level ONE.  Another
thread should read at consistency level QUORUM from a random host, and verify that each read
is >= the last read.
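
A sketch of what such a test harness might look like, in Python. The `write_one` and `read_quorum` callables are hypothetical stand-ins for a client writing at consistency level ONE and reading at QUORUM; the in-memory store at the bottom only shows the harness running and says nothing about Cassandra's behavior.

```python
import threading

def run_monotonic_read_test(write_one, read_quorum, n_writes=100):
    """Write 1..n_writes in one thread while another checks reads never decrease.

    write_one(value) and read_quorum() are hypothetical callables supplied by
    the caller. Returns a list of (previous, current) pairs for every read
    that went backwards.
    """
    violations = []
    done = threading.Event()

    def writer():
        for value in range(1, n_writes + 1):
            write_one(value)
        done.set()

    def reader():
        last = 0
        while True:
            finished = done.is_set()  # sampled first, so one read follows the last write
            current = read_quorum()
            if current is not None:
                if current < last:
                    violations.append((last, current))
                last = current
            if finished:
                break

    threads = [threading.Thread(target=writer), threading.Thread(target=reader)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return violations

# In-memory stand-in for a cluster, only to show the harness running.
store = {"value": None}
lock = threading.Lock()

def write_one(value):
    with lock:
        store["value"] = value

def read_quorum():
    with lock:
        return store["value"]

violations = run_monotonic_read_test(write_one, read_quorum)
print(violations)  # []
```

Against a real cluster the two callables would wrap a client library, and the reader would pick a random host for each read as the description suggests.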

This message is automatically generated by JIRA.
For more information on JIRA, see:
