incubator-cassandra-user mailing list archives

From "Jeremiah Jordan" <JEREMIAH.JOR...@morningstar.com>
Subject RE: Write everywhere, read anywhere
Date Thu, 04 Aug 2011 17:25:52 GMT
If you have RF=3, quorum won't fail with one node down, so quorum reads and writes will be
consistent in the case of one node down.  If two nodes go down at the same time, you can get
inconsistent data from quorum write/read: the write reaches only the node that stayed up and
fails with TimeOut, the nodes come back up, and then one read asks the two nodes that were down
what the value is while another read asks the node that was up and a node that was down.
Those two reads will get different answers.
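
To make the arithmetic concrete, here is a small sketch of the quorum math. It is not Cassandra code; the function names are made up for illustration, and it only assumes the usual definition of quorum as floor(RF / 2) + 1.

```python
def quorum(rf: int) -> int:
    """Replicas that must respond for a QUORUM request: floor(rf / 2) + 1."""
    return rf // 2 + 1


def quorum_available(rf: int, nodes_down: int) -> bool:
    """True if enough replicas are still alive to satisfy QUORUM."""
    return rf - nodes_down >= quorum(rf)


def reads_overlap_writes(rf: int, read_replicas: int, write_replicas: int) -> bool:
    """True if every read set must share at least one replica with every
    successful write set (R + W > RF)."""
    return read_replicas + write_replicas > rf


rf = 3
print(quorum(rf))                       # quorum size for RF=3
print(quorum_available(rf, 1))          # one node down
print(quorum_available(rf, 2))          # two nodes down
print(reads_overlap_writes(rf, 2, 2))   # QUORUM write + QUORUM read
print(reads_overlap_writes(rf, 1, 3))   # CL.ALL write + CL.ONE read
```

The overlap guarantee only covers writes that succeeded. A write that timed out after reaching fewer than the required replicas is outside it, which is the case described above.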

 

From: Mike Malone [mailto:mike@simplegeo.com] 
Sent: Thursday, August 04, 2011 12:16 PM
To: user@cassandra.apache.org
Subject: Re: Write everywhere, read anywhere

 

 

2011/8/3 Patricio Echagüe <patricioe@gmail.com>

 

On Wed, Aug 3, 2011 at 4:00 PM, Philippe <watcherfr@gmail.com> wrote:

Hello,

I have a 3-node, RF=3, cluster configured to write at CL.ALL and read at CL.ONE. When I take
one of the nodes down, writes fail which is what I expect.

When I run a repair, I see data being streamed from those column families... that I didn't
expect. How can the nodes diverge ? Does this mean that reading at CL.ONE may return inconsistent
data ?

 

We abort the mutation beforehand when there are not enough replicas alive. If a mutation went
through and a replica goes down in the middle of it, you can write to some nodes
and the request will time out.

In that case a read at CL.ONE may return inconsistent data.

 

Doesn't CL.QUORUM suffer from the same problem? There's no isolation or rollback with CL.QUORUM
either. So if I do a quorum write with RF=3 and it fails after hitting a single node, a subsequent
quorum read could return the old data (if it hits the two nodes that didn't receive the write)
or the new data that failed mid-write (if it hits the node that did receive the write).

 

Basically, the scenarios where CL.ALL + CL.ONE results in a read of inconsistent data could
also cause a CL.QUORUM write followed by a CL.QUORUM read to return inconsistent data. Right?
The problem (if there is one) is that even in the quorum case columns with the most recent
timestamp win during repair resolution, not columns that have quorum consensus.
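
A toy simulation of that scenario, as a sketch only: three replicas are modelled as a dict of (timestamp, value) pairs, and a read returns the value with the highest timestamp among the replicas it contacts. The names (`partial_write`, `quorum_read`, replicas "A"/"B"/"C") are hypothetical, and the model deliberately leaves out read repair and hinted handoff.

```python
from itertools import combinations

# Each replica holds (timestamp, value) for a single column.
replicas = {"A": (1, "old"), "B": (1, "old"), "C": (1, "old")}


def partial_write(replicas, reached, timestamp, value):
    """Apply a write only to the replicas it reached. Nothing is rolled back
    if the coordinator later reports a timeout to the client."""
    for name in reached:
        replicas[name] = (timestamp, value)


def quorum_read(replicas, contacted):
    """Resolve a read by picking the highest timestamp among the contacted
    replicas (most recent timestamp wins, no consensus check)."""
    return max(replicas[name] for name in contacted)[1]


# A quorum write that hit only replica A before failing.
partial_write(replicas, reached=["A"], timestamp=2, value="new")

# Every possible pair of replicas a quorum read (2 of 3) could contact.
results = {
    pair: quorum_read(replicas, pair)
    for pair in combinations(sorted(replicas), 2)
}
for pair, value in results.items():
    print(pair, value)
```

Reads that include replica A see the value from the failed write, while the read that contacts only B and C still sees the old one.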

 

Mike

