I'm using .NET and I wrote my own client library (over Thrift) so I'm absolutely sure that both operations are performed using the same connection.
I can work around the current issue in the application, but I'm sure I won't be able to handle every future case that way.

So the suggestion is to use at least 3 nodes with RF=3 and CL.QUORUM for both writes and reads where high consistency is required, right?


2011/2/12 Dan Hendry <dan.hendry.junk@gmail.com>

Are you using a higher level client (hector/pelops/pycassa/etc) or the actual Thrift API? Higher level clients often pool connections, so two consecutive operations (write then read) may be performed over connections to different nodes.


If you are sure you are using the same connection (the actual Thrift API), there is a possible race condition. To the best of my understanding, here is how a write happens at CL ONE in your case:

- You make a request to node A, which initiates a write to nodes A and B

- The server reports success as soon as the write to node A OR B is complete (can somebody else confirm?)


Typically the write to A will complete sooner, since that is the node you are connected to and there is additional network overhead in initiating the write on node B. I suppose a 1:1000 chance of B completing first is possible, particularly if all nodes and the client are on the same network (or same machine) with very low latencies. In that case the write is acknowledged before A has applied it, and an immediate read served by A can miss the row.
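
To make the race above concrete, here is a toy model in Python. It is only a sketch of my description, not Cassandra's actual code: the function name, the latency numbers and the assumption that the coordinator acknowledges as soon as the fastest replica is done are all illustrative.

```python
def read_sees_write(apply_times_ms, read_replica, read_delay_ms=0.0):
    """Toy model of a CL.ONE write followed by a CL.ONE read.

    apply_times_ms: replica name -> time (ms) at which that replica has applied
                    the write (hypothetical numbers, not measurements).
    read_replica:   the replica that serves the read.
    read_delay_ms:  how long after the write ack the read reaches that replica.
    """
    # Assumption: at CL.ONE the client is acknowledged once the fastest replica is done.
    ack_time = min(apply_times_ms.values())
    read_time = ack_time + read_delay_ms
    # The read sees the row only if the serving replica applied the write by then.
    return apply_times_ms[read_replica] <= read_time


# Usual case: the local write on A finishes first, so a read served by A sees the row.
print(read_sees_write({"A": 0.3, "B": 0.9}, read_replica="A"))  # True
# Rare case: B finishes first, the ack goes out, and A has not applied the write yet.
print(read_sees_write({"A": 0.9, "B": 0.3}, read_replica="A"))  # False
```

The second call is the 1-in-1000 situation: the write is reported as successful, yet the node that serves the read does not have the row yet.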


Cassandra allows you to explicitly specify the trade-off between consistency and availability. When you read and write at ONE with RF=2, consistency is not guaranteed but high availability is (you can lose a node and continue to operate). If you require strong consistency with RF=2, you will have to either read or write at consistency level ALL. My suggestion is to either design your application to tolerate inconsistency (if possible) or move to RF=3 with QUORUM reads and QUORUM writes.
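
The rule of thumb behind this suggestion is that a read is guaranteed to see the latest acknowledged write when the replicas contacted for the write and for the read must overlap, i.e. W + R > RF. A small Python sketch of that arithmetic follows; the helper names are made up for illustration, and only the levels ONE, QUORUM and ALL are covered.

```python
def replicas_required(level, rf):
    """Number of replicas that must respond for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return rf // 2 + 1  # majority of the replicas
    if level == "ALL":
        return rf
    raise ValueError("unsupported consistency level: " + level)


def read_sees_latest_write(write_cl, read_cl, rf):
    """True when the write set and the read set must share at least one replica."""
    return replicas_required(write_cl, rf) + replicas_required(read_cl, rf) > rf


print(read_sees_latest_write("ONE", "ONE", rf=2))        # False: your current setup
print(read_sees_latest_write("ONE", "ALL", rf=2))        # True
print(read_sees_latest_write("QUORUM", "QUORUM", rf=3))  # True
print(replicas_required("QUORUM", 2))                    # 2
```

Note the last line: with RF=2 a quorum is 2 replicas, the same as ALL, so you cannot lose a node and still get QUORUM. With RF=3 a quorum is still 2, which is why three nodes give you both overlap and tolerance of one node being down.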




From: Michal Augustýn [mailto:augustyn.michal@gmail.com]
Sent: February-12-11 4:13
To: user@cassandra.apache.org
Subject: per-connection "read-after-my-write" consistency




I'm running 2 nodes with RF=2 (not optimal, I know), Cassandra 0.7.1.


Over a single connection, I write (CL.ONE) a row and then read (CL.ONE) the same row (via Thrift).

I assumed that if I write a row to one node, then I can immediately read that row from the same node.

That seems to be true in most cases, but roughly 1 in 1000 attempts doesn't work as expected - I get no row :(


Where is the problem, please? Should I use another CL for the read and/or the write? I would just like to achieve "per-connection read-after-my-write consistency".


Thank you very much!



