incubator-cassandra-user mailing list archives

From Jeremiah Jordan <jeremiah.jor...@morningstar.com>
Subject Re: Newbie question about writer/reader consistency
Date Fri, 30 Dec 2011 04:15:23 GMT
So you can do this with Cassandra, but you need more logic in your code.  Basically, you get
the last safe number, M, then read N..M; if there are any gaps, you retry reading those
numbers.  As long as you are not overwriting data, and you only update the last safe number
after a successful write to Cassandra, you can do this.  We currently do something very similar
to this for some of our data.
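
The read loop described above can be sketched roughly as follows. This is an in-memory illustration only, not actual Cassandra client code: `read_without_gaps`, `fetch`, `get_last_safe` and `FlakyStore` are hypothetical names, and the sketch assumes SCNs are consecutive integers so that a gap can be detected.

```python
from typing import Callable, Dict, List, Tuple


def read_without_gaps(
    fetch: Callable[[int, int], Dict[int, str]],
    get_last_safe: Callable[[], int],
    start: int,
    max_retries: int = 5,
) -> List[Tuple[int, str]]:
    """Return [(scn, data), ...] for start..last_safe, re-reading any gaps."""
    last_safe = get_last_safe()            # M: read the marker first
    rows = dict(fetch(start, last_safe))   # first attempt at N..M
    for attempt in range(max_retries + 1):
        # Assumption: SCNs are dense, so every number in N..M must be present.
        missing = [scn for scn in range(start, last_safe + 1) if scn not in rows]
        if not missing:
            return sorted(rows.items())
        if attempt < max_retries:
            # Re-read only the span that covers the gaps.
            rows.update(fetch(min(missing), max(missing)))
    raise RuntimeError(f"still missing SCNs after retries: {missing}")


class FlakyStore:
    """Stand-in for a store whose first read does not yet show some rows."""

    def __init__(self, data: Dict[int, str], last_safe: int, hide_once: set):
        self.data = data
        self.last_safe = last_safe
        self.hidden = set(hide_once)

    def get_last_safe(self) -> int:
        return self.last_safe

    def fetch(self, lo: int, hi: int) -> Dict[int, str]:
        visible = {k: v for k, v in self.data.items()
                   if lo <= k <= hi and k not in self.hidden}
        self.hidden.clear()  # hidden rows become visible on the next read
        return visible
```

The writer side is not shown; the sketch relies on the two conditions stated above (rows are never overwritten, and the last safe number is only advanced after the rows have been written successfully).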

-Jeremiah


On Dec 26, 2011, at 12:38 PM, Vladimir Mosgalin wrote:

> Hello everybody.
> 
> I am a developer of a finance-related application, and I'm currently evaluating
> various NoSQL databases for our current goal: storing various views which show
> the state of the system in different aspects after each transaction.
> 
> The write load seems to be bigger than a typical SQL database would handle
> without problems: under a test load of tens of transactions per second, each
> transaction generates changes in a dozen views, which produces hundreds of
> messages per second in total. Each message ("change") for each view must be
> stored, as well as the resulting view (generated as a kind of update of the old
> view); this means multiple inserts and updates per message, which go as a single
> transaction. So I started to look into NoSQL databases. I'm a bit puzzled by the
> guarantees of atomicity and isolation that Cassandra provides, so my question is
> about how to attain (if possible at all) the required level of consistency in
> Cassandra. I've read various documents and introductions to Cassandra's data
> model but still can't understand the basics of data consistency.  This discussion
> http://stackoverflow.com/questions/6033888/cassandra-atomicity-isolation-of-column-updates-on-a-single-row-on-on-single-n
> makes me feel disappointed about consistency in Cassandra, but I wonder if
> there is a way to work around it.
> 
> The requirements are like this. There is one writer, which modifies two
> "tables" (I'm sorry for using "SQL" terms, I just don't want to create
> more confusion by mapping them into Cassandra terms at this stage). For
> the first table, it's a simple insert; the index is a unique SCN which is
> guaranteed to be larger than the previous one.
> 
> Let's say it inserts
> SCN DATA
> 1   AAA
> 2   BBB
> 3   CCC
> 
> The goal for the client (reader) is to get all the data from SCN N to SCN M
> without gaps. It is fine if it can't see the very latest SCN yet, that is, gets
> "1:AAA" and "2:BBB" on request "SCN: 1..END"; what is NOT fine is to get
> something like "1:AAA" and "3:CCC". In other words, does Cassandra provide
> consistency between writer and reader regarding the order of changes? Or under
> some conditions (say, very fast writes, but always from a single writer, and
> many concurrent reads) might it be possible to get that kind of gap?
> 
> The second question is similar, but on a bigger scale. The second table must be
> modified in a more complicated way; both inserts and updates of old data are
> required. Sometimes it's a few inserts and a few updates, which must be done
> atomically: under no conditions should a reader be able to see the mid-state of
> these inserts/updates. Fortunately, all these new changes will have a new key
> (new SCNs), so if it were possible to use a column in a separate table
> which stores the "last safe SCN" it would work, but I have no faith that Cassandra
> offers such a level of consistency. For example, let's say it works like this:
> 
> current last safe SCN: 1000
> 
> update (must be viewed as an atomic "transaction"):
> SCN   DATA
> 1001  AAA
> 1002  BBB
> 800   1001
> 1003  DDD
> 
> new last safe SCN: 1003
> 
> Here, readers need a means to filter out lines with SCN>1000 until the writer is
> done writing the "1003:DDD" line. They also need to filter out the "800:1001" line
> because it references an SCN which is after the current "last safe" one.
> 
> The "last safe SCN" is stored somewhere, and for this pattern to work I once again
> need "execution order" consistency: no reader should ever see the "last safe:
> 1003" line before all the previous lines were committed; and any reader who saw
> the "last safe: 1003" line must be able to see all the lines from that update just
> like they are right now.
> 
> Is this possible to do in Cassandra?
> 
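
The reader-side filtering asked about in the quoted message can be illustrated with a small sketch. It only models the filtering rule (hide rows whose SCN, or whose referenced SCN, is beyond the last safe SCN) on in-memory data; `Row`, `ref` and `visible_rows` are hypothetical names, and nothing here speaks to what Cassandra itself guarantees.

```python
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    scn: int
    data: str
    ref: Optional[int] = None  # SCN this row refers to, if any (assumed field)


def visible_rows(rows: List[Row], last_safe: int) -> List[Row]:
    """Keep only rows whose own SCN and referenced SCN are <= last_safe."""
    return [
        r for r in rows
        if r.scn <= last_safe and (r.ref is None or r.ref <= last_safe)
    ]


# The example update from the message: three inserts plus row 800,
# which now refers to SCN 1001.
batch = [
    Row(1001, "AAA"),
    Row(1002, "BBB"),
    Row(800, "1001", ref=1001),
    Row(1003, "DDD"),
]
```

With the marker still at 1000, `visible_rows(batch, 1000)` hides the whole update, including the "800:1001" row; once the marker is moved to 1003, all four rows pass the filter.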

