incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From AJ ...@dude.podzone.net>
Subject Re: No Transactions: An Example
Date Wed, 29 Jun 2011 22:16:04 GMT

On 6/22/2011 9:18 AM, Trevor Smith wrote:
> Right -- that's the part that I am more interested in fleshing out in 
> this post.
>

Here is one way.  Use MVCC 
<http://en.wikipedia.org/wiki/Multiversion_concurrency_control>.  A 
single global clean-up process would be acceptable since it's not a 
single point of failure, only a single point of accumulating back-logged 
work and will not affect availability as long as you are notified if 
that process terminates and restart it in a reasonable amount of time 
but this will not affect the validity of subsequent reads.

So, you would have a "balance" column.  And each update will create a 
"balance_<timestamp>" with a positive or negative value indicating a 
credit or debit.  Subsequent clients will read the latest value by doing 
a slice from "balance" to "balance_~" (i.e. all "balance*" columns).  
(You would have to work-out your column naming conventions so that your 
slices return only the pertinent columns.)  Then, the clients would have 
to apply all the credits and debits to the balance to get the current 
balance.

This handles the lost update problem.

For the dirty read and incorrect summary problems by others reading data 
that is in the middle of a transaction that hasn't committed yet, I 
would add a final transaction column to a Transactions CF.  The key 
would be <cf>.<key>.<column>, e.g., Accounts.1234.balance, 1234 being 
the account # and Accounts being the CF owning the balance column.  
Then, a new column would be added for each successful transaction (e.g., 
after debiting and crediting the two accounts) using the same timestamp 
used in balance_<timestamp>.  So, now, a client wanting the current 
balance would have to do a slice for all of the transactions for that 
column and only apply the balance updates up to the latest transaction.  
Note, you might have to do something else with the transaction naming 
schemes to make sure they are guaranteed to be unique, but you get the 
idea.  If the transaction fails, the client simply does not add a 
transaction column to Transactions and deletes any "balance_<timestamp>" 
columns it added to in the Accounts CF (or let's the clean-up process do 
it... carefully).

This should avoid the need for locks and as long as each account doesn't 
have a crazy amount of updates, the slices shouldn't be so large as to 
be a significant perf hit.

A note about the updates.  You have to make sure the clean-up process 
processes the updates in order and only 1 time.  If you can't guarantee 
these, then you'll have to make sure your updates are idempotent and 
commutative.

Oh yeah, and you must use QUORUM read/writes, of course.

Any critiques?

aj

Mime
View raw message