cassandra-user mailing list archives

From Oleg Anastasyev <olega...@gmail.com>
Subject Re: Usage Pattern : "unique" value of a key.
Date Fri, 14 Jan 2011 09:21:27 GMT
> 
> You're right when you say it's unlikely that 2 threads have the same
> timestamp, but it can happen. So it could work for user creation, but
> maybe not for a more write-intensive problem.

Um, sorry, I thought you were solving the exact case of duplicate user creation. 
If you're trying to solve concurrent updates to Cassandra in general, consider 
using ZooKeeper. By the way, the lock algorithm in ZooKeeper is very much like 
the one you described, but ZooKeeper is the right tool for this job.

> 
> Moreover, we cannot rely on fully time-synchronized nodes in the
> cluster (only on nodes synchronized to within a few ms), so a second
> node could theoretically write a smaller timestamp after the first node.

This is not a problem: that node will simply lose the race. Cassandra will 
ignore updates whose timestamp is older than the timestamp of the current value.

> An even worse case could be the one illustrated here
> (http://noisette.ch/cassandra/cassandra_unique_key_pattern.png) :
> nodes are synchronized, but something goes wrong (slow) during the
> write, then both nodes think the key belongs to them.
> So my idea of writing a lock is not well suitable

I'd suggest the following modification: when either user performs 
write{K, lock A}, it passes the timestamp recorded earlier, at the moment of 
performing the very first read of K.

So the scenario for user A is:
1. record the current timestamp from the machine clock -> T1
2. read K; K does not exist
3. write{K, lock A, timestamp = T1}
3.1 Cassandra sees no current value for K in the memtable -> the write succeeds. 
Cassandra records the timestamp of the value K,A as T1
4. read K and check that the lock is A (in your original solution) or that the 
returned timestamp == T1 (in the one I propose)

Then the scenario for user B would be:
1. record the current timestamp from the machine clock. Its value is T0, which 
is < T1.
2. read K; K does not exist
3. slowness on: pause for a couple of (milli)seconds, GCing or drinking 
coffee, so user A executes its scenario above
4. slowness off: write{K, lock B, timestamp = T0}
4.1 on the Cassandra side, this write will be ignored, because the current 
timestamp of K is T1, which is later than T0
5. read K, see lock == A instead of B (in your original solution) or timestamp 
!= T0 (in mine).
6. user B understands that it has lost the race -> do something about it.
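
The two scenarios above can be replayed against a toy in-memory model. This is 
only a sketch: LwwStore and won_race are hypothetical names, not part of any 
Cassandra client API, and the store simply assumes the "newest timestamp wins" 
rule described earlier. T0 and T1 are made-up clock readings.

```python
# Toy in-memory model of timestamp-based last-write-wins; NOT the Cassandra API.
class LwwStore:
    def __init__(self):
        self._cells = {}  # key -> (value, timestamp)

    def read(self, key):
        """Return (value, timestamp), or None if the key does not exist."""
        return self._cells.get(key)

    def write(self, key, value, timestamp):
        """Keep the write only if it is newer than the stored value."""
        current = self._cells.get(key)
        if current is None or timestamp > current[1]:
            self._cells[key] = (value, timestamp)


def won_race(store, key, my_timestamp):
    """Final read: the caller owns the key only if its own timestamp is stored."""
    cell = store.read(key)
    return cell is not None and cell[1] == my_timestamp


store = LwwStore()
T0, T1 = 1000, 1005  # assumed clock readings: B's (T0) is earlier than A's (T1)

# Step 2 for both users: K does not exist yet.
assert store.read("K") is None  # user A
assert store.read("K") is None  # user B

# User B stalls; user A runs its steps 3 and 4.
store.write("K", "lock A", T1)
a_won = won_race(store, "K", T1)

# User B resumes and writes with the timestamp it recorded before its first read.
store.write("K", "lock B", T0)  # ignored by the store, because T0 < T1
b_won = won_race(store, "K", T0)

print(a_won, b_won)  # True False
```

The point of passing the early timestamp is visible in the last write: B's 
write arrives later in real time, but it carries T0 and so cannot replace A's 
value.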

Of course this scenario will work only if the clocks are synced and the 
variation between the machine clocks of users A and B is much less than the 
duration of the write-read roundtrip. In practice this means that you'll need 
to introduce a delay of 50-100ms between the write and the last read on both 
users. So this could work if a 100ms delay is acceptable.
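
One way to see why the delay before the last read matters is the toy 
interleaving below. It reflects my reading of the failure mode, not a 
measurement: LwwCell and race are hypothetical names, the timestamps are made 
up, and the "delay" is modelled purely by the order of events. Here B's 
recorded timestamp is slightly later than A's, and B's write arrives after A's.

```python
# Toy single-key model of timestamp-based last-write-wins; not the Cassandra API.
class LwwCell:
    def __init__(self):
        self.value, self.ts = None, None

    def write(self, value, ts):
        # A write is kept only if its timestamp is newer than the stored one.
        if self.ts is None or ts > self.ts:
            self.value, self.ts = value, ts


def race(a_waits_before_final_read):
    """Return (a_thinks_it_won, b_thinks_it_won) for one fixed interleaving."""
    cell = LwwCell()
    t_a, t_b = 1000, 1001  # assumed: B's timestamp is slightly later than A's

    # Both users have already read K and found nothing. A writes first.
    cell.write("lock A", t_a)
    if not a_waits_before_final_read:
        a_seen_ts = cell.ts  # A reads back before B's write has arrived

    cell.write("lock B", t_b)  # B's write arrives and wins on timestamp

    if a_waits_before_final_read:
        a_seen_ts = cell.ts  # A reads back only after the delay
    b_seen_ts = cell.ts

    return a_seen_ts == t_a, b_seen_ts == t_b


print(race(a_waits_before_final_read=False))  # (True, True): both claim the key
print(race(a_waits_before_final_read=True))   # (False, True): only B claims it
```

Without the wait, A's read-back can happen before a competing write with a 
later timestamp has landed, so both users believe they own K.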

Otherwise use ZooKeeper, at least until Cassandra has version vector support 
implemented.

Umpf, that was a long story ;-)

