cassandra-user mailing list archives

From Guy Incognito <>
Subject Re: indexes from CassandraSF
Date Sun, 13 Nov 2011 14:52:21 GMT
[1] i'm not particularly worried about transient conditions so that's 
ok.  i think there's still the possibility of a non-transient false 
positive...if 2 writes were to happen at exactly the same time (highly 
unlikely), eg

1) A reads previous location (L1) from index entries
2) B reads previous location (L1) from index entries
3) A deletes previous location (L1) from index entries
4) B deletes previous location (L1) from index entries
5) A deletes previous location (L1) from index
6) B deletes previous location (L1) from index
7) A enters new location (L2) into index entries
8) B enters new location (L3) into index entries
9) A enters new location (L2) into index
10) B enters new location (L3) into index
11) A sets new location (L2) on users
12) B sets new location (L3) on users

after this, don't i end up with an incorrect L2 location in index 
entries and in the index, that won't be resolved until the next write of 
location for that user?
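
the interleaving above can be replayed as a rough sketch with plain in-memory structures. everything here is a hypothetical stand-in (dicts and sets instead of column families, made-up names like `write_new`, string timestamps), and it assumes step 12 is B writing L3 to Users. the two deletes are grouped per writer rather than strictly interleaved, which does not change the outcome.

```python
# in-memory stand-ins for the three column families (hypothetical, not a real client)
users = {"u1": "L1"}
index_entries = {"u1": {("location", "ts1"): "L1"}}   # Users_Index_Entries
index = {("L1", "u1", "ts1")}                          # Indexes["Users_By_Location"]

def read_previous(user):
    # steps 1-2: snapshot of the user's current index entries
    return dict(index_entries[user])

def delete_previous(user, previous):
    # steps 3-6: deleting an already-deleted column is a no-op
    for (name, ts), value in previous.items():
        index_entries[user].pop((name, ts), None)
        index.discard((value, user, ts))

def write_new(user, value, ts):
    # steps 7-10: add the new entry to both index structures
    index_entries[user][("location", ts)] = value
    index.add((value, user, ts))

prev_a = read_previous("u1")      # 1
prev_b = read_previous("u1")      # 2
delete_previous("u1", prev_a)     # 3, 5
delete_previous("u1", prev_b)     # 4, 6
write_new("u1", "L2", "ts2")      # 7, 9
write_new("u1", "L3", "ts3")      # 8, 10
users["u1"] = "L2"                # 11
users["u1"] = "L3"                # 12 (assumed: B's write lands last)

# index columns for u1 that no longer match the Users row
stale = {e for e in index if e[1] == "u1" and e[0] != users["u1"]}
print(stale)
```

in this sketch neither writer ever saw the other's new entry, so the L2 column stays in both index structures until a later write for the same user reads and deletes it.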

[2] ah i see, the client would continuously retry until the update 
works.  that's fine provided the client doesn't bomb out with some other 
error, if that were to happen then i have potentially deleted the index 
entry columns without deleting the corresponding index columns.
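
a minimal sketch of the retry behaviour, assuming the client keeps the exact same mutation list (including the same time-based uuid) across attempts. `apply_batch`, `PartialFailure` and the tuple layout are made up for illustration and are not a real cassandra client api; only `uuid.uuid1()` is a real call.

```python
import uuid

class PartialFailure(Exception):
    """Hypothetical error raised when a batch dies partway through."""

def apply_batch(store, mutations, fail_after=None):
    # apply mutations one by one; optionally fail before mutation `fail_after`
    for i, (op, column) in enumerate(mutations):
        if fail_after is not None and i == fail_after:
            raise PartialFailure(f"failed before mutation {i}")
        if op == "put":
            store.add(column)        # re-adding an existing column changes nothing
        else:
            store.discard(column)    # deleting a missing column changes nothing

ts_old = uuid.uuid1()   # time-based uuid of the previous write
ts_new = uuid.uuid1()   # generated ONCE and reused on every retry

store = {("entries", "u1", "location", ts_old), ("index", "L1", "u1", ts_old)}
mutations = [
    ("delete", ("entries", "u1", "location", ts_old)),
    ("delete", ("index", "L1", "u1", ts_old)),
    ("put", ("entries", "u1", "location", ts_new)),
    ("put", ("index", "L2", "u1", ts_new)),
]

attempts = 0
for fail_after in (1, 3, None):   # the first two attempts fail partway
    attempts += 1
    try:
        apply_batch(store, mutations, fail_after)
        break
    except PartialFailure:
        continue

print(attempts, len(store))
```

after the first failed attempt the store is in exactly the state described above (index entry column gone, index column still there). the retry only repairs it because the mutation list survived in the client; if the client dies in between, nothing replays it.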

i can handle both of the above for my use case, i just want to clarify 
whether they are possible (however unlikely) scenarios.
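
for case [1], the "read repair" check suggested in the quoted reply below could look roughly like this. it is only a sketch: `search_by_location` and the in-memory `index` / `users` structures are hypothetical, standing in for a slice of the Users_By_Location row and a lookup in the Users cf.

```python
def search_by_location(index, users, location):
    # index: set of (value, user_key, ts) columns
    # users: user_key -> current location in the Users row
    hits = []
    for value, user_key, ts in sorted(index):
        if value != location:
            continue
        # trust the Users row, not the index column
        if users.get(user_key) == location:
            hits.append(user_key)
    return hits

index = {("L2", "u1", "ts2"), ("L3", "u1", "ts3"), ("L2", "u2", "ts4")}
users = {"u1": "L3", "u2": "L2"}
print(search_by_location(index, users, "L2"))
```

here the stale ("L2", "u1", "ts2") column is skipped because the Users row for u1 says L3, at the cost of one extra read per candidate.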

On 13/11/2011 02:41, Ed Anuff wrote:
> 1) The index updates should be eventually consistent.  This does mean
> that you can get a transient false-positive on your search results.
> If this doesn't work for you, then you either need to use ZK or some
> other locking solution or do "read repair" by making sure that the row
> you retrieve contains the value you're searching for before passing it
> on to the rest of your application.
> 2)  You should be able to reapply the batch updates til they succeed.
> The update is idempotent.  One thing that's important that the slides
> don't make clear is that this requires using time-based uuids as your
> timestamp components.  Take a look at the sample code.
> Hope this helps,
> Ed
> On Sat, Nov 12, 2011 at 3:59 PM, Guy Incognito<>  wrote:
>> help?
>> On 10/11/2011 19:34, Guy Incognito wrote:
>>> hi,
>>> i've been looking at the model below from Ed Anuff's presentation at
>>> Cassandra SF (
>>>   Couple of questions:
>>> 1) Isn't there still the chance that two concurrent updates may end up
>>> with the index containing two entries for the given user, only one of which
>>> would match the actual value in the Users cf?
>>> 2) What happens if your batch fails partway through the update?  If i
>>> understand correctly there are no guarantees about ordering when a batch is
>>> executed, so isn't it possible that eg the previous
>>> value entries in Users_Index_Entries may have been deleted, and then the
>>> batch fails before the entries in Indexes are deleted, ie the mechanism has
>>> 'lost' those values?  I assume this can be addressed
>>> by not deleting the old entries until the batch has succeeded (ie put the
>>> previous entry deletion into a separate, subsequent batch).  this at least
>>> lets you retry at a later time.
>>> perhaps i'm missing something?
>>> SELECT {"location"}..{"location", *}
>>> FROM Users_Index_Entries WHERE KEY =<user_key>;
>>> DELETE {"location", ts1}, {"location", ts2}, ...
>>> FROM Users_Index_Entries WHERE KEY =<user_key>;
>>> DELETE {<value1>,<user_key>, ts1}, {<value2>,<user_key>, ts2}, ...
>>> FROM Indexes WHERE KEY = "Users_By_Location";
>>> UPDATE Users_Index_Entries SET {"location", ts3} =<value3>
>>> WHERE KEY=<user_key>;
>>> UPDATE Indexes SET {<value3>,<user_key>, ts3} = null
>>> WHERE KEY = "Users_By_Location";
>>> UPDATE Users SET location =<value3>
>>> WHERE KEY =<user_key>;
