incubator-cassandra-user mailing list archives

From Roland Gude
Subject Re: Atomicity Strategies
Date Sun, 10 Apr 2011 18:46:18 GMT

A strategy that should cover at least some use cases is roughly like this:

Given that CFs A and B should be in sync:
In the write 'a' to CF A, add another column 'Synchronisation_token' and write a TimeUUID 'T' (or
a timestamp, or some other value that allows time-based ordering) as its value.
On the related write to CF B, write the token as well.
When reading, check client-side whether the tokens match, and reread the data with the lower token until they do.
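
To make the steps above concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the two column families are stood in for by plain dicts, and `write_both`, `read_in_sync`, `read_a` and `read_b` are hypothetical names, not part of any Cassandra client API. With a real client, the reads would be gets against CF A and CF B.

```python
import time
import uuid


def new_token():
    # Time-based UUID; its .time attribute gives a timestamp usable for ordering.
    return uuid.uuid1()


def write_both(cf_a, cf_b, key, value_a, value_b):
    """Write the related rows and tag both with the same synchronisation token."""
    token = new_token()
    cf_a[key] = {"value": value_a, "token": token}
    cf_b[key] = {"value": value_b, "token": token}
    return token


def read_in_sync(read_a, read_b, key, max_rereads=5, delay=0.0):
    """Read both rows; reread the side with the lower token until tokens match.

    read_a / read_b are hypothetical callables returning a row dict or None.
    Returns (value_a, value_b), or None if the rows never converge.
    """
    row_a, row_b = read_a(key), read_b(key)
    for attempt in range(max_rereads + 1):
        if row_a is not None and row_b is not None and row_a["token"] == row_b["token"]:
            return row_a["value"], row_b["value"]
        if attempt == max_rereads:
            break
        time.sleep(delay)
        # Reread whichever side is missing or carries the older (lower) token.
        if row_a is None or (row_b is not None and row_a["token"].time < row_b["token"].time):
            row_a = read_a(key)
        else:
            row_b = read_b(key)
    return None
```

Note that this only detects a mismatch on read. If the second write never arrives, the reader gives up after `max_rereads` and the caller still has to decide what to do.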


On 10.04.2011 at 03:53, "aaron morton" wrote:

My understanding of what they did with locking (based on the examples) was to achieve a level
of transaction isolation.

I think the issue here is more about atomicity.

We cannot guarantee that all
or none of the mutations in your batch are completed. There is some work in this area though.

AFAIK the best approach now is
to work at QUORUM, and write your code to handle missing relations. Also, Cassandra does
a lot of work up front before the write starts to ensure it will succeed; failures during a
write will probably be due to a SW/HW failure or overload on a node that gossip has not picked up.

Retrying is the recommended approach when a request fails.
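
As a rough sketch of what client-side retrying can look like (Python, not tied to any particular client library): `with_retries` and its parameters are hypothetical names, and the sketch assumes the operation is safe to repeat, e.g. re-inserting the same columns with the same values.

```python
import time


def with_retries(operation, max_attempts=3, base_delay=0.1, retry_on=(Exception,)):
    """Call operation() until it succeeds or max_attempts is reached.

    Waits base_delay, then 2x, 4x, ... between attempts (exponential backoff).
    The last failure is re-raised so the caller can handle it.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))
```

In practice `retry_on` would be narrowed to the timeout/unavailable exceptions of whatever client is in use, so that programming errors are not retried.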

Hope that helps.

On 9 Apr 2011, at 15:58, Dan Washusen wrote:

Here's a good writeup on how <><>
does it...


Dan Washusen

On Saturday, 9 April 2011 at 11:53 AM, Alex Araujo wrote:

On 4/8/11 5:46 PM, Drew Kutcharian wrote:
I'm interested in this too, but I don't think this can be done with Cassandra alone. Cassandra
doesn't support transactions. I think Hector can retry operations, but I'm not sure about
the atomicity of the whole thing.

On Apr 8, 2011, at 1:26 PM, Alex Araujo wrote:

Hi, I was wondering if there are any patterns/best practices for creating atomic units of
work when dealing with several column families and their inverted indices.

For example, if I have Users and Groups column families and did something like:

Users.insert( user_id, columns )
UserGroupTimeline.insert( group_id, { timeuuid() : user_id } )
UserGroupStatus.insert( group_id + ":" + user_id, { "Active" : "True" } )
UserEvents.insert( timeuuid(), { "user_id" : user_id, "group_id" : group_id, "event_type" : "join" } )

Would I want the client to retry all subsequent operations that failed against other nodes
after n succeeded, maintain an "undo" queue of operations to run, batch the mutations and
choose a strong consistency level, some combination of these/others, etc?
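
For illustration, the "undo" queue option mentioned above could be sketched roughly like this in Python. `run_with_undo` is a hypothetical helper, and each step is assumed to be a pair of callables (the write and a compensating delete) supplied by the caller. This is best effort only: an undo can itself fail, and other readers may see the intermediate state, so it gives neither atomicity nor isolation.

```python
def run_with_undo(steps):
    """Run (do, undo) pairs in order; on failure, undo completed steps in reverse.

    steps: list of (do, undo) tuples of zero-argument callables (hypothetical
    wrappers around the individual inserts and their compensating deletes).
    The original exception is re-raised after the undo pass.
    """
    undo_queue = []
    try:
        for do, undo in steps:
            do()
            undo_queue.append(undo)
    except Exception:
        # Compensate in reverse order for everything that already succeeded.
        for undo in reversed(undo_queue):
            undo()
        raise
```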

Thanks Drew. I'm familiar with the lack of transactions and have read about
people using ZK (possibly Cages as well?) to accomplish this, but since
it seems that inverted indices are commonplace, I'm interested in how
anyone is mitigating the lack of atomicity to any extent without the use of
such tools. It appears that Hector and Pelops have retrying built into
their APIs, and I'm fairly confident that proper use of those
capabilities may help. Just trying to cover all bases. Hopefully
someone can share their approaches and/or experiences. Cheers, Alex.
