cassandra-user mailing list archives

From Ivan Chang <>
Subject Concurrent updates
Date Fri, 17 Jul 2009 14:14:28 GMT
I have the following scenario that I would like a best solution for.

Here's the scenario:


it is used for keeping track of how many times the word "cassandra"

Let's say we have a bunch of articles stored in Hadoop. A Map/Reduce job greps
all articles throughout the Hadoop cluster that match the pattern
and updates Table1.Standard1['cassandra']['frequency'].  Hence
Table1.Standard1['cassandra']['frequency'] will be updated concurrently.

One of the issues I am facing is that
stores the count as a String (I am using Java), so in order to update the
properly, the thread that's running the Map/Reduce will have to retrieve
Table1.Standard1['cassandra']['frequency'] in its native String format and
store that in temp (a Java String), convert it into an int, then add the new counts in,
and finally
"SET Table1.Standard1['cassandra']['frequency']. =  '" + temp.toString() +

During the entire process, how do we guarantee concurrency?  The Cql SET does
not allow something like

SET Table1.Standard1['cassandra']['frequency']. =
Table1.Standard1['cassandra']['frequency']. + newCounts

since there's only one String type.
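
To make the concern concrete, here is a minimal Java sketch of the
retrieve / convert / add / SET sequence described above. It does not talk to
Cassandra at all: the in-memory map, the class name LostUpdateSketch, and the
helpers read, write, addCounts and simulateInterleaving are all hypothetical
stand-ins invented for illustration. It only shows how two writers that both
read the same String value can overwrite each other's counts.

```java
import java.util.HashMap;
import java.util.Map;

public class LostUpdateSketch {
    // Hypothetical in-memory stand-in for Table1.Standard1['cassandra']['frequency'].
    // A real client call would go here instead; this is an assumption for illustration.
    static final Map<String, String> column = new HashMap<>();

    // Retrieve the count in its native String format.
    static String read() {
        return column.get("frequency");
    }

    // Stand-in for the SET statement: overwrite the stored String.
    static void write(String value) {
        column.put("frequency", value);
    }

    // Convert the stored String to an int, add the new counts, convert back.
    static String addCounts(String stored, int newCounts) {
        int current = Integer.parseInt(stored);
        return Integer.toString(current + newCounts);
    }

    // Replays one unlucky interleaving of two writers, A and B, in a single
    // thread so the outcome is deterministic: both read before either writes.
    static String simulateInterleaving(String initial, int countsA, int countsB) {
        write(initial);
        String tempA = read();            // A reads the current value
        String tempB = read();            // B reads the same value
        write(addCounts(tempA, countsA)); // A writes initial + countsA
        write(addCounts(tempB, countsB)); // B overwrites with initial + countsB
        return read();
    }

    public static void main(String[] args) {
        // Starting from "10", adding 5 and 7 should give 22,
        // but A's update is lost and the stored value is "17".
        System.out.println(simulateInterleaving("10", 5, 7));
    }
}
```

With the starting value "10" and new counts of 5 and 7, the stored result is
"17" rather than "22", which is the kind of lost update I am worried about.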

What would be the best solution in this situation?

