cassandra-user mailing list archives

From Ivan Chang <ivan.ch...@medigy.com>
Subject Concurrent updates
Date Fri, 17 Jul 2009 14:14:28 GMT
I have the following scenario that I would like the best solution for.

Here's the scenario:

Table1.Standard1['cassandra']['frequency']

It is used to keep track of how many times the word "cassandra" has
appeared.

Let's say we have a bunch of articles stored in Hadoop. A Map/Reduce job
greps all articles throughout the Hadoop cluster for matches of the pattern
^cassandra$
and updates Table1.Standard1['cassandra']['frequency'].  Hence
Table1.Standard1['cassandra']['frequency'] will be updated concurrently.

One of the issues I am facing is that
Table1.Standard1['cassandra']['frequency']
stores the count as a String (I am using Java). So in order to update the
frequency properly, the thread that's running the Map/Reduce has to
retrieve Table1.Standard1['cassandra']['frequency'] in its native String
format, hold that in a temp variable (a Java String), convert it into an
int, add the new counts, and finally issue
"SET Table1.Standard1['cassandra']['frequency']. =  '" + temp.toString() +
"'"
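
The steps above can be sketched roughly as follows. This is only an
illustration: a plain in-memory Map stands in for the column store, and the
class name FrequencyUpdateSketch and the method addCounts are hypothetical
names, not part of any Cassandra API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: the Map is a stand-in for the column store.
class FrequencyUpdateSketch {

    // Reads the count as a String, adds newCounts, and writes it back as a String.
    static String addCounts(Map<String, String> store, String columnPath, int newCounts) {
        String temp = store.getOrDefault(columnPath, "0"); // retrieve in native String format
        int current = Integer.parseInt(temp);              // convert into int
        int updated = current + newCounts;                 // add the new counts
        String asString = Integer.toString(updated);       // convert back to String
        store.put(columnPath, asString);                   // stands in for the SET statement
        return asString;
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        String path = "Table1.Standard1['cassandra']['frequency']";
        store.put(path, "5");
        System.out.println(addCounts(store, path, 3)); // prints 8
    }
}
```

Nothing in this sketch prevents another thread from changing the value
between the read and the write, which is the concern raised next.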

During the entire process, how do we guarantee that concurrent updates are
handled safely?  The CQL SET does
not allow something like

SET Table1.Standard1['cassandra']['frequency']. =
Table1.Standard1['cassandra']['frequency']. + newCounts

since there's only one String type.
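
To make the concern concrete, here is a small sketch of one unlucky
interleaving of two such updates. It is written sequentially so the outcome
is deterministic. As an assumption for illustration only, a HashMap stands in
for the column, and LostUpdateSketch and interleavedUpdate are hypothetical
names; no Cassandra calls are involved.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a lost update between two read-modify-write cycles.
class LostUpdateSketch {

    // Simulates workers A and B both reading before either one writes.
    static int interleavedUpdate(Map<String, String> store, String key, int countsA, int countsB) {
        String readByA = store.get(key); // A reads the current value
        String readByB = store.get(key); // B reads the same value before A writes

        store.put(key, Integer.toString(Integer.parseInt(readByA) + countsA)); // A writes
        store.put(key, Integer.toString(Integer.parseInt(readByB) + countsB)); // B overwrites A

        return Integer.parseInt(store.get(key));
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        store.put("frequency", "10");
        // The correct total would be 10 + 3 + 4 = 17, but A's update is lost.
        System.out.println(interleavedUpdate(store, "frequency", 3, 4)); // prints 14
    }
}
```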

What would be the best solution in this situation?

Thanks,
Ivan
