cassandra-user mailing list archives

From Jun Rao <jun...@almaden.ibm.com>
Subject Re: Concurrent updates
Date Fri, 17 Jul 2009 14:59:09 GMT

This is a case where a test-and-set feature would be useful. See the
following JIRA. We just don't have it nailed down yet.
https://issues.apache.org/jira/browse/CASSANDRA-48
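
As a rough illustration of how a test-and-set primitive could help with a counter like the one described below, here is a minimal sketch in Java. It does not use any Cassandra API: the `testAndSet` method and the in-memory map standing in for the column store are assumptions made purely for illustration, and the feature tracked in the JIRA may end up looking quite different. The idea is that a writer re-reads and retries whenever the value changed between its read and its write.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class TestAndSetSketch {
    // In-memory stand-in for a column store; NOT the Cassandra API.
    private final ConcurrentMap<String, String> columns = new ConcurrentHashMap<>();

    String get(String column) {
        return columns.get(column);
    }

    // Hypothetical test-and-set: write newValue only if the stored value
    // still equals expected (null means "column does not exist yet").
    boolean testAndSet(String column, String expected, String newValue) {
        if (expected == null) {
            return columns.putIfAbsent(column, newValue) == null;
        }
        return columns.replace(column, expected, newValue);
    }

    // Read the String count, add delta, and retry until no other writer
    // slipped in between the read and the write.
    void addToCounter(String column, int delta) {
        while (true) {
            String current = get(column);
            int count = (current == null) ? 0 : Integer.parseInt(current);
            String updated = Integer.toString(count + delta);
            if (testAndSet(column, current, updated)) {
                return;
            }
        }
    }

    // Four threads each add 1 a thousand times to the same column.
    static String demo() {
        TestAndSetSketch store = new TestAndSetSketch();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    store.addToCounter("frequency", 1);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return store.get("frequency");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 4000
    }
}
```

With the retry loop, no increment is lost even though the count is stored as a String and every writer does its own read, parse, add, and write.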

Jun
IBM Almaden Research Center
K55/B1, 650 Harry Road, San Jose, CA  95120-6099

junrao@almaden.ibm.com



                                                                           
From: Ivan Chang <ivan.chang@medigy.com>
To: cassandra-user@incubator.apache.org
Date: 07/17/2009 07:14 AM
Subject: Concurrent updates
Please respond to: cassandra-user@incubator.apache.org




I have the following scenario and would like to find the best solution for it.

Here's the scenario:

Table1.Standard1['cassandra']['frequency']

It is used to keep track of how many times the word "cassandra"
appears.

Let's say we have a bunch of articles stored in Hadoop. A Map/Reduce job greps
all articles throughout the Hadoop cluster for matches of the pattern
^cassandra$
and updates Table1.Standard1['cassandra']['frequency'].  Hence
Table1.Standard1['cassandra']['frequency'] will be updated concurrently.

One of the issues I am facing is that Table1.Standard1['cassandra']['frequency']
stores the count as a String (I am using Java). To update the frequency
properly, the thread that's running the Map/Reduce has to retrieve
Table1.Standard1['cassandra']['frequency'] in its native String format, hold
it in temp (a Java String), convert it to an int, add the new counts, and
finally issue
"SET Table1.Standard1['cassandra']['frequency']. =  '" + temp.toString() + "'"
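
To make the risk in the read, convert, add, and write sequence concrete, here is a small Java sketch that replays one unlucky interleaving of two tasks by hand. The `HashMap` is only a stand-in for the stored column, and the starting count and the two increments are made-up values for illustration; nothing here calls Cassandra.

```java
import java.util.HashMap;
import java.util.Map;

class LostUpdateSketch {
    // Replays two tasks that both read the count before either one writes.
    static String demo() {
        // Stand-in for Table1.Standard1['cassandra']; NOT the Cassandra API.
        Map<String, String> column = new HashMap<>();
        column.put("frequency", "10"); // made-up starting count

        // Both tasks retrieve the same String value.
        String tempA = column.get("frequency");
        String tempB = column.get("frequency");

        // Each task converts to int and adds its own new counts.
        int updatedA = Integer.parseInt(tempA) + 3;
        int updatedB = Integer.parseInt(tempB) + 5;

        // Task A writes, then task B overwrites A's result.
        column.put("frequency", Integer.toString(updatedA));
        column.put("frequency", Integer.toString(updatedB));

        return column.get("frequency");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 15, although 10 + 3 + 5 = 18
    }
}
```

Task A's contribution is silently lost because task B computed its result from a value that was already stale when it wrote.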

During the entire process, how do we guarantee correctness under concurrent
updates?  The CQL SET does not allow something like

SET Table1.Standard1['cassandra']['frequency']. = Table1.Standard1['cassandra']['frequency']. + newCounts

since there's only one String type.

What would be the best solution in this situation?

Thanks,
Ivan