cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Concurrent updates
Date Fri, 17 Jul 2009 15:41:56 GMT
This is the kind of inconsistency that vector clocks can handle but
the more simplistic timestamp-based resolution cannot.

Between test-and-set and vector clocks, vector clocks fit Cassandra much better.

-Jonathan
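
To illustrate the distinction drawn above: with timestamp-based resolution
the later of two writes silently wins, whereas comparing vector clocks can
report that neither write saw the other, so the two values have to be
reconciled. The following is only a minimal sketch in Java, not Cassandra
code; the class name, the compare method, and the writer ids "w1" and "w2"
are hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of vector clock comparison; not part of Cassandra.
final class VectorClockSketch {
    enum Order { BEFORE, AFTER, EQUAL, CONCURRENT }

    // A clock maps a writer id to the number of updates seen from that writer.
    // A missing entry counts as zero.
    static Order compare(Map<String, Integer> a, Map<String, Integer> b) {
        Set<String> writers = new HashSet<>(a.keySet());
        writers.addAll(b.keySet());
        boolean aAhead = false;
        boolean bAhead = false;
        for (String writer : writers) {
            int ca = a.getOrDefault(writer, 0);
            int cb = b.getOrDefault(writer, 0);
            if (ca > cb) aAhead = true;
            if (cb > ca) bAhead = true;
        }
        if (aAhead && bAhead) return Order.CONCURRENT; // neither saw the other
        if (aAhead) return Order.AFTER;
        if (bAhead) return Order.BEFORE;
        return Order.EQUAL;
    }

    public static void main(String[] args) {
        // Two writers update the same column without seeing each other.
        Map<String, Integer> a = new HashMap<>();
        a.put("w1", 1);
        Map<String, Integer> b = new HashMap<>();
        b.put("w2", 1);
        System.out.println(compare(a, b)); // CONCURRENT

        // A later write that has seen both updates dominates each of them.
        Map<String, Integer> merged = new HashMap<>();
        merged.put("w1", 1);
        merged.put("w2", 1);
        System.out.println(compare(merged, a)); // AFTER
    }
}
```

A timestamp comparison of the first two writes would simply pick one and
drop the other; the CONCURRENT result is what lets a client notice the
conflict and merge the two counts.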

On Fri, Jul 17, 2009 at 9:59 AM, Jun Rao<junrao@almaden.ibm.com> wrote:
> This is a case where a test-and-set feature would be useful. See the
> following JIRA. We just don't have it nailed down yet.
> https://issues.apache.org/jira/browse/CASSANDRA-48
>
> Jun
> IBM Almaden Research Center
> K55/B1, 650 Harry Road, San Jose, CA 95120-6099
>
> junrao@almaden.ibm.com
>
> Ivan Chang <ivan.chang@medigy.com>
>
> 07/17/2009 07:14 AM
>
> Please respond to
> cassandra-user@incubator.apache.org
>
> To
> cassandra-user@incubator.apache.org
> cc
>
> Subject
> Concurrent updates
> I have the following scenario that I would like the best solution for.
>
> Here's the scenario:
>
> Table1.Standard1['cassandra']['frequency']
>
> it is used for keeping track of how many times the word "cassandra"
> appeared.
>
> Let's say we have a bunch of articles stored in Hadoop, and a Map/Reduce
> job greps all articles throughout the Hadoop cluster that match the pattern
> ^cassandra$
> and updates Table1.Standard1['cassandra']['frequency'].  Hence
> Table1.Standard1['cassandra']['frequency'] will be updated concurrently.
>
> One of the issues I am facing is that
> Table1.Standard1['cassandra']['frequency']
> stores the count as a String (I am using Java). To update the frequency
> properly, the thread running the Map/Reduce has to retrieve
> Table1.Standard1['cassandra']['frequency'] in its native String format,
> hold it in temp (a Java String), convert it to an int, add the new counts,
> and finally issue
> "SET Table1.Standard1['cassandra']['frequency']. = '" + temp.toString() + "'"
>
> During the entire process, how do we guarantee correctness under
> concurrent updates?  The Cql SET does not allow something like
>
> SET Table1.Standard1['cassandra']['frequency']. =
> Table1.Standard1['cassandra']['frequency']. + newCounts
>
> since there's only one String type.
>
> What would be the best solution in this situation?
>
> Thanks,
> Ivan
>
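
The read, convert, add, and SET sequence described in the quoted message is
a read-modify-write cycle: two threads that read the same old value will
each write back their own sum, and one increment is lost. The test-and-set
idea mentioned in the thread can be sketched as a retry loop. This is only
an illustration in Java, assuming an in-memory ConcurrentMap as a stand-in
for the column store; it is not a Cassandra API, and the class and method
names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: a String-valued counter updated with test-and-set.
// The map is an in-memory stand-in for the store, not a Cassandra client.
final class TestAndSetCounterSketch {
    private final ConcurrentMap<String, String> store = new ConcurrentHashMap<>();

    TestAndSetCounterSketch() {
        store.put("frequency", "0");
    }

    // Add newCounts to the stored value, retrying if another writer got in first.
    void add(int newCounts) {
        while (true) {
            String seen = store.get("frequency");                 // read
            String next = Integer.toString(Integer.parseInt(seen) + newCounts);
            // Write only if the value is still the one we read (test-and-set).
            if (store.replace("frequency", seen, next)) {
                return;
            }
        }
    }

    int current() {
        return Integer.parseInt(store.get("frequency"));
    }

    public static void main(String[] args) throws InterruptedException {
        TestAndSetCounterSketch counter = new TestAndSetCounterSketch();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    counter.add(1);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println(counter.current()); // 4000
    }
}
```

Replacing the conditional replace with an unconditional put would reproduce
the lost-update problem: the final count could then come out below 4000.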
