storm-user mailing list archives

From Vladi Feigin <vladi...@gmail.com>
Subject Re: Cassandra bolt
Date Fri, 10 Jan 2014 14:37:52 GMT
Hi,
If you use Cassandra counters, all nodes will eventually hold the value 8.
3 will not overwrite 5, or vice versa.
Convergence is only eventual, so for some time different clients may see
different values, but in the end it will be 8.
Vladi
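
To illustrate the difference between propagating increments and overwriting values, here is a minimal sketch in Python. It is a toy model under stated assumptions, not Cassandra's actual counter implementation: each replica is modelled as a dict of the increments it has seen, keyed by a hypothetical increment id, and replication is modelled as a union of those dicts.

```python
# Toy model only: the replica layout, the increment ids ("inc-a", "inc-b")
# and the merge rule are assumptions made for illustration.

def value(replica):
    """A replica's counter value is the sum of the increments it has seen."""
    return sum(replica.values())

def merge(a, b):
    """Replication modelled as a union; replaying an increment is harmless."""
    merged = dict(a)
    merged.update(b)
    return merged

# Three replicas, all starting at 0.
node1, node2, node3 = {}, {}, {}

# The increment of 5 lands on node2, the increment of 3 on node1.
node2["inc-a"] = 5
node1["inc-b"] = 3
before = [value(node1), value(node2), value(node3)]

# Eventually every replica has seen every increment.
everything = merge(merge(node1, node2), node3)
after = [value(everything)] * 3

# Contrast: plain columns with last-write-wins keep only the latest write.
writes = [(1, 5), (2, 3)]          # (timestamp, value)
last_write_wins = max(writes)[1]

print(before, after, last_write_wins)
```

In this model the replicas temporarily disagree (3, 5 and 0), then all settle on 8 regardless of the order in which the increments arrive. The last-write-wins contrast ends at 3, which is the outcome the question below is worried about.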



On Mon, Jan 6, 2014 at 5:21 PM, Adrian Mocanu <amocanu@verticalscope.com> wrote:

>  Hi
>
> I am actually looking into using CassandraCounterBatchingBolt, but atm I’m
> not sure how Cassandra handles these eventual consistency issues, so I need
> to research that. The reason I mention this issue is that I cannot find
> anywhere in the code where a read happens before a write, which bothers
> me. Maybe Cassandra does it with counter columns? I don't know.
>
>
>
> The issue I’m talking about is updating the same counter consecutively, but
> faster than the updates propagate to other Cassandra nodes.
>
>
>
> Example:
>
> Say I have 3 cassandra nodes. The counters on each of these nodes are 0.
>
> Node1:0, node2:0, node3:0
>
>
>
> An increment comes: 5
>
> 5 -> Node1:0, node2:0, node3:0
>
>
>
> The increment of 5 starts at node2; it still needs to propagate to node1 and node3
>
> Node1:0, node2:5, node3:0
>
>
>
> In the meantime, another increment arrives before the previous increment has
> propagated:
>
> 3 -> Node1:0, node2:5, node3:0
>
>
>
> Assuming 3 starts at a different node than where 5 started we have:
>
> Node1:3, node2:5, node3:0
>
>
>
> Now if 3 gets propagated to the other nodes AS AN INCREMENT and not as a
> new value (and the same for 5) then eventually they would all equal 8 and
> this is what I want.
>
>
>
> If 3 overwrites 5 (because it has a later timestamp) this is problematic –
> not what I want.
>
>
>
> Will see what the Cassandra group says... or if the creators of
> CassandraCounterBatchingBolt are on this list, please let me know :)
>
>
>
> Thanks
>
> Adrian
>
>
>
>
>
> *From:* Vladi Feigin [mailto:vladif86@gmail.com]
> *Sent:* January-04-14 2:00 AM
>
> *To:* user@storm.incubator.apache.org
> *Subject:* Re: Cassandra bolt
>
>
>
> Hi Adrian,
>
>
>
> Why don't you use C* counters? Your scenario looks like a good fit for them. I
> think CassandraCounterBatchingBolt provides what you need.
>
> Vladi
>
>
>
> On Fri, Jan 3, 2014 at 11:00 PM, Adrian Mocanu <amocanu@verticalscope.com>
> wrote:
>
>  Happy New Year all!
>
>
>
> I'm working on a solution for the following scenario: I have tuples coming
> to a cassandra bolt. The tuples are of this form: TupleData(String name,
> Int count, Long time). The time field is unique only within a batch, not
> overall, because some tuples may arrive late with the same name and time but
> a different count.
>
>
>
> For example:
>
> I can receive these tuples for the same time: (x1,3,1111), (x2,4,1111)
>
> Then the bolt may receive (x1,5,1111)
>
> After these are put in cassandra, column family x1 should have value 8 for
> time 1111 and column family x2 should have value 4 for time 1111
>
>
>
> Caching aside, the cassandra bolt needs to check whether there is already a
> count in the db for the tuple with the given name and time. If one exists,
> then retrieve it, increment it by the newly received value, and update the db
> entry with the new value. (At this point I'm not sure if update or
> delete+reinsert is speedier)
>
> If no db entry exists, then add the new tuple.
>
>
>
> I've looked at cassandra bolts code from
> https://github.com/hmsonline/storm-cassandra/tree/master/src/main/java/com/hmsonline/storm/cassandra/bolt
>
> which is the same as cassandra bolt from storm-contrib.
>
>
>
> There is a class CassandraCounterBatchingBolt, but after looking at it I
> don't believe it does the look up in db first before saving the value to
> db, which leads me to believe that this will not work.
>
>
>
> What I'm looking for seems pretty basic and I wonder if there is a
> cassandra bolt that does a db lookup before updating the db. Does such an
> open-source bolt exist?
>
> Otherwise I'm thinking of building mine on top of CassandraBatchingBolt.
>
>
>
> -Adrian
>
>
>
>
>
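
For the tuple example in the original question, (x1,3,1111), (x2,4,1111) and then (x1,5,1111), here is a minimal Python sketch of expressing each tuple as a counter increment instead of a read followed by a write. The table name `counts`, the column name `total` and the helper `aggregate_batch` are hypothetical, and this is not a description of how CassandraCounterBatchingBolt works internally. It only shows that summing deltas per (name, time) gives the expected totals without a lookup.

```python
from collections import defaultdict

def aggregate_batch(tuples):
    """Sum the counts per (name, time) key so each key needs one increment."""
    deltas = defaultdict(int)
    for name, count, time in tuples:
        deltas[(name, time)] += count
    return dict(deltas)

# Tuples from the example, including the late (x1, 5, 1111).
batch = [("x1", 3, 1111), ("x2", 4, 1111), ("x1", 5, 1111)]
deltas = aggregate_batch(batch)

# A CQL counter update adds a delta to the stored value; no prior read is
# issued by the client. Table and column names here are assumptions.
CQL = "UPDATE counts SET total = total + ? WHERE name = ? AND time = ?"
params = [(delta, name, time) for (name, time), delta in deltas.items()]

print(deltas)
```

Because each statement carries a delta and not an absolute value, a late tuple for the same name and time adds to whatever is already stored.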
