cassandra-user mailing list archives

From Robin Verlangen <ro...@us2.nl>
Subject Re: Delayed events processing / queue (anti-)pattern
Date Wed, 25 Mar 2015 07:45:41 GMT
Hi there,

@Robert: can you elaborate a bit more on the "not ideal" parts? In my case
I will be throwing away the rows (thus the points in time that are "now in
the past"), which will create tombstones which are compacted away.

@DuyHai: that was exactly what I had in mind and from a C* point of view
this should work as it's write heavy. I add hundreds of thousands of
columns to a key, and then read them all at once (or maybe a few times with
pagination), and then remove the entire row by its primary key.
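
To make that access pattern concrete, here is a minimal in-memory sketch in
Python. It is not Cassandra code: the class DelayedActionStore and its methods
are hypothetical names, a dict stands in for the table, and each dict key
plays the role of one partition (one minute bucket). It only illustrates the
sequence "append many, read everything in pages, then drop the whole
partition".

```python
from collections import defaultdict


class DelayedActionStore:
    """Hypothetical in-memory stand-in for a table partitioned by due minute."""

    def __init__(self):
        # One list of actions per minute bucket (the "partition").
        self._partitions = defaultdict(list)

    def schedule(self, due_minute: int, action: str) -> None:
        # Write path: append one action to the bucket for its due minute.
        self._partitions[due_minute].append(action)

    def drain(self, due_minute: int, page_size: int = 1000) -> list:
        # Read path: fetch the whole bucket page by page.
        actions = []
        offset = 0
        while True:
            bucket = self._partitions.get(due_minute, [])
            page = bucket[offset:offset + page_size]
            if not page:
                break
            actions.extend(page)
            offset += len(page)
        # Delete path: remove the entire bucket by key, not action by action.
        self._partitions.pop(due_minute, None)
        return actions


store = DelayedActionStore()
store.schedule(201503241205, "action C")
store.schedule(201503241205, "action D")
print(store.drain(201503241205, page_size=1))  # ['action C', 'action D']
print(store.drain(201503241205))               # []
```

In Cassandra terms the final step would be a single delete by partition key,
which writes one partition-level tombstone rather than one tombstone per
action.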

Any other thoughts on this?

Best regards,

Robin Verlangen
*Chief Data Architect*

W http://www.robinverlangen.nl
E robin@us2.nl

On Tue, Mar 24, 2015 at 10:01 PM, DuyHai Doan <doanduyhai@gmail.com> wrote:

> Some ideas I throw in here:
>
> "The delay Y will be at least 1 minute, and at most 90 days with a
> resolution per minute" --> Use the due time (now + delay, formatted as
> YYYYMMDDHHMM and stored as an integer) as your partition key.
>
> Example: today March 24th at 12:00 (201503241200) you need to delay 3
> actions, action A in exactly 3 days, action B in 10 hours and action C in 5
> minutes. Thus you will create 3 partitions:
>
> - for A, partition key = 201503271200
> - for B, partition key = 201503242200
> - for C, partition key = 201503241205
>
> In each partition, you'll need to create as many clustering columns as
> there are actions to execute. According to your estimate, the average is a
> few hundred thousand and the max is a few million, so it's fine. Also, you
> would have a pool of workers which will load the whole partition (with
> paging when necessary) every minute and process the actions.
>
> Once all the actions have been executed, you can either remove the
> complete partition or keep it for archiving.
>
> Duy Hai DOAN
>
> On Tue, Mar 24, 2015 at 9:19 PM, Robert Coli <rcoli@eventbrite.com> wrote:
>
>> On Tue, Mar 24, 2015 at 5:05 AM, Robin Verlangen <robin@us2.nl> wrote:
>>
>>> - for every point in the future there are probably hundreds of actions
>>> which have to be processed
>>> - all actions for a point in time will be processed at once (thus not
>>> removing action by action as a typical queue would do)
>>> - once all actions have been processed we remove the entire row (by key,
>>> not the individual columns)
>>>
>>
>> I've used Cassandra for similar queue-like things, and it's "fine." Not
>> ideal, but number of objects and access patterns are "fine."
>>
>>
>> https://engineering.eventbrite.com/replayable-pubsub-queues-with-cassandra-and-zookeeper/
>>
>> This design never truncates history, but if you can tolerate throwing
>> away history, that problem goes away.
>>
>> =Rob
>>
>>
>
>
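
The partition keys in DuyHai's example quoted above can be reproduced with a
few lines of Python. This is only a sketch: the function name minute_bucket is
hypothetical, and it assumes all timestamps share one time zone (for example
UTC), which the thread does not specify.

```python
from datetime import datetime, timedelta


def minute_bucket(due: datetime) -> int:
    """Format a due time as a YYYYMMDDHHMM integer (hypothetical helper)."""
    return int(due.strftime("%Y%m%d%H%M"))


# "Today" from the example: March 24th 2015 at 12:00.
now = datetime(2015, 3, 24, 12, 0)

keys = {
    "A": minute_bucket(now + timedelta(days=3)),
    "B": minute_bucket(now + timedelta(hours=10)),
    "C": minute_bucket(now + timedelta(minutes=5)),
}

print(minute_bucket(now))  # 201503241200
print(keys)  # {'A': 201503271200, 'B': 201503242200, 'C': 201503241205}
```

A worker that wakes up every minute would compute minute_bucket for the
current time and load that one partition.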
