cassandra-user mailing list archives

From Jan Algermissen <jan.algermis...@nordsc.com>
Subject Re: Exploring Simply Queueing
Date Mon, 06 Oct 2014 20:35:46 GMT
Shane,

On 06 Oct 2014, at 16:34, Shane Hansen <shanemhansen@gmail.com> wrote:

> Sorry if I'm hijacking the conversation, but why in the world would you want
> to implement a queue on top of Cassandra? It seems like using a proper queuing service
> would make your life a lot easier.

Agreed - however, the use case simply does not justify the additional operations.

> 
> That being said, there might be a better way to play to the strengths of C*. Ideally everything
> you do is append-only with few deletes or updates. So an interesting way to implement a queue
> might be to do one insert to put the job in the queue and another insert to mark the job as done
> or in process or whatever. This would also give you the benefit of being able to replay the
> state of the queue.

Thanks, I’ll try that, too.
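A minimal CQL sketch of that append-only idea, assuming a hypothetical `queue_events` table (all names and the shard scheme are illustrative, not from the repo):

```sql
-- Hypothetical schema: every state change is an insert, never an
-- update or delete, so the full history of a job is preserved.
CREATE TABLE queue_events (
    shard      int,
    job_id     timeuuid,
    event_time timeuuid,
    state      text,          -- e.g. 'queued', 'in_process', 'done'
    payload    blob,
    PRIMARY KEY ((shard), job_id, event_time)
);

-- Enqueue: one insert.
INSERT INTO queue_events (shard, job_id, event_time, state, payload)
VALUES (0, now(), now(), 'queued', 0x00);

-- Mark done: another insert under the same job_id (bound as ?);
-- since nothing is deleted, the queue state can be replayed later.
INSERT INTO queue_events (shard, job_id, event_time, state)
VALUES (0, ?, now(), 'done');
```

A consumer would read a shard's rows and treat the latest event per `job_id` as the job's current state.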

Jan


> 
> 
> On Mon, Oct 6, 2014 at 12:57 AM, Jan Algermissen <jan.algermissen@nordsc.com> wrote:
> Chris,
> 
> thanks for taking a look.
> 
> On 06 Oct 2014, at 04:44, Chris Lohfink <clohfink@blackbirdit.com> wrote:
> 
> > It appears you are aware of the tombstone effect that leads people to label this an
> > anti-pattern. Without "due" or any time-based value being part of the partition key, you
> > will still get a lot of buildup. You only have 1 partition per shard, which just linearly
> > decreases the tombstones. That isn't likely to be enough to really help in a situation of
> > high queue throughput, especially with the default of 4 shards.
> 
> Yes, dealing with the tombstone effect is the whole point. The workloads I have to deal
> with are not really high throughput; it is unlikely we’ll ever reach multiple messages per
> second. The emphasis is also more on coordinating producer and consumer than on high-volume
> capacity problems.
> 
> Your comment seems to suggest including larger time frames (e.g. the due-hour) in the
> partition keys and using the current time to select the active partitions (e.g. the shards of
> the hour). Once an hour has passed, the corresponding shards will never be touched again.
> 
> Am I understanding this correctly?
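A sketch of that time-bucketed sharding in CQL, with hypothetical table and column names (the hour encoding is just an illustrative choice):

```sql
-- Hypothetical schema: the due hour is folded into the partition key,
-- so each (hour, shard) pair is its own partition.
CREATE TABLE queue (
    due_hour text,            -- e.g. '2014-10-06T20'
    shard    int,
    due      timeuuid,
    payload  blob,
    PRIMARY KEY ((due_hour, shard), due)
);

-- Consumers only ever read the shards of the current hour. Partitions
-- for past hours still hold tombstones, but they are never queried
-- again, so those tombstones never have to be scanned.
SELECT due, payload FROM queue
WHERE due_hour = '2014-10-06T20' AND shard = 0;
```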
> 
> >
> > You may want to consider switching to LCS from the default STCS, since you are re-writing
> > the same partitions a lot. It will still use STCS in L0, so in high write/delete scenarios
> > with a low enough gc_grace, when it never gets higher than L1, write throughput will be about
> > the same. In scenarios where you get more data, I suspect LCS will shine by reducing the
> > number of obsolete tombstones. It would be hard to identify the difference in small tests, I think.
> 
> Thanks, I’ll try to explore the various effects.
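For reference, switching an existing table to LCS is a one-line schema change; the table name and the low `gc_grace_seconds` value here are illustrative, not prescriptive:

```sql
-- Switch compaction to LCS and lower gc_grace, per the suggestion
-- above; tombstones become eligible for purge after one hour.
ALTER TABLE queue
WITH compaction = { 'class' : 'LeveledCompactionStrategy' }
 AND gc_grace_seconds = 3600;
```

Note that lowering `gc_grace_seconds` is only safe if repairs run (or deletes can be lost) within that window.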
> 
> >
> > What's the plan to prevent two consumers from reading the same message off of a queue?
> > You mention in the docs that you will address it at a later point in time, but it's kind of
> > a big one. A big lock & batch reads, like the astyanax recipe?
> 
> I have included a static column per shard to act as a lock (the ’lock’ column in the
> examples) in combination with conditional updates.
> 
> I must admit, I have not quite understood what Netflix is doing in terms of coordination
> - but since performance isn’t our concern, CAS should do fine, I guess(?)
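The static-lock-plus-CAS approach could look roughly like this in CQL; the consumer id and TTL are assumptions for the sketch, not taken from the repo:

```sql
-- Hypothetical sketch: a static 'lock' column is shared by the whole
-- shard partition and claimed via a lightweight transaction (CAS).
CREATE TABLE queue (
    shard   int,
    due     timeuuid,
    payload blob,
    lock    text STATIC,      -- id of the consumer holding the shard
    PRIMARY KEY ((shard), due)
);

-- Try to take the lock; only one consumer's conditional update wins.
-- The TTL auto-releases the lock if the consumer dies mid-batch.
UPDATE queue USING TTL 60
SET lock = 'consumer-42'
WHERE shard = 0
IF lock = null;

-- Release after processing (also conditional, so a consumer whose
-- lock already expired cannot clobber a new owner's lock).
UPDATE queue SET lock = null
WHERE shard = 0
IF lock = 'consumer-42';
```

Since the CAS round only happens once per shard claim rather than per message, its Paxos overhead should be tolerable at low throughput.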
> 
> Thanks again,
> 
> Jan
> 
> 
> >
> > ---
> > Chris Lohfink
> >
> >
> > On Oct 5, 2014, at 6:03 PM, Jan Algermissen <jan.algermissen@nordsc.com> wrote:
> >
> >> Hi,
> >>
> >> I have put together some thoughts on realizing simple queues with Cassandra.
> >>
> >> https://github.com/algermissen/cassandra-ruby-queue
> >>
> >> The design is inspired by (the much more sophisticated) Netflix approach[1], but
> >> very reduced.
> >>
> >> Given that I am still a C* newbie, I’d be very glad to hear some thoughts on the
> >> design path I took.
> >>
> >> Jan
> >>
> >> [1] https://github.com/Netflix/astyanax/wiki/Message-Queue
> >
> 
> 

