incubator-cassandra-user mailing list archives

From Vram Kouramajian <vram.kouramaj...@gmail.com>
Subject Re: Distributed work-queues?
Date Sat, 26 Jun 2010 22:24:00 GMT
We have implemented a distributed queue (similar to AWS SQS) and a job
queue in Cassandra.

Vram


On Sat, Jun 26, 2010 at 1:56 PM, Andrew Miklas <andrew@pagerduty.com> wrote:

> Hi all,
>
> Has anyone written a work-queue implementation using Cassandra?
>
> There's a section in the UseCases wiki page for "A distributed Priority Job
> Queue" which looks perfect, but unfortunately it hasn't been filled in yet.
> http://wiki.apache.org/cassandra/UseCases#A_distributed_Priority_Job_Queue
>
> I've been thinking about how best to do this, but every solution I've
> thought of seems to have some serious drawback.  The "range ghost" problem
> in particular creates some issues.  I'm assuming each job has a row within
> some column family, where the row's key is the time at which the job should
> be run.  To find the next job, you'd do a range query with a start a few
> hours in the past, and an end at the current time.  Once a job is completed,
> you delete the row.
>
> The problem here is that you have to scan through deleted-but-not-yet-GCed
> rows each time you run the query.  Is there a better way?
>
> Preventing more than one worker from starting the same job seems like it
> would be a problem too.  You'd either need an external locking manager, or
> have to use some other protocol where workers write their ID into the row
> and then immediately read it back to confirm that they are the owner of the
> job.
>
> Any ideas here?  Has anyone come up with a nice implementation?  Is
> Cassandra not well suited for queue-like tasks?
>
>
>
> Thanks,
>
>
> Andrew
>
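
To make the "range ghost" cost described above concrete, here is a minimal
in-memory sketch. It is not Cassandra code: TimeKeyedQueue, its method names,
and the explicit gc() step are all hypothetical stand-ins, assuming one row per
job keyed by run time and deletes that leave a tombstone until a later cleanup.

```python
import bisect


class TimeKeyedQueue:
    """Toy stand-in for a column family whose row key is the job's run time."""

    def __init__(self):
        self._keys = []   # sorted row keys (scheduled run times)
        self._rows = {}   # key -> job payload, or None for a tombstone

    def schedule(self, run_at, job):
        if run_at not in self._rows:
            bisect.insort(self._keys, run_at)
        self._rows[run_at] = job

    def complete(self, run_at):
        # Deleting only marks the row; the key stays until gc() runs.
        self._rows[run_at] = None

    def due_jobs(self, start, end):
        """Range scan over [start, end]; returns (live jobs, rows scanned)."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_right(self._keys, end)
        scanned = self._keys[lo:hi]
        live = [(k, self._rows[k]) for k in scanned if self._rows[k] is not None]
        return live, len(scanned)

    def gc(self):
        # Simulated cleanup: drop tombstoned keys for good.
        self._keys = [k for k in self._keys if self._rows[k] is not None]
        self._rows = {k: self._rows[k] for k in self._keys}


q = TimeKeyedQueue()
for t in range(100):
    q.schedule(t, f"job-{t}")
for t in range(99):
    q.complete(t)          # 99 finished jobs become tombstones

live, scanned = q.due_jobs(0, 100)
q.gc()
live_after, scanned_after = q.due_jobs(0, 100)
print(len(live), scanned, len(live_after), scanned_after)
```

In this toy model the scan before gc() walks every tombstoned key to find the
one live job, which is the overhead the question is asking how to avoid.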

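The write-your-ID-then-read-it-back idea from the quoted message can be
sketched the same way. This assumes a toy last-write-wins row where the
highest timestamp wins; LastWriteWinsRow and try_claim are hypothetical names,
and the model ignores replication and consistency levels entirely.

```python
class LastWriteWinsRow:
    """Toy row with a single 'owner' column resolved by highest timestamp."""

    def __init__(self):
        self._owner = None
        self._ts = -1

    def write_owner(self, worker_id, ts):
        # Assumption: the write with the larger timestamp wins.
        if ts > self._ts:
            self._owner, self._ts = worker_id, ts

    def read_owner(self):
        return self._owner


def try_claim(row, worker_id, ts):
    """Write our ID, then read back and check that we are the owner."""
    row.write_owner(worker_id, ts)
    return row.read_owner() == worker_id


# Interleaving 1: both workers write before either reads back.
row = LastWriteWinsRow()
row.write_owner("worker-a", ts=1)
row.write_owner("worker-b", ts=2)
a_wins = row.read_owner() == "worker-a"
b_wins = row.read_owner() == "worker-b"

# Interleaving 2: worker A finishes its write and read-back before B starts.
row2 = LastWriteWinsRow()
a_claims = try_claim(row2, "worker-a", ts=1)
b_claims = try_claim(row2, "worker-b", ts=2)
print(a_wins, b_wins, a_claims, b_claims)
```

In the first interleaving exactly one worker sees itself as owner. In the
second, both read-backs succeed, so under these assumptions the check alone
does not stop two workers from starting the same job; it would need something
extra, such as the external lock manager mentioned above.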