Since Twitter is everyone's favorite analogy:
It's like Twitter, but faster and with bigger messages that I may need to go back and replay later to mine for more details.
Thus I call it a queue, because the order of messages is important, but it's not anything like a message broker/pub-sub/topic/etc...

-JD



On Thu, Apr 1, 2010 at 9:43 PM, Jeremy Davis <jerdavis.cassandra@gmail.com> wrote:

You are correct, it is not a queue in the classic sense... I'm storing the entire "conversation" with a client in perpetuity, and then playing it back in the order received.

RabbitMQ/ActiveMQ etc. all have about the same throughput (3-6K persistent messages/sec), and are not good for storing the conversation forever... Also, I can easily scale Cassandra past that message rate and not have to worry about which message broker/cluster I'm connecting to/has the conversation/etc.




On Thu, Apr 1, 2010 at 7:02 PM, Keith Thornhill <keith@raptr.com> wrote:
you mention never deleting from the queue, so what purpose is this
serving? (if you don't pop off the front, is it really a queue?)

seems if guaranteed order of messages is required, there are many
other projects which are focused on that problem (RabbitMQ,
Kestrel, ActiveMQ, etc.)

or am i misunderstanding your needs here?

-keith

On Thu, Apr 1, 2010 at 6:32 PM, Jeremy Davis
<jerdavis.cassandra@gmail.com> wrote:
> I'm in the process of implementing a Totally Ordered Queue in Cassandra, and
> wanted to bounce my ideas off the list and also see if there are any other
> suggestions.
>
> I've come up with an external source of ID's that are always increasing (but
> not necessarily consecutive), and I've also used external synchronization to
> ensure only one writer to a given queue. And I handle de-duping in the app.
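The app-side de-duping mentioned above could look something like this high-water-mark sketch (hypothetical names; it assumes messages are replayed in ID order, which holds here since the IDs always increase and reads are ordered):

```python
# Track the highest external ID already processed; anything at or
# below it is a duplicate or an already-replayed message.
class DedupingReader:
    def __init__(self):
        self.last_seen = -1  # no messages processed yet

    def accept(self, msg_id):
        # Returns True the first time an ID is seen, False for
        # duplicates or anything at/below the high-water mark.
        if msg_id <= self.last_seen:
            return False
        self.last_seen = msg_id
        return True

r = DedupingReader()
results = [r.accept(m) for m in (5, 7, 7, 6, 9)]
# -> [True, True, False, False, True]
```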
>
>
> My current solution is : (simplified)
>
> Use the "QueueId", to Key into a row of a CF.
> Then, every column in that CF corresponds to a new entry in the Queue, with
> a custom Comparator to sort the columns by my external ID that is always
> increasing.
>
> Technically I never delete data from the Queue, and I just page through it
> from a given ID using a SliceRange, etc.
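A minimal in-memory model of the layout described above: one row per queue, one column per message, columns sorted by the external ID, paged with a SliceRange-style call. All names here are hypothetical; this is not actual Cassandra/Thrift client code.

```python
import bisect

class QueueRow:
    """Models one CF row: column name = external msg ID, column value = payload."""

    def __init__(self):
        self._ids = []      # sorted external message IDs (column names)
        self._values = {}   # id -> message payload (column values)

    def append(self, msg_id, payload):
        # Mirrors the custom comparator: columns stay sorted by msg_id.
        bisect.insort(self._ids, msg_id)
        self._values[msg_id] = payload

    def slice_range(self, start_id, count):
        # Analogous to a SliceRange: return up to `count` messages,
        # starting at the first ID >= start_id.
        i = bisect.bisect_left(self._ids, start_id)
        return [(mid, self._values[mid]) for mid in self._ids[i:i + count]]

q = QueueRow()
for mid in (10, 17, 23, 41, 52):
    q.append(mid, "msg-%d" % mid)
page = q.slice_range(18, 2)
# -> [(23, 'msg-23'), (41, 'msg-41')]
```

A reader replays the queue by repeating the slice call, each time starting just past the last ID of the previous page.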
>
> Obviously the problem being that the row needs to get compacted, so then I
> started bucketizing with multiple rows for a given queue, for example one
> per day (again, I'm simplifying)... (so the key is now "QueueId+Day"...)
>
> Does this seem reasonable? It's solvable, but it's starting to seem
> complicated to implement... It would be very easy if I didn't have to have
> multiple buckets...
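The bucketized scheme above can be sketched as follows; the reader walks the day buckets in order, paging within each. The helper names and the one-day bucket width are illustrative only:

```python
import datetime

def bucket_key(queue_id, day):
    # One row per queue per day, e.g. "42:2010-04-01"
    return "%d:%s" % (queue_id, day.isoformat())

def bucket_keys(queue_id, start_day, end_day):
    # All row keys a reader must visit to replay [start_day, end_day].
    days = (end_day - start_day).days
    return [bucket_key(queue_id, start_day + datetime.timedelta(d))
            for d in range(days + 1)]

keys = bucket_keys(42, datetime.date(2010, 3, 31), datetime.date(2010, 4, 2))
# -> ['42:2010-03-31', '42:2010-04-01', '42:2010-04-02']
```

This is where the complexity comes from: the reader has to know which buckets exist and stitch pages together across bucket boundaries.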
>
>
>
> My other thought is to store one entry per row, and perform get_range_slices
> and specify a KeyRange, with the OrderPreservingPartitioner.
> But it isn't exactly clear to me what the order of the keys is in this
> system, so I don't know how to construct my keys and queries appropriately...
> Is this lexical string order? Or?
>
> So for example.. Assuming my QueueId's are longs, and my ID's are also
> longs.. My key would be (in Java):
>
> long queueId;
> long msgId;
>
> key = "" + queueId + ":" + msgId;
>
> And if I wanted to do a query my key range might be from
> start = "" + queueId + ":0"
> end = "" + queueId + ":" + Long.MAX_VALUE;
>
> (Will I have to left-pad the msgIds with 0's?)
>
> And is this going to be efficient if my msgId isn't monotonically
> increasing?
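On the padding question: assuming the partitioner compares keys as strings, lexical order differs from numeric order for unpadded numbers, so yes, left-padding is needed. A quick sketch (plain Python string sorting standing in for the partitioner's key order):

```python
queue_id = 7
msg_ids = [2, 10, 9, 100]

# Unpadded keys sort lexically, not numerically:
unpadded = sorted("%d:%d" % (queue_id, m) for m in msg_ids)
# -> ['7:10', '7:100', '7:2', '7:9']   (wrong replay order)

# Pad msg_id to the full width of a 64-bit long (19 decimal digits),
# so string order matches numeric order:
padded = sorted("%d:%019d" % (queue_id, m) for m in msg_ids)
padded_order = [int(s.split(":")[1]) for s in padded]
# -> [2, 9, 10, 100]   (numeric order restored)
```

The queueId portion would need the same fixed-width treatment if queues themselves must range-scan cleanly.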
>
> Thanks,
> -JD