activemq-dev mailing list archives

From "Guillaume Nodet" <gno...@gmail.com>
Subject Re: [IDEA] great aggregation without a database, or performing reliable map/reduce using Message Groups...
Date Sun, 13 Jan 2008 16:24:36 GMT
That would be really awesome :-)

Aggregating messages in a reliable and efficient way is always a pain
and having such a feature would handle most of the use cases imho.

For long running transactions, we might need the 'dispatch when the
message group is complete' feature.  Even more complex would be to
maybe use some kind of selector so that the broker knows when the
message group is complete without relying on the number of messages.
For example, you could use a time based expiration so that the broker
sends all the messages when the expiration date has been reached or
even combine predicates using the selector syntax.
But I guess this is really more advanced stuff that we don't really need
right now ... ;-)
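
To make the completion idea a bit more concrete, here is a minimal sketch in
plain Java. It is not ActiveMQ code: GroupBuffer, Completion and the explicit
"closesGroup" flag are all hypothetical names made up for illustration, and
it assumes sequence numbers start at 1. It shows a per-group buffer that
reorders by sequence number, plus completion rules (sequence closed without
gaps, or group older than some age) that can be combined.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical broker-side buffer for one message group (illustration only).
final class GroupBuffer {
    // Keyed by sequence number, so iteration is always in sequence order
    // even if messages arrive out of order.
    private final TreeMap<Integer, String> bySeq = new TreeMap<>();
    private final long createdAtMillis;
    private int lastSeq = -1; // unknown until the closing message arrives

    GroupBuffer(long createdAtMillis) {
        this.createdAtMillis = createdAtMillis;
    }

    // Assumption: the sender marks the closing message explicitly.
    void add(int seq, String body, boolean closesGroup) {
        bySeq.put(seq, body);
        if (closesGroup) {
            lastSeq = seq;
        }
    }

    // True once the closing message is known and 1..lastSeq are all present.
    boolean isClosedWithoutGaps() {
        return lastSeq > 0
                && bySeq.size() == lastSeq
                && bySeq.firstKey() == 1
                && bySeq.lastKey() == lastSeq;
    }

    long createdAtMillis() {
        return createdAtMillis;
    }

    List<String> inOrder() {
        return new ArrayList<>(bySeq.values());
    }
}

// Hypothetical completion rule; rules can be combined with or().
interface Completion {
    boolean isComplete(GroupBuffer group, long nowMillis);

    default Completion or(Completion other) {
        return (g, now) -> this.isComplete(g, now) || other.isComplete(g, now);
    }

    // Complete when the sequence is closed and has no gaps.
    static Completion sequenceClosed() {
        return (g, now) -> g.isClosedWithoutGaps();
    }

    // Complete when the group has been buffered for at least maxAgeMillis.
    static Completion olderThan(long maxAgeMillis) {
        return (g, now) -> now - g.createdAtMillis() >= maxAgeMillis;
    }
}
```

A broker could then evaluate something like
Completion.sequenceClosed().or(Completion.olderThan(60000)) on each buffered
group and dispatch inOrder() to a single consumer once it returns true. The
current time is passed in explicitly only to keep the sketch easy to test.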

On Jan 13, 2008 5:12 PM, James Strachan <james.strachan@gmail.com> wrote:

> ... I just wanted to explain an awesome idea that popped out of a bit
> of brainstorming while sat at breakfast with Guillaume today.
> Aggregating messages reliably, over long periods of time in a high
> performance way is kinda sucky right now. Either you use batches like
> the default Camel Aggregator
> http://activemq.apache.org/camel/aggregator.html
>
> (which is a bit sucky) or you have to use a database (and then hit 2
> phase commit type issues).
>
> The problem basically is that you want a consumer to process an entire
> group of messages, in order, in a single transaction to avoid having
> to use persistence; if they fail the entire group of messages needs to
> be redelivered (maybe to another consumer) in full - otherwise you
> have to use persistence to maintain partial state.
>
> Also you only want a single consumer to get a single group of messages
> at once as the consumer can only commit a single group of messages at
> a time (it can't interleave them).
>
> So it's kinda like the consumer wants to do a selector that kinda says
> 'send me one complete message group including the last closing message
> of the sequence, in order please'.
>
> At first we pondered about selectors - then we had the idea of having
> a kind of 'exclusive message group'; namely that using Message
> Groups...
> http://activemq.apache.org/message-groups.html
>
> we could maybe limit the consumer to only support one message group at
> once until it's closed, then it can consume another message group. This
> in itself is a pretty good solution that could work today fairly
> easily (we just need the chooser code associating a new message group
> to a consumer to ignore consumers that already have a message group
> associated with them if they have 'exclusive message group' enabled).
>
> The main downside with this approach is that a single consumer can
> then get locked by a single message group, that could span hours, days
> or weeks - unable to process any more messages until finally the
> message group is completed.
>
> So what would be really nice is if we supported the
> JMSXGroupSeq header (for sequence numbers within a message group) and
> made it possible to not dispatch any messages within a message group,
> until the sequence is complete; so they'd stay around in the broker
> until the sequence can be processed, in one brief amount of time by a
> single consumer. Also we should support reordering of messages within
> the sequence as well if the messages get out of order. Then folks
> could do long term aggregation using purely ActiveMQ in a nice high
> performance way!
>
> Another added benefit would be folks could do a totally asynchronous,
> loosely coupled and reliable Map/Reduce pattern using purely ActiveMQ.
> e.g. we can split a single message using the splitter
> http://activemq.apache.org/camel/splitter.html
>
> using the message ID as the JMSXGroupID and for each child message we
> assign a JMSXGroupSeq until, at the last message, we close the sequence.
> Then each message can be processed by any consumer in a grid, sending
> replies back to another queue using the same JMSXGroupID and
> JMSXGroupSeq. Then when all the messages are received and the sequence
> is complete, the entire sequence of response messages is sent to a
> single consumer; who then commits its transaction when the last
> message in the sequence is processed. All without any explicit
> persistence - yet the whole thing would be totally transactional and
> reliable! Not bad eh?
>
> As a first stab, it'd be nice to support the 'exclusive message group'
> idea which seems pretty easy to do; as then at least we'd have an
> awesome solution for Map/Reduce scenarios which complete within a
> relatively short amount of time; we'd only need the more advanced
> 'dispatch when the message group is complete' option for dealing with
> very long running Map/Reduce problems - which are probably fairly
> rare.
>
> Thoughts?
> --
> James
> -------
> http://macstrac.blogspot.com/
>
> Open Source Integration
> http://open.iona.com
>
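
The 'exclusive message group' chooser described in the quoted mail could look
roughly like the sketch below. It is plain Java, not the actual ActiveMQ
dispatch code: ExclusiveGroupChooser, its methods and the use of string ids
for consumers are hypothetical. The only rule it implements is the one from
the mail: a new group is never assigned to a consumer that already owns an
open group.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical chooser that gives each consumer at most one open group.
final class ExclusiveGroupChooser {
    private final List<String> consumerIds;
    // Open group id -> id of the consumer that owns it.
    private final Map<String, String> ownerByGroup = new HashMap<>();

    ExclusiveGroupChooser(List<String> consumerIds) {
        this.consumerIds = new ArrayList<>(consumerIds);
    }

    // Returns the consumer for this group, or empty if every consumer is
    // already busy with another open group (the message stays on the queue).
    Optional<String> choose(String groupId) {
        String owner = ownerByGroup.get(groupId);
        if (owner != null) {
            return Optional.of(owner); // group already assigned: stay sticky
        }
        for (String consumer : consumerIds) {
            // Skip consumers that already own some open group.
            if (!ownerByGroup.containsValue(consumer)) {
                ownerByGroup.put(groupId, consumer);
                return Optional.of(consumer);
            }
        }
        return Optional.empty();
    }

    // Called when the group is closed, which frees its consumer.
    void close(String groupId) {
        ownerByGroup.remove(groupId);
    }
}
```

With two consumers, groups A and B each get their own consumer, a third
group C has to wait, and C is assigned as soon as A or B is closed. That
waiting is exactly the downside mentioned in the mail when a group stays open
for days.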



-- 
Cheers,
Guillaume Nodet
------------------------
Blog: http://gnodet.blogspot.com/
