activemq-dev mailing list archives

From Rob Davies <>
Subject Re: [IDEA] great aggregation without a database, or performing reliable map/reduce using Message Groups...
Date Mon, 14 Jan 2008 16:36:42 GMT
I like the concept, but I'm worried about the implementation and the
failover edge cases :)

On Jan 14, 2008, at 3:45 PM, Hiram Chirino wrote:

> I like it :)
> On Jan 13, 2008 11:24 AM, Guillaume Nodet <> wrote:
>> That would be really awesome :-)
>> Aggregating messages in a reliable and efficient way is always a pain,
>> and having such a feature would handle most of the use cases imho.
>> For long running transactions, we might need the 'dispatch when the
>> message group is complete' feature.  Even more complex would be to
>> maybe use some kind of selector so that the broker knows when the
>> message group is complete without relying on the number of messages.
>> For example, you could use a time-based expiration so that the broker
>> sends all the messages when the expiration date has been reached or
>> even combine predicates using the selector syntax.
>> But I guess this is really more advanced stuff that we don't really
>> need right now ... ;-)
>> On Jan 13, 2008 5:12 PM, James Strachan <> wrote:
>>> ... I just wanted to explain an awesome idea that popped out of a bit
>>> of brainstorming while sat at breakfast with Guillaume today.
>>> Aggregating messages reliably, over long periods of time in a high
>>> performance way is kinda sucky right now. Either you use batches like
>>> the default Camel Aggregator (which is a bit sucky) or you have to
>>> use a database (and then hit 2 phase commit type issues).
>>> The problem basically is that you want a consumer to process an entire
>>> group of messages, in order, in a single transaction to avoid having
>>> to use persistence; if the consumer fails, the entire group of
>>> messages needs to be redelivered (maybe to another consumer) in full -
>>> otherwise you have to use persistence to maintain partial state.
>>> Also you only want a single consumer to get a single group of messages
>>> at once, as the consumer can only commit a single group of messages at
>>> a time (it can't interleave them).
>>> So it's kinda like the consumer wants to do a selector that kinda says
>>> 'send me one complete message group including the last closing message
>>> of the sequence, in order please'.
>>> At first we pondered about selectors - then we had the idea of having
>>> a kind of 'exclusive message group'; namely that using Message
>>> Groups...
>>> we could maybe limit the consumer to only support one message group
>>> at once until it's closed, then it can consume another message group.
>>> This in itself is a pretty good solution that could work today fairly
>>> easily (we just need the chooser code associating a new message group
>>> to a consumer to ignore consumers that already have a message group
>>> associated with them if they have 'exclusive message group' enabled).
>>> The main downside with this approach is that a single consumer can
>>> then get locked by a single message group, that could span hours,  
>>> days
>>> or weeks - unable to process any more messages until finally the
>>> message group is completed.
>>> So what would be really nice is if we supported the JMSXGroupSeq
>>> header (for sequence numbers within a message group) and made it
>>> possible to not dispatch any messages within a message group until
>>> the sequence is complete; so they'd stay around in the broker until
>>> the sequence can be processed, in one brief amount of time, by a
>>> single consumer. Also we should support reordering of messages within
>>> the sequence if the messages get out of order. Then folks could do
>>> long term aggregation using purely ActiveMQ in a nice high
>>> performance way!
>>> Another added benefit would be folks could do a totally asynchronous,
>>> loosely coupled and reliable Map/Reduce pattern using purely ActiveMQ.
>>> e.g. we can split a single message using the splitter, using the
>>> message ID as the JMSXGroupID, and for each child message we assign a
>>> JMSXGroupSeq until, with the last message, we close the sequence.
>>> Then each message can be processed by any consumer in a grid, sending
>>> replies back to another queue using the same JMSXGroupID and
>>> JMSXGroupSeq. Then when all the messages are received and the
>>> sequence is complete, the entire sequence of response messages is
>>> sent to a single consumer, which then commits its transaction when
>>> the last message in the sequence is processed. All without any
>>> explicit persistence - yet the whole thing would be totally
>>> transactional and reliable! Not bad eh?
>>> As a first stab, it'd be nice to support the 'exclusive message group'
>>> idea which seems pretty easy to do; as then at least we'd have an
>>> awesome solution for Map/Reduce scenarios which complete within a
>>> relatively short amount of time; we'd only need the more advanced
>>> 'dispatch when the message group is complete' option for dealing with
>>> very long running Map/Reduce problems - which are probably fairly
>>> rare.
>>> Thoughts?
>>> --
>>> James
>>> -------
>>> Open Source Integration
>> --
>> Cheers,
>> Guillaume Nodet
>> ------------------------
>> Blog:
> -- 
> Regards,
> Hiram
> Blog:
> Open Source SOA
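
For anyone trying to picture the 'exclusive message group' chooser rule from James's original mail, here is a minimal sketch in plain Java. It is not ActiveMQ code: the class ExclusiveGroupChooser, its methods, and the use of plain strings for group and consumer IDs are all hypothetical names made up for illustration. The only rule it models is the one described in the quoted text: a new group skips any consumer that already owns a group, and a consumer becomes available again once its group is closed.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these types exist in ActiveMQ.
class ExclusiveGroupChooser {
    // groupId -> consumer that currently owns the group
    private final Map<String, String> ownerByGroup = new HashMap<>();
    // consumer -> the single group it is locked to
    private final Map<String, String> groupByConsumer = new HashMap<>();

    /**
     * Picks the consumer for a message in the given group.
     * Returns null when the group is new and every consumer is busy,
     * in which case the message would have to wait in the broker.
     */
    String choose(String groupId, List<String> consumers) {
        String owner = ownerByGroup.get(groupId);
        if (owner != null) {
            return owner; // group already bound: stay with the same consumer
        }
        for (String consumer : consumers) {
            // The 'exclusive' rule: ignore consumers that already own a group.
            if (!groupByConsumer.containsKey(consumer)) {
                ownerByGroup.put(groupId, consumer);
                groupByConsumer.put(consumer, groupId);
                return consumer;
            }
        }
        return null;
    }

    /** Called once the closing message of the group has been consumed. */
    void close(String groupId) {
        String owner = ownerByGroup.remove(groupId);
        if (owner != null) {
            groupByConsumer.remove(owner);
        }
    }
}
```

With two consumers, groups A and B each get one consumer, a third group gets null until A or B is closed. The sketch ignores consumer failure and broker failover, which are exactly the edge cases raised in the reply above.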

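The 'dispatch when the message group is complete' idea, including reordering within the sequence, can be sketched the same way. Again this is plain Java under stated assumptions, not broker code: GroupSequenceBuffer and Msg are hypothetical names, sequence numbers are assumed to run 1..N without gaps, and the closing message is assumed to carry a 'last' flag (the thread does not say how closure would be signalled). Messages are held per group ID and released as one ordered batch only when every sequence number up to the closing one has arrived.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these types exist in ActiveMQ.
class GroupSequenceBuffer {
    /** Stand-in for a message carrying JMSXGroupID / JMSXGroupSeq style headers. */
    static final class Msg {
        final String groupId;
        final int seq;       // assumed 1-based position within the group
        final boolean last;  // assumed marker on the closing message
        final String body;

        Msg(String groupId, int seq, boolean last, String body) {
            this.groupId = groupId;
            this.seq = seq;
            this.last = last;
            this.body = body;
        }
    }

    // Per group: messages keyed by sequence number, so iteration is in order.
    private final Map<String, TreeMap<Integer, Msg>> pending = new HashMap<>();
    // Per group: total size, known only once the closing message has arrived.
    private final Map<String, Integer> expectedSize = new HashMap<>();

    /**
     * Buffers the message. Returns the whole group in sequence order once it
     * is complete, otherwise an empty list (nothing is dispatched yet).
     */
    List<Msg> offer(Msg msg) {
        TreeMap<Integer, Msg> group =
                pending.computeIfAbsent(msg.groupId, k -> new TreeMap<>());
        group.put(msg.seq, msg);
        if (msg.last) {
            expectedSize.put(msg.groupId, msg.seq);
        }
        Integer size = expectedSize.get(msg.groupId);
        if (size == null || group.size() < size) {
            return new ArrayList<>(); // still waiting for more of the sequence
        }
        pending.remove(msg.groupId);
        expectedSize.remove(msg.groupId);
        return new ArrayList<>(group.values());
    }
}
```

In the Map/Reduce scenario from the quoted mail, the replies from the grid would be offered to such a buffer in whatever order they arrive, and the single reducing consumer would see them only as one complete, ordered batch. The sketch keeps everything in memory; a real broker would need the held messages to survive restarts, which this does not attempt.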