qpid-users mailing list archives

From Fraser Adams <fraser.ad...@blueyonder.co.uk>
Subject Re: Qpid Java Broker with persistence supports only a finite number of messages?
Date Thu, 05 Jan 2012 18:30:37 GMT
Just to jump in on this thread.

Re "

but my opinion
is that if you have "millions of messages" then a Message Broker is the
wrong solution to your problem - you want a Database.

"
I can't say I agree with Rob's assertion here!!

Well, maybe that's a reasonable comment if the *intention* is to have 
millions of messages hanging around, but what if it's due to an 
unfortunate circumstance?

One classic scenario is connecting over a WAN: when the WAN goes down, 
messages build up. It's not what I want, but it's what will happen.

In my scenario I'm actually federating between C++ brokers, using queue 
routes and boxes with loads of memory so I can have big queues. I'm also 
using circular queues, because I don't want things to die if I 
eventually use up all of my capacity.

In the C++ broker flow to disk works OK, sort of, but you can't have it 
bigger than your available memory and also have things circular (TBH 
it's all a little untidy, as Gordon Sim will I'm sure tell you).
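
For anyone wanting to try the same sort of setup, this is roughly the 
shape of it. Broker addresses, queue names and sizes below are 
placeholders, and the exact option spellings may vary by version - check 
qpid-config --help and qpid-route --help on your installation:

    # Create a circular (ring) queue: once the size limit is hit, the
    # oldest messages are discarded rather than producers being blocked.
    qpid-config add queue big-ring \
        --max-queue-size=1000000000 --limit-policy=ring

    # flow-to-disk is an alternative limit policy, but as noted above
    # you can't combine it with ring behaviour:
    qpid-config add queue big-ftd \
        --max-queue-size=1000000000 --limit-policy=flow-to-disk

    # A queue route: subscribe to big-ring on the remote broker and
    # forward its messages to an exchange on the local one over the WAN.
    qpid-route queue add localhost:5672 remote-broker:5672 amq.fanout big-ring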

For my core use case performance is more important than (modest) message 
loss, so I'm not using persistence - my view is that the C++ broker is 
more reliable than the disk :-) - and if I see any issues I'm likely to 
federate to multiple load-balanced brokers on different power supplies, 
or even in different locations. To be fair, I'm not over keen on the 
eventual data loss I'll get if the WAN dies and I hit the limit of my 
circular queue, but my cunning plan is to make use of the QMF 
queueThresholdExceeded event (I don't know if that exists for the Java 
broker).

I've already written a Java QMF application that intercepts 
queueThresholdExceeded and uses it to trigger a QMF purge method on the 
queue, so the QMF client basically acts as a "fuse" to prevent slow 
consumers taking out message producers. I think it's likely to be pretty 
simple to extend this idea so that, rather than purging the queue, I 
redirect the messages to a queue that has a persisting consumer - in 
essence I'd only actually trigger a flow to disk when I have a slow 
consumer (or, in my case, a dead WAN).
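
In case it's useful to anyone, the "fuse" is roughly the shape below. 
This is only a sketch against the QMF2 console API - the class names, 
method signatures and the "qName" event argument are from memory, so 
treat them as assumptions to verify rather than gospel:

    import javax.jms.Connection;

    import org.apache.qpid.qmf2.common.QmfData;
    import org.apache.qpid.qmf2.common.QmfEvent;
    import org.apache.qpid.qmf2.common.QmfEventListener;
    import org.apache.qpid.qmf2.common.QmfException;
    import org.apache.qpid.qmf2.common.WorkItem;
    import org.apache.qpid.qmf2.console.Console;
    import org.apache.qpid.qmf2.console.EventReceivedWorkItem;
    import org.apache.qpid.qmf2.console.QmfConsoleData;

    public class QueueFuse implements QmfEventListener
    {
        private final Console console;

        public QueueFuse(Connection connection) throws QmfException
        {
            console = new Console(this);
            console.addConnection(connection);
        }

        public void onEvent(WorkItem wi)
        {
            if (!(wi instanceof EventReceivedWorkItem)) return;
            QmfEvent event = ((EventReceivedWorkItem)wi).getEvent();
            // queueThresholdExceeded carries the queue name as "qName"
            // (argument name from memory - check the event schema).
            if ("queueThresholdExceeded".equals(
                    event.getSchemaClassId().getClassName()))
            {
                purge(event.getStringValue("qName"));
            }
        }

        private void purge(String queueName)
        {
            try
            {
                // Look up the queue's management object and blow the
                // "fuse" by invoking its purge method (request == 0
                // meaning "purge everything").
                for (QmfConsoleData queue :
                        console.getObjects("org.apache.qpid.broker", "queue"))
                {
                    if (queueName.equals(queue.getStringValue("name")))
                    {
                        QmfData args = new QmfData();
                        args.setValue("request", 0);
                        queue.invokeMethod("purge", args);
                    }
                }
            }
            catch (QmfException e)
            {
                e.printStackTrace();
            }
        }
    }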

Frase

On 05/01/12 17:51, Praveen M wrote:
> That was really useful. Thanks for writing, Rob.
>
> On Wed, Jan 4, 2012 at 5:08 PM, Rob Godfrey<rob.j.godfrey@gmail.com>  wrote:
>
>> Robbie beat me to replying, and said mostly what I was going to say... but
>> anyway ...
>>
>> This design decision predates even my time on the project, but my opinion
>> is that if you have "millions of messages" then a Message Broker is the
>> wrong solution to your problem - you want a Database...  That being said,
>> I think there is a reasonable argument to be made that a broker should be
>> able to run in a low-memory environment, in which case you may want to be
>> able to swap out sections of the list structure.
>>
>> Fundamentally this would be quite a major change.  The internal queue
>> design is predicated on the list structure being in memory and being able
>> to apply lockless atomic operations to it.  From a performance point of
>> view it would also be potentially very tricky. There are a number of
>> processes within the broker which periodically scan the entire queue
>> (looking for expired messages and such), as well as the fact that (as
>> Robbie pointed out) many use cases are not strict FIFO (priority queues,
>> LVQs, selectors, etc).  And ultimately you are still going to be limited
>> by a finite resource (albeit a larger one). That being said, we should
>> definitely look at trying to reduce the per-message overhead.
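
(As an aside, "lockless atomic operations on the list structure" here 
means compare-and-swap style updates. A simplified illustration of the 
idea - the textbook Michael-Scott enqueue, not the actual Qpid code - 
looks like this in Java:)

    import java.util.concurrent.atomic.AtomicReference;

    // Simplified sketch of a lock-free linked-list enqueue, in the
    // spirit of the broker's in-memory queue-entry list.
    class QueueEntryList
    {
        static final class Entry
        {
            final Object message;
            final AtomicReference<Entry> next = new AtomicReference<Entry>();
            Entry(Object message) { this.message = message; }
        }

        private final Entry head = new Entry(null);            // sentinel
        private final AtomicReference<Entry> tail =
                new AtomicReference<Entry>(head);

        void enqueue(Object message)
        {
            Entry entry = new Entry(message);
            while (true)
            {
                Entry last = tail.get();
                // Link the new entry after the current tail...
                if (last.next.compareAndSet(null, entry))
                {
                    // ...then swing the tail pointer. No locks anywhere.
                    tail.compareAndSet(last, entry);
                    return;
                }
                // Lost a race with another producer: help advance the
                // tail, then retry.
                tail.compareAndSet(last, last.next.get());
            }
        }
    }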
>>
>> At some point soon I hope to be looking at implementing a flow-to-disk
>> policy for messages within queues, to protect the broker from
>> out-of-memory situations with transient messages (as well as improving on
>> the rather drastic SoftReference-based method currently employed).  The
>> main issue, though, is that the reason you need to flow to disk is that
>> your producers are producing faster than your consumers can consume...
>> and pushing messages (or, worse, queue structure too) to disk is only
>> going to slow your consumers down even more - likely making the problem
>> even worse.
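
(For reference, the SoftReference approach Rob mentions boils down to 
something like the sketch below - the general idea only, not the 
broker's actual code, with loadFromStore() as a hypothetical stand-in 
for a message-store read:)

    import java.lang.ref.SoftReference;

    // The queue keeps a hard reference to the small entry object, while
    // the (large) message body sits behind a SoftReference that the GC
    // may reclaim under memory pressure and that we re-read on demand.
    class MessageReference
    {
        private final long messageId;
        private SoftReference<byte[]> body;

        MessageReference(long messageId, byte[] data)
        {
            this.messageId = messageId;
            this.body = new SoftReference<byte[]>(data);
        }

        synchronized byte[] getBody()
        {
            byte[] data = body.get();
            if (data == null)                     // cleared by the GC
            {
                data = loadFromStore(messageId);
                body = new SoftReference<byte[]>(data);
            }
            return data;
        }

        private byte[] loadFromStore(long id)
        {
            // Hypothetical: read the persisted message body back in.
            return new byte[0];
        }
    }

(The "drastic" part is that the JVM typically only clears SoftReferences 
as a last resort as it approaches exhaustion, so the heap tends to sit 
near its limit rather than degrading gracefully.)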
>>
>> Cheers,
>> Rob
>>
>> On 4 January 2012 22:09, Praveen M<lefthandmagic@gmail.com>  wrote:
>>
>>> Thanks for the explanation Robbie.
>>>
>>> On Wed, Jan 4, 2012 at 1:12 PM, Robbie Gemmell<robbie.gemmell@gmail.com>  wrote:
>>>> Hi Praveen,
>>>>
>>>> I can only really guess at any design decision on that front, as it
>>>> would have been before my time with the project, but I'd say it's
>>>> likely just that way because there's never been a strong need / use
>>>> case that actually required doing anything else. For example, with
>>>> most of the users I liaise with, the data they are using has at least
>>>> some degree of time sensitivity to it, and having anywhere near that
>>>> volume of persistent data in the broker would represent some sort of
>>>> ongoing period of catastrophic failure in their application. I can
>>>> only really think of one group who make it into multi-million-message
>>>> backlogs at all, and that usually includes having knowingly published
>>>> things which no one will ever consume.
>>>>
>>>> For a FIFO queue you are correct: it would 'just' need to load in more
>>>> as required. Things get trickier when dealing with some of the other
>>>> queue types, however, such as LVQ/conflation and the recently added
>>>> Sorted queue types. Making the broker able to hold partial segments of
>>>> the queue in memory is something we have discussed doing in the past
>>>> for other reasons, but message volume hasn't really been a significant
>>>> factor in those considerations until now. I will take note of it for
>>>> any future work we do in that area though.
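
(To illustrate why LVQ/conflation resists partial paging: the queue has 
to track the latest entry per key across its whole depth. A toy sketch, 
not Qpid code:)

    import java.util.Iterator;
    import java.util.LinkedHashMap;
    import java.util.Map;

    // A new message supersedes any older message carrying the same key,
    // wherever that older message sits in the queue. The key -> entry
    // index therefore spans the *entire* queue, which is what makes
    // holding only a partial segment of it in memory awkward.
    class LastValueQueue<V>
    {
        private final Map<String, V> entries = new LinkedHashMap<String, V>();

        synchronized void put(String key, V message)
        {
            // Conflate: an existing key keeps its queue position, but
            // its value is replaced by the latest message.
            entries.put(key, message);
        }

        synchronized V poll()
        {
            Iterator<Map.Entry<String, V>> it = entries.entrySet().iterator();
            if (!it.hasNext()) return null;
            V value = it.next().getValue();
            it.remove();
            return value;
        }
    }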
>>>>
>>>> Robbie
>>>>
>>>> On 1 January 2012 17:46, Praveen M<lefthandmagic@gmail.com>  wrote:
>>>>> Hi,
>>>>>
>>>>> I was digging in the code base, trying to understand how the broker
>>>>> is implemented. I see that for each message enqueued there are
>>>>> certain objects kept in memory, one per message.
>>>>>
>>>>> example: MessageTransferReference, SimpleQueueEntryImpl etc.
>>>>>
>>>>> I tried computing the memory footprint of each individual message
>>>>> and it amounts to about 320 bytes/message. Because of that
>>>>> per-message footprint, if I'm limited to 4GB of memory then I am
>>>>> limited to only about 13 million messages in the system at one point
>>>>> (4 GiB / 320 bytes per message is roughly 13.4 million messages).
>>>>>
>>>>> Since I'm using a persistent store, I'd have expected to go past 13
>>>>> million messages and be limited by the disk store rather than by
>>>>> physical memory, but I realized this isn't the case.
>>>>>
>>>>> I am curious as to what the driving points were for this design
>>>>> decision to keep a reference to every message in memory. I'd have
>>>>> expected that in a FIFO queue you just need a subset of messages in
>>>>> memory and can pull in messages on demand, rather than maintaining a
>>>>> reference to every message in memory.
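
(For illustration, the kind of on-demand paging Praveen describes might 
look roughly like this - a hypothetical sketch, with MessageStore as an 
assumed interface rather than a real Qpid class:)

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.List;

    // Only a bounded window at the head of the queue is held in memory;
    // the remainder lives in the persistent store (where enqueues are
    // written) and is paged in on demand as consumers drain the head.
    interface MessageStore
    {
        List<byte[]> readPage(long fromSequence, int maxCount);
    }

    class PagedFifoQueue
    {
        private static final int WINDOW = 10000;
        private final MessageStore store;
        private final Deque<byte[]> head = new ArrayDeque<byte[]>();
        private long nextSequence;        // next sequence still on disk

        PagedFifoQueue(MessageStore store) { this.store = store; }

        synchronized byte[] poll()
        {
            if (head.isEmpty())
            {
                // Page the next batch of messages in from the store.
                List<byte[]> page = store.readPage(nextSequence, WINDOW);
                head.addAll(page);
                nextSequence += page.size();
            }
            return head.poll();
        }
    }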
>>>>>
>>>>> Can someone please explain the reasons for this design? Also, was
>>>>> it assumed that we'd never flood the queues past 13 million messages
>>>>> at one time? Was there a bound decided upon?
>>>>>
>>>>> Thank you,
>>>>> Praveen
>>>>>
>>>>>
>>>>> --
>>>>> -Praveen
>>>
>>> --
>>> -Praveen
>>>
>
>


---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:users-subscribe@qpid.apache.org

