qpid-users mailing list archives

From Fraser Adams <fraser.ad...@blueyonder.co.uk>
Subject Re: C++ broker memory leak in federated set-up???
Date Wed, 14 Mar 2012 18:45:27 GMT
On 12/03/12 23:52, William Henry wrote:
> Are you using acknowledgements on your federated bridge?
> That's what I'm wondering.  You mention one exchange and one queue but
> with federation involved there is more going on.  Which queue is the
> 1GB ring queue? I assume the final queue in the federation. Therefore
> I too wonder if the origin broker is filling up because it's not getting
> acks that the messages have been consumed.
>
> ??
Hi Gordon & William.
As far as I'm aware we're not using acknowledgements. What I mean is 
that I believe the default with qpid-route is no acks; I think 
it sends messages via an unreliable link. I thought that you had to 
enable acks explicitly with the --ack= option in qpid-route. Have I 
missed something?

Re "You mention one exchange and one queue but with federation involved 
there is more going on. Which queue is the 1GB ring queue? I assume the 
final queue in the federation": I meant the queue on the source broker, 
which is the one causing the pain. It's co-located with the producer.

In theory it's quite a simple set-up. We have a producer that generates 
data and populates headers describing the different types of data it 
produces. It publishes to amq.match on its local broker. The local 
broker has a 1GB ring queue and a simple x-match: all binding to deliver 
all of the messages to that queue. We then have a queue route 
established to our "core" broker. That destination broker has a number 
of consumers, each subscribing to some subset of the data that the 
producer is generating, and each consumer has a 2GB ring queue.
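
To make the source-broker side of this concrete, here is a minimal sketch in Python. It is a simplified model, not Qpid's implementation: headers_match and RingQueue are hypothetical names, and the ring policy is assumed to evict the oldest messages once a byte limit is reached. It shows why a binding that carries only x-match: all accepts every message, and why the ring queue itself should stay bounded.

```python
from collections import deque


def headers_match(binding_args, message_headers):
    """Simplified headers-exchange match (illustrative model only)."""
    mode = binding_args.get("x-match", "all")
    # Keys other than x-match are the ones a message has to satisfy.
    required = {k: v for k, v in binding_args.items() if k != "x-match"}
    hits = [message_headers.get(k) == v for k, v in required.items()]
    if mode == "all":
        return all(hits)  # vacuously True when nothing is required
    return any(hits)


class RingQueue:
    """Byte-bounded queue that drops the oldest messages when full."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.size_bytes = 0
        self.messages = deque()

    def enqueue(self, body):
        # Make room by dropping from the head, as assumed for a ring policy.
        while self.messages and self.size_bytes + len(body) > self.max_bytes:
            dropped = self.messages.popleft()
            self.size_bytes -= len(dropped)
        self.messages.append(body)
        self.size_bytes += len(body)
```

In this model the queue never holds more than max_bytes (apart from a single oversized message), which is why steady memory growth on the source broker looks like something other than plain queue depth.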

The route is a source route because we don't want the performance 
penalty of persistence and can tolerate some message loss. With source 
routes, if the producer fails and restarts we avoid the pain whereby 
"normal" routes would try to reconnect the link and then silently fail 
because the queue isn't yet in place. Instead we have a script that 
runs when the broker starts: it re-creates the queue and the binding, 
then re-establishes the route, in the correct order.
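
The ordering in that start-up script can be sketched as below. This is only an illustration of "run each step in order and stop at the first failure"; run_in_order and STEPS are hypothetical names, and the commands are placeholders, since the actual qpid-config and qpid-route arguments are not given in this thread.

```python
import subprocess


def run_in_order(steps, runner=subprocess.call):
    """Run (label, argv) steps one at a time; stop at the first failure.

    Returns the labels of the steps that completed successfully.
    """
    completed = []
    for label, argv in steps:
        if runner(argv) != 0:
            break  # later steps depend on this one, so do not continue
        completed.append(label)
    return completed


# Placeholder commands only: the real steps would be the qpid-config and
# qpid-route invocations used on the source broker.
STEPS = [
    ("create ring queue", ["true"]),
    ("bind queue to amq.match", ["true"]),
    ("add source queue route", ["true"]),
]
```

Calling run_in_order(STEPS) after the broker is up returns the list of completed labels, so a short list indicates which step the route set-up stopped at.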

So it's not all that complicated, but it's driving me nuts that when the 
source broker is co-located with the producer we have a memory increase, 
but when we host the source broker on a different box it seems to be fine.

As I said in previous posts, it seems to be exacerbated when the network 
is dodgy, so I too initially suspected an acknowledgement issue. But as I 
say, as far as I'm aware federation links are unreliable and don't 
require acknowledgement "by default".

As always Murphy's law kicks in and I've never seen this issue in my 
set-up at home, where I've got a lot more control - only at work, where 
I've got pressures of deadlines :-)

The only time I've actually seen qpidd really eat memory is when I 
deliberately screwed up a C++ consumer to not acknowledge, so the ack 
thing is plausible. However, at the rate we're pushing messages, if 
there was something really messed up it would crash and burn quicker 
than we're seeing. It's a good deal more subtle than that.

BTW to answer one of Gordon's previous questions, we're running RHEL5.4. 
Regarding your comment "where the per-thread pools of memory didn't work 
well in the case of a thread that always worked on producing (hence 
allocating) and another that did all the consuming (hence freeing)": I 
*guess* we must have a similar situation, with a single queue on the 
source broker, our producer delivering data to amq.match on that, and 
the federation link consuming. But it is RHEL5.4, not RHEL6. I've no 
idea how we'd identify whether this scenario is affecting us too. Is 
there a way to work out if that issue is actually kicking in?
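
One low-tech way to gather evidence is to sample the broker's resident memory over time and compare the co-located and remote-broker cases. The sketch below is an assumption-laden helper, not a diagnosis of the per-thread pool issue: it assumes a Linux /proc filesystem that reports a VmRSS line, a Python 3 interpreter on the box, and the function names are hypothetical.

```python
def parse_vm_rss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:     123456 kB".
            return int(line.split()[1])
    return None


def read_rss_kb(pid):
    """Read the current resident set size of a process, in kB."""
    with open("/proc/%d/status" % pid) as status_file:
        return parse_vm_rss_kb(status_file.read())


def growth_kb(samples):
    """Difference between the last and first sample, in kB."""
    if len(samples) < 2:
        return 0
    return samples[-1] - samples[0]
```

Logging read_rss_kb for the qpidd process at a fixed interval, alongside the queue depth, would show whether memory keeps climbing while the ring queue stays at its limit.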


Also I'm thinking it's not a Boost issue (which is one of the other 
straws I was clutching at). We've checked the server qpid was built on 
and the boxes giving pain, and they are running the same Boost version 
(1.33 something, I think).

I'd appreciate any more thoughts you guys may have.

Cheers,
Frase




