qpid-users mailing list archives

From Fraser Adams <fraser.ad...@blueyonder.co.uk>
Subject Re: C++ broker memory leak in federated set-up???
Date Fri, 09 Mar 2012 10:04:22 GMT
I don't suppose anyone has had any more thoughts on this.

Unfortunately, sorting out our network problem hasn't resolved this
issue; the broker memory now takes a lot longer to grow, but
unfortunately it still does.

As I say below, we've got a 0.8 qpid::client producer delivering to
amq.match on a broker co-located on the same host, which is federated to
another 0.8 broker (all brokers are C++) via a source queue route.
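For reference, the route was created along these lines (a sketch only:
the host names and the queue name are invented stand-ins, not our real
ones; -s configures the route on the source broker, i.e. a push route):

```shell
# broker-a = hypothetical broker co-located with the producer,
# broker-b = hypothetical core broker, fed.queue = stand-in name for
# the source queue feeding the route.
# Push messages from fed.queue on broker-a to amq.match on broker-b:
qpid-route -s queue add broker-b:5672 broker-a:5672 amq.match fed.queue

# With a push route the link lives on the source broker, so check there:
qpid-route link list broker-a:5672
qpid-route route list broker-a:5672
```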

One weird thing: as an experiment we kept the general topology the same
but moved the first broker onto its own host, just to see. So we've now
got the producer on one host writing to amq.match on a broker on a
different host, with that broker federated to the core broker as before.
We've had that running for days now and the brokers all seem to be stable!!!

Has anyone seen circumstances that could cause brokers to appear to leak
memory when co-located with a producer, but to be fine when run on a
separate host??

I don't believe that there are any significant differences in the
dependent libraries on each host, but I couldn't swear to it. Is anyone
aware of stability issues with, say, particular versions of Boost and
Qpid, or indeed any other library?

Annoyingly I've never noticed things like this in my set-up at home,
just at work, where it matters more and I've got deadlines to meet :-(

Can anyone think of a good way to "profile" our hosts to verify that
they should be able to run Qpid with no issues? I always build from
source at home (that has its own issues on Ubuntu!!!!), but at work I
believe Qpid had been installed from RPMs. I'm not clear on the
provenance of those RPMs, though, and I'm a bit suspicious of them: they
don't appear to have many dependency checks (for example, installation
didn't barf when SASL wasn't present, yet that seems to be necessary
even with --auth no).

So one possibility for the carnage I'm seeing is that some hosts might
have slightly different versions of dependent libraries, hence why I'd
like to know, in a systematic manner, what to check for.
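In case it helps anyone suggest checks: one systematic approach is to
dump the installed package versions and the exact shared objects that
qpidd resolves at run time, then diff the output between hosts. A
sketch, assuming the RPMs put the broker at /usr/sbin/qpidd (an
assumption; override QPIDD if not):

```shell
#!/bin/sh
# Capture package versions and resolved shared-library paths for qpidd;
# run on each host and diff the two outputs to spot version skew.
QPIDD=${QPIDD:-/usr/sbin/qpidd}   # assumed install path
rpm -qa | grep -Ei 'qpid|boost|cyrus-sasl' | sort
ldd "$QPIDD" | awk 'NF>=3 {print $1, $3}' | sort
```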

On a related note, is anyone aware of any differences in behaviour
relating to hardware/chipset? All of our hosts are running RHEL, but
they are a mix of hardware: all Intel, but with varying numbers of cores
and chipsets.

I'm getting a bit desperate now; there are project managers with cattle
prods heading my way :-(


On 01/03/12 15:04, Gordon Sim wrote:
> On 02/29/2012 07:07 PM, Fraser Adams wrote:
>> Hi All,
>> I think that we may have stumbled across a potential memory/resource 
>> leak.
>> We have one particular set up where we have a C++ producer client (using
>> qpid::client - don't ask, it's a long story.....) this writes to a 0.8
>> broker hosted on the same server. That broker is then federated via a
>> queue route to amq.match on another (0.8) broker. The queue route is a
>> source route set up via qpid-route -s
>> We've been having all sorts of fun and games with respect to
>> performance, which we've narrowed down to some dodgy networking.
>> However one of the other effects that we've noticed is that the broker
>> co-located with the producer client eats memory. The queue for the queue
>> route is 1GB but qpidd eventually grows to ~35GB and sends the whole set
>> up into swap.
>> So with respect to the network problem we're suspecting a dodgy switch
>> somewhere, what is interesting is that when we checked with ethtool the
>> NIC was reporting half duplex had been negotiated - ouch!!! hence why we
>> suspect a dodgy switch somewhere.
>> Now when the NIC was explicitly set to 100 base/T full duplex our
>> performance rocketed and the broker on the producer system appears
>> (touch wood) to have stable memory performance.
>> What I'm suspecting is that the dodgy network link has been causing
>> connection drop-outs and the broker is automatically reconnecting (logs
>> are confirming this) and I'm thinking that there is a resource leak
>> somewhere during the reconnection process.
> https://issues.apache.org/jira/browse/QPID-3447 perhaps? Though I 
> wouldn't have expected that to cause such a large growth in memory.
> You're sure there is no backed-up queue anywhere?
> ---------------------------------------------------------------------
> Apache Qpid - AMQP Messaging Implementation
> Project:      http://qpid.apache.org
> Use/Interact: mailto:users-subscribe@qpid.apache.org

To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
For additional commands, e-mail: users-help@qpid.apache.org
