qpid-users mailing list archives

From Gordon Sim <g...@redhat.com>
Subject Re: Broker death recovery
Date Wed, 04 Jan 2012 11:09:42 GMT
On 01/03/2012 08:18 PM, Rob Springer wrote:
> Hi all,
> In our application (we've tried both 0.5 and 0.12), we'd like for our
> client programs to quickly recover in the case where a broker dies.
> Currently, we're able to do this by dynamically allocating all our
> Qpid-using code, and simply re-allocating should the broker die, but
> that seems inelegant and feels...wrong.
> If we attempt to reconnect and don't create a new Session (i.e., use the
> old one), bad things happen (since Session doesn't yet support resume(),
> I assume that's expected behavior).
> When we then try to create a new Session, a new SubscriptionManager, and
> a new Subscription, we get an assertion failure (backtrace at the end of
> this message).
> After reading the backtrace, I believe the following is happening:
> 1) In recovery, we attempt to assign a new Subscription to the previous
> Subscription variable (i.e., "sub = subMgr->subscribe()")
> 2) That causes the refcount for the old Subscription to fall to 0,
> causing it to be cleaned up.
> 3) As part of that cleanup, the associated SubscriptionImpl object
> goes to destroy its (std::auto_ptr<ScopedDivert>) demuxRule member.
> 4) That demuxRule member maintains a reference to a Demux object,
> demuxer, which exists inside the Session object.
> Thus, we have a fatal circle - we need to create a new Session object to
> be able to proceed, but when we do so, we render ourselves unable to
> re-use Subscription variables.
> Unfortunately, I can't think of an easy/simple fix, besides perhaps
> adding reference counting to the Demux variable...although I haven't
> thought that through at all.

As a workaround, can you first assign a 'null' Subscription to the 
subscription variable and only then recreate the Session and 
SubscriptionManager, then finally reassign the variable with the real 
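To make the ordering concrete, here is a minimal sketch using hypothetical stand-in types (FakeSession, FakeSubscriptionImpl and FakeSubscription are illustrations, not the real qpid::client classes). It only models the lifetime relationship described above: the subscription's cleanup looks at state owned by the session, so the old subscription has to be released before the old session goes away. In real client code the 'null' assignment would presumably be something like assigning a default-constructed Subscription, but I haven't tested that.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins; these are NOT the real qpid::client classes.
struct FakeSession {
    // Models the Demux that lives inside the session.
    std::shared_ptr<bool> demuxAlive = std::make_shared<bool>(true);
    ~FakeSession() { *demuxAlive = false; }
};

struct FakeSubscriptionImpl {
    std::shared_ptr<bool> demuxAlive;  // models demuxRule's reference to the Demux
    bool* releasedWhileDemuxAlive;     // optional: where to record the outcome
    FakeSubscriptionImpl(const FakeSession& s, bool* out)
        : demuxAlive(s.demuxAlive), releasedWhileDemuxAlive(out) {}
    ~FakeSubscriptionImpl() {
        if (releasedWhileDemuxAlive) *releasedWhileDemuxAlive = *demuxAlive;
    }
};

// Reference-counted handle; an empty pointer plays the 'null' Subscription.
using FakeSubscription = std::shared_ptr<FakeSubscriptionImpl>;

// Returns true if the old subscription was cleaned up while the session
// it belonged to still existed.
bool oldSubscriptionReleasedSafely(bool clearHandleFirst) {
    bool safe = false;
    auto session = std::make_unique<FakeSession>();
    FakeSubscription sub =
        std::make_shared<FakeSubscriptionImpl>(*session, &safe);

    // Broker dies; recovery starts here.
    if (clearHandleFirst) sub.reset();          // 1: drop the old subscription

    session = std::make_unique<FakeSession>();  // 2: replace the session
    sub = std::make_shared<FakeSubscriptionImpl>(*session, nullptr);  // 3
    return safe;
}
```

In this model, oldSubscriptionReleasedSafely(true) follows the workaround order, while oldSubscriptionReleasedSafely(false) reproduces the direct reassignment from the original report.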

For an actual fix, perhaps a destructor in SubscriptionManagerImpl that 
calls cancelDiversion() on all its Subscription instances would suffice(?).
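A rough sketch of that idea, again with hypothetical stand-in types rather than the real client classes. It assumes the manager keeps handles to the subscriptions it created and that cancelling the diversion drops the subscription's reference to the session's demuxer; whether the real SubscriptionManagerImpl and SubscriptionImpl can be changed this simply is untested.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins; these are NOT the real qpid::client classes.
struct FakeSession {
    std::shared_ptr<bool> demuxAlive = std::make_shared<bool>(true);
    ~FakeSession() { *demuxAlive = false; }
};

struct FakeSubscriptionImpl {
    std::shared_ptr<bool> demuxAlive;  // empty once the diversion is cancelled
    bool* touchedDeadDemux;            // set if cleanup reaches a dead demux
    FakeSubscriptionImpl(const FakeSession& s, bool* out)
        : demuxAlive(s.demuxAlive), touchedDeadDemux(out) {}
    void cancelDiversion() { demuxAlive.reset(); }
    ~FakeSubscriptionImpl() {
        if (demuxAlive && !*demuxAlive) *touchedDeadDemux = true;
    }
};

struct FakeSubscriptionManagerImpl {
    std::vector<std::shared_ptr<FakeSubscriptionImpl>> subscriptions;
    bool cancelInDestructor;  // true models the proposed fix
    explicit FakeSubscriptionManagerImpl(bool cancel)
        : cancelInDestructor(cancel) {}
    ~FakeSubscriptionManagerImpl() {
        if (cancelInDestructor)
            for (auto& s : subscriptions) s->cancelDiversion();
    }
};

// Returns true if releasing the old subscription after the session is
// gone would have touched the dead demux.
bool lateReleaseTouchesDeadDemux(bool cancelInDestructor) {
    bool touched = false;
    auto session = std::make_unique<FakeSession>();
    auto mgr = std::make_unique<FakeSubscriptionManagerImpl>(cancelInDestructor);
    auto sub = std::make_shared<FakeSubscriptionImpl>(*session, &touched);
    mgr->subscriptions.push_back(sub);

    mgr.reset();      // manager destroyed; with the fix, diversions are cancelled
    session.reset();  // session (and its demux) destroyed
    sub.reset();      // application finally lets go of the old subscription
    return touched;
}
```

With cancelInDestructor set, the late release of the old subscription no longer depends on the old session, so the application would not need to clear its Subscription variable first.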

> I was wondering if you were aware of this sort of issue, and if so, if
> there were plans to resolve it or ideas on how to resolve it.

I wasn't aware of this specific issue. We've been encouraging people to 
use the newer messaging API instead of this older client API. The 
messaging API offers a cleaner, higher level abstraction that makes 
migration to newer versions of the protocol simpler and also makes it 
simpler to provide richer functionality behind the API (such as 

Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:users-subscribe@qpid.apache.org
