qpid-users mailing list archives

From Rob Springer <rsprin...@etinternational.com>
Subject Broker death recovery
Date Tue, 03 Jan 2012 20:18:23 GMT
Hi all,
   In our application (we've tried both 0.5 and 0.12), we'd like our 
client programs to recover quickly when a broker dies. 
Currently, we do this by dynamically allocating all of our 
Qpid-using code and simply re-allocating it should the broker die, but 
that seems inelegant and feels...wrong.
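
For illustration, the re-allocation workaround can be kept fairly contained by bundling everything tied to one broker connection into a single object and replacing that object wholesale. This is a minimal sketch using stand-in types: FakeSession, FakeSubscription, Link and BrokerClient are hypothetical names, not Qpid classes. It relies only on C++ destroying members in reverse declaration order, so the subscription is always torn down before the session it was created from.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Stand-ins for the real client objects; these are NOT Qpid classes.
// Each one records its construction and destruction in a shared log.
struct FakeSession {
    std::vector<std::string>& log;
    explicit FakeSession(std::vector<std::string>& l) : log(l) { log.push_back("session+"); }
    ~FakeSession() { log.push_back("session-"); }
};

struct FakeSubscription {
    std::vector<std::string>& log;
    FakeSubscription(FakeSession&, std::vector<std::string>& l) : log(l) { log.push_back("sub+"); }
    ~FakeSubscription() { log.push_back("sub-"); }
};

// Everything tied to one broker connection. Members are destroyed in
// reverse declaration order: sub first, then session.
struct Link {
    FakeSession session;
    FakeSubscription sub;
    explicit Link(std::vector<std::string>& log) : session(log), sub(session, log) {}
};

class BrokerClient {
public:
    explicit BrokerClient(std::vector<std::string>& log) : log_(log), link_(new Link(log)) {}

    // Called when the broker is found to be dead (hypothetical hook).
    void recover() {
        link_.reset();                // old subscription, then old session
        link_.reset(new Link(log_));  // fresh session and subscription
    }

private:
    std::vector<std::string>& log_;
    std::unique_ptr<Link> link_;
};

// Builds a client, recovers once, and returns the lifetime log.
std::vector<std::string> demoRecover() {
    std::vector<std::string> log;
    {
        BrokerClient client(log);
        client.recover();
    }
    return log;
}
```

Calling demoRecover() yields a log in which every "sub-" entry comes before the matching "session-" entry, which is the ordering the rest of this message turns on.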
   If we attempt to reconnect and don't create a new Session (i.e., use 
the old one), bad things happen (since Session doesn't yet support 
resume(), I assume that's expected behavior).
   When we then try to create a new Session, a new SubscriptionManager, 
and a new Subscription, we get an assertion failure (backtrace at the 
end of this message).
   After reading the backtrace, I believe the following is happening:
1) In recovery, we assign a new Subscription to the previous 
Subscription variable (i.e., "sub = subMgr->subscribe()").
2) That causes the refcount of the old Subscription to fall to 0,
so it is cleaned up.
3) As part of that cleanup, the associated SubscriptionImpl object
destroys its demuxRule member (a std::auto_ptr<ScopedDivert>).
4) That demuxRule member holds a reference to a Demux object,
demuxer, which lives inside the old Session object.
5) Since the old Session has already been replaced by that point, 
~ScopedDivert presumably calls Demux::remove() on a Demux that no 
longer exists, and the Mutex::lock() assertion fires.
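
The sequence above can be modeled without touching freed memory. The sketch below is a stand-in model, not Qpid code: ModelDemux, ModelSession and ModelDivert are hypothetical names, and the rule holds a std::weak_ptr (an assumption made purely so the model can report the dangling case instead of crashing, where the real ScopedDivert appears to hold a plain reference).

```cpp
#include <cassert>
#include <memory>

// Stand-in for the Demux that lives inside a session.
struct ModelDemux {
    int removed = 0;
    void remove() { ++removed; }
};

// Stand-in for a Session that owns its Demux.
struct ModelSession {
    std::shared_ptr<ModelDemux> demux = std::make_shared<ModelDemux>();
};

// Stand-in for the rule destroyed along with a Subscription.
struct ModelDivert {
    std::weak_ptr<ModelDemux> demux;
    bool* sawDeadDemux;

    ModelDivert(const ModelSession& s, bool* flag) : demux(s.demux), sawDeadDemux(flag) {}

    ~ModelDivert() {
        if (std::shared_ptr<ModelDemux> d = demux.lock()) {
            d->remove();            // Demux still alive: safe
        } else {
            *sawDeadDemux = true;   // a raw reference would dangle here
        }
    }
};

// Recovery as described: new session first, then reassign the subscription.
bool oldSubscriptionOutlivesSession() {
    bool sawDeadDemux = false;
    std::unique_ptr<ModelSession> session(new ModelSession);
    std::unique_ptr<ModelDivert> sub(new ModelDivert(*session, &sawDeadDemux));
    session.reset(new ModelSession);                      // old Demux destroyed
    sub.reset(new ModelDivert(*session, &sawDeadDemux));  // old rule destroyed now
    return sawDeadDemux;
}

// Same recovery, but the old subscription is released before the session.
bool subscriptionReleasedFirst() {
    bool sawDeadDemux = false;
    std::unique_ptr<ModelSession> session(new ModelSession);
    std::unique_ptr<ModelDivert> sub(new ModelDivert(*session, &sawDeadDemux));
    sub.reset();                                          // rule removed while Demux is alive
    session.reset(new ModelSession);
    sub.reset(new ModelDivert(*session, &sawDeadDemux));
    return sawDeadDemux;
}
```

In this model the first function reports the dead Demux and the second does not. That suggests explicitly dropping the old Subscription before replacing the Session might sidestep the assertion, although this has not been checked against the real client library.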

Thus, we have a fatal circle - we need to create a new Session object to 
be able to proceed, but when we do so, we render ourselves unable to
re-use Subscription variables.

Unfortunately, I can't think of an easy/simple fix, besides perhaps 
adding reference counting to the Demux variable...although I haven't 
thought that through at all.
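
To make the reference-counting idea concrete, here is a rough sketch with stand-in types (SharedDemux, OwningSession and OwningDivert are hypothetical, not the Qpid classes). It uses std::shared_ptr for brevity; an actual change to the client library would more likely use whatever smart pointer the code base already relies on, and that is an assumption here too.

```cpp
#include <cassert>
#include <memory>

// Stand-in for the Demux; counts how many rules were removed.
struct SharedDemux {
    int removals = 0;
    void remove() { ++removals; }
};

// Stand-in for a Session that shares ownership of its Demux.
struct OwningSession {
    std::shared_ptr<SharedDemux> demux = std::make_shared<SharedDemux>();
};

// Stand-in for the rule: it holds a strong reference, so the Demux
// stays alive for as long as any rule still points at it.
struct OwningDivert {
    std::shared_ptr<SharedDemux> demux;
    explicit OwningDivert(const OwningSession& s) : demux(s.demux) {}
    ~OwningDivert() { demux->remove(); }
};

// Destroys the session before the rule and returns the removal count.
int removalsAfterSessionGone() {
    std::unique_ptr<OwningSession> session(new OwningSession);
    std::shared_ptr<SharedDemux> probe = session->demux;  // test-only handle
    {
        OwningDivert divert(*session);
        session.reset();                 // session dies first
        assert(probe.use_count() == 2);  // probe and divert keep the Demux alive
    }   // divert's destructor calls remove() on a live Demux
    return probe->removals;
}
```

The trade-off is that the Demux can now outlive its Session, so anything the Demux itself refers to would need the same treatment.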

I was wondering whether you are aware of this sort of issue and, if so, 
whether there are plans to resolve it or ideas on how to do so.

Thanks a ton!

Backtrace (plus a little other GDB output):
Invalid argument
void qpid::sys::Mutex::lock(): Assertion `0' failed.

Program received signal SIGABRT, Aborted.
0x00007ffff665e3a5 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007ffff665e3a5 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff6661b0b in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007ffff6656d4d in __assert_fail () from 
#3  0x00007ffff7b37482 in qpid::sys::Mutex::lock (this=0x6697b8)
#4  0x00007ffff7b37ce3 in ScopedLock (this=0x7fffffffdbb0, l=...)
#5  0x00007ffff7b57a80 in qpid::client::Demux::remove (this=0x6697b8, 
#6  0x00007ffff7b5728c in ~ScopedDivert (this=0x6263d0, __in_chrg=<value optimized out>)
     at /home/rspringer/qpid_work/qpid-0.12/cpp/src/qpid/client/Demux.cpp:45
#7  0x00007ffff7b75f67 in ~auto_ptr (this=0x66a160, __in_chrg=<value optimized out>)
     at /usr/include/c++/4.6/backward/auto_ptr.h:170
#8  0x00007ffff7b77c34 in ~SubscriptionImpl (this=0x66a070, __in_chrg=<value optimized out>)
#9  0x00007ffff7b77d82 in ~SubscriptionImpl (this=0x66a070, __in_chrg=<value optimized out>)
#10 0x00007ffff7b2c15c in qpid::RefCounted::released (this=0x66a070)
     at /home/rspringer/qpid_work/qpid-0.12/cpp/src/qpid/RefCounted.h:48
#11 0x00007ffff7b380b7 in qpid::RefCounted::release (this=0x66a070)
     at /home/rspringer/qpid_work/qpid-0.12/cpp/src/qpid/RefCounted.h:42
#12 0x00007ffff7b5e2fe in boost::intrusive_ptr_release<qpid::client::SubscriptionImpl> (p=0x66a070)
     at /home/rspringer/qpid_work/qpid-0.12/cpp/src/qpid/RefCounted.h:59
#13 0x00007ffff7b74a9c in qpid::client::PrivateImplRef<qpid::client::Subscription>::dtor (t=...)
#14 0x00007ffff7b7463a in ~Subscription (this=0x7fffffffde08, __in_chrg=<value optimized out>)
#15 0x000000000040c355 in ~CQpidConnection (this=0x7fffffffdd80, __in_chrg=<value optimized out>) at qpidConnection.h:71
#16 0x000000000040bec7 in example0 () at myServer.cpp:9
#17 0x000000000040c095 in main () at myServer.cpp:86

Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:users-subscribe@qpid.apache.org
