qpid-users mailing list archives

From Alexandre Trufanow <alexandre.trufanow...@gmail.com>
Subject [QPID] Deadlock in unit tests on solaris
Date Fri, 08 Apr 2016 09:58:42 GMT
Hi,

I am trying to run Qpid on Solaris using Sun Studio and have managed to get
the broker to compile with a few minor fixes. Unfortunately, many of the unit
tests hang.

The issue is a deadlock when SessionFixture is created. On the main thread,
the call to newSession blocks on a mutex in DispatchHandle::rewatchWrite:

=>[5] qpid::sys::Mutex::lock(this = <value unavailable>) (optimized), at
0xfffffd7ffdb81a0e (line ~116) in "Mutex.h"
  [6] qpid::sys::ScopedLock<qpid::sys::Mutex>::ScopedLock(this = <value
unavailable>, l = CLASS) (optimized), at 0xfffffd7ffdb819df (line ~33) in
"Mutex.h"
  [7] qpid::sys::DispatchHandle::rewatchWrite(this = 0xb63558) (optimized),
at 0xfffffd7ffdbf4cc0 (line ~109) in "DispatchHandle.cpp"
  [8] qpid::sys::posix::AsynchIO::notifyPendingWrite(this = <value
unavailable>) (optimized), at 0xfffffd7ffdb62824 (line ~389) in
"AsynchIO.cpp"
  [9] qpid::client::TCPConnector::handle(this = 0xb60fe0, frame = CLASS)
(optimized), at 0xfffffd7ffdf6dc1d (line ~209) in "TCPConnector.cpp"
[... shortened output]
  [22] qpid::client::Connection::newSession(this = <value unavailable>,
name = CLASS, timeout = 0) (optimized), at 0xfffffd7ffdf05b15 (line ~141)
in "Connection.cpp"
  [23]
qpid::tests::SessionFixtureT<qpid::tests::LocalConnection,qpid::client::Session_0_10>::SessionFixtureT(this
= 0xfffffd7fffdfe3d0, opts = STRUCT) (optimized), at 0x5d95b5 (line ~141)
in "BrokerFixture.h"

The lock is also held by one of the two Poller threads, which is waiting in poll:

=>[4] qpid::sys::PollerPrivate::EventStream::getEvent(this = 0xb60ee8,
targetTimeout = CLASS) (optimized), at 0xfffffd7ffdb875cf (line ~466) in
"PosixPoller.cpp"
  [5] qpid::sys::PollerPrivate::EventStream::next(this = 0xb60ee8, timeout
= CLASS) (optimized), at 0xfffffd7ffdb86127 (line ~354) in "PosixPoller.cpp"
  [6] qpid::sys::Poller::wait(this = 0xb467f0, timeout = CLASS)
(optimized), at 0xfffffd7ffdb847c6 (line ~729) in "PosixPoller.cpp"
  [7] qpid::sys::Poller::run(this = 0xb467f0) (optimized), at
0xfffffd7ffdb84540 (line ~690) in "PosixPoller.cpp"
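
One way to confirm which thread really owns a mutex, instead of inferring it
from stack traces, is to record the owner when the lock is taken and to use a
timed lock attempt so that the caller can report the owner rather than hang.
The following is a minimal sketch only, assuming a C++11 compiler. The class
OwnerTrackingMutex and the function demoDetectsHolder are hypothetical and are
not part of Qpid.

```cpp
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical wrapper (not a Qpid class): remembers which thread holds the lock.
class OwnerTrackingMutex {
public:
    // Try to take the lock; on timeout return false so the caller can
    // log owner() instead of blocking forever.
    bool lockFor(std::chrono::milliseconds timeout) {
        if (!m_.try_lock_for(timeout)) return false;
        std::lock_guard<std::mutex> g(ownerGuard_);
        owner_ = std::this_thread::get_id();
        return true;
    }

    void unlock() {
        {
            std::lock_guard<std::mutex> g(ownerGuard_);
            owner_ = std::thread::id();   // default id means "no owner"
        }
        m_.unlock();
    }

    // Id of the thread currently holding the lock, or a default id if none.
    std::thread::id owner() const {
        std::lock_guard<std::mutex> g(ownerGuard_);
        return owner_;
    }

private:
    std::timed_mutex m_;
    mutable std::mutex ownerGuard_;   // protects owner_ only
    std::thread::id owner_;
};

// Returns true when a second thread holds the lock, the caller's timed
// attempt fails, and owner() names that second thread.
bool demoDetectsHolder() {
    OwnerTrackingMutex mtx;
    std::promise<void> held;
    std::promise<void> release;

    std::thread holder([&] {
        while (!mtx.lockFor(std::chrono::milliseconds(1000))) {}
        held.set_value();               // lock is now owned by this thread
        release.get_future().wait();    // keep holding until told to stop
        mtx.unlock();
    });

    held.get_future().wait();
    const std::thread::id holderId = holder.get_id();
    const bool timedOut = !mtx.lockFor(std::chrono::milliseconds(50));
    const bool ownerMatches = (mtx.owner() == holderId);

    release.set_value();
    holder.join();
    return timedOut && ownerMatches;
}
```

Calling demoDetectsHolder() from a test is expected to return true. In a real
investigation the timed-out branch would log owner() so that it can be compared
with the thread ids shown by the debugger.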

I do not understand how the same lock can be held by both threads at the same
time, but the deadlock occurs because the call to poll never wakes up. I have
noticed a suspicious comment in the code path of the main thread which may be
linked to this behavior. In TCPConnector::handle, the following comment
precedes the blocking call into AsynchIO:

    /*
      NOTE: Moving the following line into this mutex block
            is a workaround for BZ 570168, in which the test
            testConcurrentSenders causes a hang about 1.5%
            of the time.  ( To see the hang much more frequently
            leave this line out of the mutex block, and put a
            small usleep just before it.)

            TODO mgoulish - fix the underlying cause and then
                            move this call back outside the mutex.
    */
    if (notifyWrite && !closed) aio->notifyPendingWrite();
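
As an illustration only, and not a diagnosis of BZ 570168, the sketch below
shows one consequence of making the notify call inside the mutex block: if the
notify call has to wait for a second lock (as rewatchWrite does in the main
thread trace), the first lock stays held for the whole wait. It assumes a
C++11 compiler. The names connectorLock, handleLock and
outerHeldWhileNotifyBlocks are hypothetical stand-ins, not Qpid code.

```cpp
#include <future>
#include <mutex>
#include <thread>

// Returns true if the connector-level lock is still held while the
// notify step is blocked waiting for the handle-level lock.
bool outerHeldWhileNotifyBlocks(bool notifyInsideLock) {
    std::mutex connectorLock;   // stand-in for the mutex in TCPConnector::handle
    std::mutex handleLock;      // stand-in for the mutex taken in rewatchWrite
    std::promise<void> aboutToNotify;

    handleLock.lock();          // simulate the handle lock being unavailable

    std::thread writer([&] {
        // Stand-in for aio->notifyPendingWrite(): it needs the handle lock.
        auto notify = [&] {
            aboutToNotify.set_value();
            std::lock_guard<std::mutex> h(handleLock);   // blocks here
        };
        if (notifyInsideLock) {
            // Placement used by the workaround: notify inside the mutex block.
            std::lock_guard<std::mutex> c(connectorLock);
            notify();
        } else {
            // Placement the TODO asks for: release first, then notify.
            { std::lock_guard<std::mutex> c(connectorLock); }
            notify();
        }
    });

    // Wait until the writer is at the notify step, then probe the outer lock.
    aboutToNotify.get_future().wait();
    const bool acquired = connectorLock.try_lock();
    if (acquired) connectorLock.unlock();

    handleLock.unlock();        // let the writer finish
    writer.join();
    return !acquired;
}
```

With notifyInsideLock set to true the function is expected to return true, and
with false it is expected to return false. Whether this nesting plays any role
in the hang described here is an open question.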

Do you have any hints as to what the underlying issue could be?
Thanks,

Alexandre Trufanow
www.murex.com
