activemq-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Timothy Bish (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMQCPP-376) Deadlock in IOTransport when network of brokers restart and failover is used.
Date Sat, 09 Jul 2011 22:29:59 GMT

    [ https://issues.apache.org/jira/browse/AMQCPP-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062650#comment-13062650
] 

Timothy Bish commented on AMQCPP-376:
-------------------------------------

I think I see what the cause is, its a bit tricky figuring out how its tying itself in this
particular knot though.  You could try the code in trunk as the threading model has changed
but its still not quite release ready.

I need to think on this one for a bit to come up with the right fix.

> Deadlock in IOTransport when network of brokers restart and failover is used. 
> ------------------------------------------------------------------------------
>
>                 Key: AMQCPP-376
>                 URL: https://issues.apache.org/jira/browse/AMQCPP-376
>             Project: ActiveMQ C++ Client
>          Issue Type: Bug
>          Components: Other C++ Clients
>    Affects Versions: 3.4.0
>         Environment: ActiveMQ-CPP  ver - 3.4.0
> Broker  5.3.1
> Machine: Linux mars 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38 EST 2008 x86_64 x86_64
x86_64 GNU/Linux
> gcc version: 4.1.2 20080704 (Red Hat 4.1.2-44))
>            Reporter: igor khaustov
>            Assignee: Timothy Bish
>         Attachments: bt_1.txt, bt_2.txt
>
>
> The problem description:
> We  run Network of brokers ( 4 in number ) . 
> Broker URI : broker URI 'failover://(tcp://10.10.13.20:61616,tcp://10.10.13.22:61616,tcp://10.10.13.24:61616,tcp://10.10.13.26:61616)?randomize=true&connection.closeTimeout=10000&transport.soTimeout=3000&timeout=3000&connection.useAsyncSend=true&connection.alwaysSyncSend=false'
> Producer loads broker with 1000 message/sec . We testing the producer behavior while
failover by  restarting all brokers in row ( all 4 ) while sending the messages and get deadlock
as shown below .
> Note: the problem tested only with network on brokers .
> The backtrace ( only relevant threads ):
> +Thread 16 (process 26892):+
> *#0  0x00000032ef00ce74 in __lll_lock_wait () from /lib64/libpthread.so.0*
> #1  0x00000032ef008874 in _L_lock_106 () from /lib64/libpthread.so.0
> #2  0x00000032ef0082e0 in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3  0x0000000000dc5a04 in decaf::internal::util::concurrent::MutexImpl::lock (handle=0xfefdd38)
at decaf/internal/util/concurrent/unix/MutexImpl.cpp:77
> #4  0x0000000000bd9092 in decaf::util::concurrent::Mutex::lock (this=0xff54100) at decaf/util/concurrent/Mutex.cpp:111
> #5  0x0000000000d51f3f in decaf::util::AbstractCollection<decaf::lang::Pointer<activemq::transport::Transport,
decaf::util::concurrent::atomic::AtomicRefCounter> >::lock (this=0xff540f8) at ./decaf/util/AbstractCollection.h:331
> #6  0x0000000000bd86c9 in decaf::util::concurrent::Lock::lock (this=0x4c7b9c90) at decaf/util/concurrent/Lock.cpp:54
> #7  0x0000000000bd883a in Lock (this=0x4c7b9c90, object=0xff54188, intiallyLocked=true)
at decaf/util/concurrent/Lock.cpp:32
> *#8  0x0000000000d47a77 in activemq::transport::failover::CloseTransportsTask::add (this=0xff540e8,
transport=@0x4c7b9cf0) at activemq/transport/failover/CloseTransportsTask.cpp:46*
> #9  0x0000000000b1b748 in activemq::transport::failover::FailoverTransport::handleTransportFailure
(this=0xffed498, error=@0x4c7b9ee0) at activemq/transport/failover/FailoverTransport.cpp:483
> #10 0x0000000000b41a06 in activemq::transport::failover::FailoverTransportListener::onException
(this=0xfde2e58, ex=@0x4c7b9ee0) at activemq/transport/failover/FailoverTransportListener.cpp:76
> #11 0x0000000000d34813 in activemq::transport::TransportFilter::fire (this=0x10627498,
ex=@0x4c7b9ee0) at activemq/transport/TransportFilter.cpp:54
> #12 0x0000000000d34841 in activemq::transport::TransportFilter::onException (this=0x10627498,
ex=@0x4c7b9ee0) at activemq/transport/TransportFilter.cpp:46
> #13 0x0000000000d34813 in activemq::transport::TransportFilter::fire (this=0xfeeb558,
ex=@0x4c7b9ee0) at activemq/transport/TransportFilter.cpp:54
> #14 0x0000000000d34841 in activemq::transport::TransportFilter::onException (this=0xfeeb558,
ex=@0x4c7b9ee0) at activemq/transport/TransportFilter.cpp:46
> #15 0x0000000000d554c8 in activemq::transport::inactivity::InactivityMonitor::onException
(this=0xfeeb558, ex=@0x4c7b9ee0) at activemq/transport/inactivity/InactivityMonitor.cpp:312
> #16 0x0000000000d34813 in activemq::transport::TransportFilter::fire (this=0x1020c118,
ex=@0x4c7b9ee0) at activemq/transport/TransportFilter.cpp:54
> #17 0x0000000000d34841 in activemq::transport::TransportFilter::onException (this=0x1020c118,
ex=@0x4c7b9ee0) at activemq/transport/TransportFilter.cpp:46
> #18 0x0000000000d326f2 in activemq::transport::IOTransport::fire (this=0xdce10b8, ex=@0x4c7b9ee0)
at activemq/transport/IOTransport.cpp:87
> #19 0x0000000000d32982 in activemq::transport::IOTransport::run (this=0xdce10b8) at activemq/transport/IOTransport.cpp:264
> #20 0x0000000000baad49 in decaf::lang::ThreadProperties::runCallback (properties=0x105871d8)
at decaf/lang/Thread.cpp:137
> #21 0x0000000000ba9068 in threadWorker (arg=0x105871d8) at decaf/lang/Thread.cpp:190
> #22 0x00000032ef006367 in start_thread () from /lib64/libpthread.so.0
> #23 0x00000032ee4d30ad in clone () from /lib64/libc.so.6
> +Thread 9 (process 14470):+
> *#0  0x00000032ef00a899 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0*
> #1  0x0000000000dc54b3 in decaf::internal::util::concurrent::ConditionImpl::wait (condition=0x1072d2b8)
at decaf/internal/util/concurrent/unix/ConditionImpl.cpp:101
> #2  0x0000000000bd9033 in decaf::util::concurrent::Mutex::wait (this=0x105871d8) at decaf/util/concurrent/Mutex.cpp:126
> #3  0x0000000000ba8538 in decaf::lang::Thread::join (this=0x12a4a418) at decaf/lang/Thread.cpp:452
> #4  0x0000000000d32c28 in activemq::transport::IOTransport::close (this=0xdce10b8) at
activemq/transport/IOTransport.cpp:222
> #5  0x0000000000d34bfe in activemq::transport::TransportFilter::close (this=0x1020c118)
at activemq/transport/TransportFilter.cpp:106
> #6  0x0000000000b47d3a in activemq::transport::tcp::TcpTransport::close (this=0x1020c118)
at activemq/transport/tcp/TcpTransport.cpp:74
> #7  0x0000000000d34bfe in activemq::transport::TransportFilter::close (this=0xfeeb558)
at activemq/transport/TransportFilter.cpp:106
> #8  0x0000000000d554ec in activemq::transport::inactivity::InactivityMonitor::close (this=0xfeeb558)
at activemq/transport/inactivity/InactivityMonitor.cpp:300
> #9  0x0000000000d77867 in activemq::wireformat::openwire::OpenWireFormatNegotiator::close
(this=0x10627498) at activemq/wireformat/openwire/OpenWireFormatNegotiator.cpp:248
> *#10 0x0000000000d478ff in activemq::transport::failover::CloseTransportsTask::iterate
(this=0xff540e8) at activemq/transport/failover/CloseTransportsTask.cpp:75*
> #11 0x0000000000d25891 in activemq::threads::CompositeTaskRunner::iterate (this=0xddc0108)
at activemq/threads/CompositeTaskRunner.cpp:173
> #12 0x0000000000d25ae4 in activemq::threads::CompositeTaskRunner::run (this=0xddc0108)
at activemq/threads/CompositeTaskRunner.cpp:107
> #13 0x0000000000baad49 in decaf::lang::ThreadProperties::runCallback (properties=0xfeeb2b8)
at decaf/lang/Thread.cpp:137
> #14 0x0000000000ba9068 in threadWorker (arg=0xfeeb2b8) at decaf/lang/Thread.cpp:190
> #15 0x00000032ef006367 in start_thread () from /lib64/libpthread.so.0
> #16 0x00000032ee4d30ad in clone () from /lib64/libc.so.6
> As you can see +Thread 16+ is on lock_wait for *_synchronized( &transports )_* in
activemq::transport::failover::CloseTransportsTask::add .
> The *_synchronized( &transports )_* in locked by +Thread 9+ in activemq::threads::CompositeTaskRunner::iterate
. But  +Thread 9+ is on pthread_cond_wait which has to be signalled by the +Thread 16+.
> Kind regards .
> Igor.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message