activemq-dev mailing list archives

From "Gali Shvidky (JIRA)" <j...@apache.org>
Subject [jira] Reopened: (AMQCPP-184) TransportFilter::fire() crashes after accessing a dangling pointer during exception in ActiveMQConnectionFactory::createConnection()
Date Sun, 21 Sep 2008 12:02:52 GMT

     [ https://issues.apache.org/activemq/browse/AMQCPP-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gali Shvidky reopened AMQCPP-184:
---------------------------------

    Regression: [Regression]

We are seeing this issue with:
ActiveMQ-cpp-2.1.3 (Linux)
ActiveMQ Broker 5.1 (Linux)

It looks like there is a race condition that causes our application to crash. The following
traces are from the core dump (only the relevant threads are shown):

Thread 12 (process 28639):
#0  0x003807a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x008fe0fd in pthread_join () from /lib/tls/libpthread.so.0
#2  0x080bd401 in activemq::concurrent::Thread::join (this=0xb775dcc0) at activemq/concurrent/Thread.cpp:102
#3  0x08096599 in activemq::transport::IOTransport::close (this=0xb775bde0) at activemq/transport/IOTransport.cpp:142
#4  0x080e76a3 in activemq::transport::filters::TcpTransport::close (this=0xb77c9938) at ./activemq/transport/TransportFilter.h:205
#5  0x080e26d9 in activemq::transport::filters::ResponseCorrelator::close (this=0xb77c9e90)
at activemq/transport/filters/ResponseCorrelator.cpp:238
#6  0x080e3659 in ~ResponseCorrelator (this=0xb77c9e90) at activemq/transport/filters/ResponseCorrelator.cpp:60
#7  0x0806ab00 in activemq::core::ActiveMQConnectionFactory::createConnection (url=@0xb775eaec,
username=@0xb775eae4, password=@0xb775eae8, clientId=@0x45fab90) at activemq/core/ActiveMQConnectionFactory.cpp:177
#8  0x0806b166 in activemq::core::ActiveMQConnectionFactory::createConnection (this=0xb775eae0)
at activemq/core/ActiveMQConnectionFactory.cpp:66

Thread 1 (process 32611):
#0  0x003807a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x02a787a5 in raise () from /lib/tls/libc.so.6
#2  0x02a7a361 in abort () from /lib/tls/libc.so.6
#3  0x0085a0bc in ut_onsig_call_now () from /opt/CSCOacs/db/dbsrv/lib32/libdbtasks10_r.so
#4  <signal handler called>
#5  0x003807a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#6  0x02a787a5 in raise () from /lib/tls/libc.so.6
#7  0x02a7a209 in abort () from /lib/tls/libc.so.6
#8  0x01a2514b in __gnu_cxx::__verbose_terminate_handler () from /usr/lib/libstdc++.so.6
#9  0x01a22e61 in __cxa_call_unexpected () from /usr/lib/libstdc++.so.6
#10 0x01a22e96 in std::terminate () from /usr/lib/libstdc++.so.6
#11 0x01a23545 in __cxa_pure_virtual () from /usr/lib/libstdc++.so.6
#12 0x08095ac5 in activemq::transport::TransportFilter::onTransportException (this=0xb77c9e90,
source=0xb77c9938, ex=@0x9ece188) at activemq/transport/TransportFilter.h:74
#13 0x08095ac5 in activemq::transport::TransportFilter::onTransportException (this=0xb77c9938,
source=0xb775bde0, ex=@0x9ece188) at activemq/transport/TransportFilter.h:74
#14 0x0809713c in activemq::transport::IOTransport::run (this=0xb775bde0) at activemq/transport/IOTransport.h:106
#15 0x080bd51d in activemq::concurrent::Thread::runCallback (param=0xb775dcc0) at activemq/concurrent/Thread.cpp:152
#16 0x008fd371 in start_thread () from /lib/tls/libpthread.so.0
#17 0x02b18ffe in clone () from /lib/tls/libc.so.6

Thread 12 is joining thread 1 as part of destroying the ResponseCorrelator object, while
thread 1 is invoking the exception callback on that same ResponseCorrelator object.

(gdb) thread 1
[Switching to thread 1 (process 32611)]#0  0x003807a2 in _dl_sysinfo_int80 ()
   from /lib/ld-linux.so.2
(gdb) f 12
#12 0x08095ac5 in activemq::transport::TransportFilter::onTransportException (
    this=0xb77c9e90, source=0xb77c9938, ex=@0x9ece188)
    at activemq/transport/TransportFilter.h:74
74      activemq/transport/TransportFilter.h: No such file or directory.
        in activemq/transport/TransportFilter.h
(gdb) set print object on
(gdb) p this
$3 = (activemq::transport::filters::ResponseCorrelator *) 0xb77c9e90

This is the same object that is being destroyed in thread 12:
#6  0x080e3659 in ~ResponseCorrelator (this=0xb77c9e90) at activemq/transport/filters/ResponseCorrelator.cpp:60

The crashed thread and the joined thread are the same (both refer to the Thread object 0xb775dcc0):
thread 12: #2  0x080bd401 in activemq::concurrent::Thread::join (this=0xb775dcc0) at activemq/concurrent/Thread.cpp:102
thread 1: #15 0x080bd51d in activemq::concurrent::Thread::runCallback (param=0xb775dcc0) at activemq/concurrent/Thread.cpp:152

Let's take a closer look at thread 12, at the function ActiveMQConnectionFactory::createConnection(),
to find out why it destroys that ResponseCorrelator:
#7  0x0806ab00 in activemq::core::ActiveMQConnectionFactory::createConnection (url=@0xb775eaec,
username=@0xb775eae4, password=@0xb775eae8, clientId=@0x45fab90) at activemq/core/ActiveMQConnectionFactory.cpp:177
        
        ....
        // Create and Return the new connection object.
        connection = new ActiveMQConnection( connectionData );

        return connection;

    } catch( exceptions::ActiveMQException& ex ) {
        ex.setMark( __FILE__, __LINE__ );

        delete connection;
        delete connector;
        delete transport;   //  <<<<< this is the line 177 >>>>>>>
        delete properties;

        throw ex;
        .....

So it caught an exception and is now cleaning up the allocated objects.

(gdb) f 7
#7  0x0806ab00 in activemq::core::ActiveMQConnectionFactory::createConnection (
    url=@0xb775eaec, username=@0xb775eae4, password=@0xb775eae8,
    clientId=@0x45fab90) at activemq/core/ActiveMQConnectionFactory.cpp:177
177     activemq/core/ActiveMQConnectionFactory.cpp: No such file or directory.
        in activemq/core/ActiveMQConnectionFactory.cpp
(gdb) p ex
$1 = (class activemq::exceptions::ActiveMQException
     &) @0xb77c9898: {<cms::CMSException> = {<> = {<No data fields>},
<No data fields>}, message = {static npos = 4294967295,
    _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>>
= {<No data fields>}, <No data fields>},
      _M_p = 0xb775e134 "activemq::io::SocketOutputStream::write - Connection reset by peer"}},

Yet the object is destroyed before the connection thread is stopped. I think the connection
thread should be stopped prior to cleanup, something like:

    } catch( exceptions::ActiveMQException& ex ) {
        ex.setMark( __FILE__, __LINE__ );

        // ??? stop the thread
        if (transport)
        {
            transport->close();
        }

        delete connection;
        delete connector;
        delete transport;   // <<<< this is the line 177>>>>>>
        delete properties;
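
To illustrate the ordering I have in mind outside of the ActiveMQ-cpp code base, here is a
minimal sketch using std::thread. The names Listener, Reader and cleanupInSafeOrder are
hypothetical stand-ins invented for this example (they are not ActiveMQ-cpp classes): Reader
plays the role of a transport with its own thread, Listener the object it calls back into.
The only point is that the thread is stopped and joined before anything it uses is deleted.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for the object that receives exception callbacks.
struct Listener {
    std::atomic<int> calls{0};
    void onException() { ++calls; }
};

// Hypothetical stand-in for a transport that owns a reader thread.
class Reader {
public:
    explicit Reader(Listener* listener)
        : listener_(listener), running_(true), thread_([this] { run(); }) {}

    ~Reader() { close(); }

    // Ask the loop to stop, then wait for the thread. Safe to call twice.
    void close() {
        running_ = false;
        if (thread_.joinable()) {
            thread_.join();
        }
    }

private:
    void run() {
        while (running_) {
            listener_->onException();  // dangling if the listener were deleted first
            std::this_thread::yield();
        }
    }

    Listener* listener_;
    std::atomic<bool> running_;
    std::thread thread_;  // declared last so the other members exist before it starts
};

int cleanupInSafeOrder() {
    Listener* listener = new Listener();
    Reader* reader = new Reader(listener);

    // Wait until the reader thread has called back at least once.
    while (listener->calls == 0) {
        std::this_thread::yield();
    }

    reader->close();             // stop and join the thread first
    int seen = listener->calls;  // no callback can be running any more
    assert(seen >= 1);
    delete listener;             // now safe: nothing calls into it
    delete reader;
    return seen;
}
```

Swapping the order (deleting the listener before close()) would leave the reader thread free to
call into freed memory, which is the kind of window the traces above seem to show.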




> TransportFilter::fire() crashes after accessing a dangling pointer during exception in ActiveMQConnectionFactory::createConnection()
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQCPP-184
>                 URL: https://issues.apache.org/activemq/browse/AMQCPP-184
>             Project: ActiveMQ C++ Client
>          Issue Type: Bug
>    Affects Versions: 2.1.3
>         Environment: Windows XP/Server 2003
>            Reporter: python
>            Assignee: Timothy Bish
>             Fix For: 2.2.1
>
>
> This problem was seen on:
> Versions:
> ActiveMQ-cpp-2.1.3 (WindowsServer2003/XP)
> ActiveMQ Broker 5.1 (WindowsServer2003)
> This looks similar to issue [AMQCPP-122|https://issues.apache.org/activemq/browse/AMQCPP-122], which was fixed in 2.1, but I don't see how IOTransport::run() and error handling have been properly synchronized.
> Steps to reproduce:
> # Continuously try to reconnect to an activemq broker that has run out of memory.
> # This may eventually produce the crash (could take several hours to produce depending on frequency of reconnect attempts).
> # Running activemq-cpp through purify can help reproduce this problem more easily.
> # A "R6025 pure virtual function call" error message may be printed out to the console when this error happens.
> Backtraces:
> Thread 1:
> {noformat}
> activemq::transport::TransportFilter::fire()  + 0x48 bytes          
> activemq::transport::TransportFilter::fire()  + 0x48 bytes          
> activemq::transport::IOTransport::fire()  + 0x48 bytes                
> activemq::transport::IOTransport::run()  + 0x7f bytes                
> activemq::concurrent::Thread::runCallback()  + 0x45 bytes      
> msvcr80.dll!781329bb()
> {noformat}
> The crash happens on this line:
> 	exceptionListener->onTransportException( this, ex );
> Thread 2:
> {noformat}
> activemq::concurrent::Thread::join()  Line 108       C++
> activemq::transport::IOTransport::close()  Line 143           C++
> activemq::transport::TransportFilter::close()  Line 213       C++
> activemq::transport::filters::TcpTransport::close()  Line 143 + 0xb bytes    C++
> activemq::transport::filters::ResponseCorrelator::close()  Line 241 C++
> activemq::transport::filters::ResponseCorrelator::~ResponseCorrelator()  Line 64   C++
> activemq::transport::filters::ResponseCorrelator::`scalar deleting destructor'()  + 0xf bytes C++
> activemq::core::ActiveMQConnectionFactory::createConnection(const std::basic_string<char,std::char_traits<char>,std::allocator<char>
> activemq::core::ActiveMQConnectionFactory::createConnection()  Line 66 + 0x3a bytes    C++
> {noformat}
> During ActiveMQConnectionFactory::createConnection() an exception is thrown and the transport object is deleted. Unfortunately, while being deleted this object is still being used by Thread#1 (IOTransport::run).
> I greatly reduced the likelihood of this problem by calling setTransportExceptionListener(NULL) in TransportFilter's destructor.
> After doing that, another crash will start to appear (under the same test conditions) with the following backtrace:
> Thread 1:
> {noformat}
> activemq::connector::openwire::OpenWireCommandReader::readCommand()  Line 71 + 0x1e bytes    C++
> activemq::transport::IOTransport::run()  Line 166 + 0x19 bytes       C++
> activemq::concurrent::Thread::runCallback(void * param=0x02a750b0)  Line 152 + 0x13 bytes    C++
> msvcr80d.dll!_callthreadstartex()  Line 348 + 0xf bytes    C
> msvcr80d.dll!_threadstartex(void * ptd=0x02a6b8c0)  Line 331    C
> kernel32.dll!7c80b683() 
> ntdll.dll!7c91b686()     
> {noformat}
> The crash happens on this line:
> 	return openWireFormat->unmarshal( dataInputStream );
> Thread 2:
> {noformat}
> activemq::concurrent::Thread::join()  Line 108       C++
> activemq::transport::IOTransport::close()  Line 143           C++
> activemq::transport::TransportFilter::close()  Line 213       C++
> activemq::transport::filters::TcpTransport::close()  Line 143 + 0xb bytes    C++
> activemq::transport::filters::ResponseCorrelator::close()  Line 241 C++
> activemq::transport::filters::ResponseCorrelator::~ResponseCorrelator()  Line 64   C++
> activemq::transport::filters::ResponseCorrelator::`scalar deleting destructor'()  + 0xf bytes C++
> activemq::core::ActiveMQConnectionFactory::createConnection(const std::basic_string<char,std::char_traits<char>,std::allocator<char>
> activemq::core::ActiveMQConnectionFactory::createConnection()  Line 66 + 0x3a bytes    C++
> {noformat}
> This second problem is similar to the first and seems to be caused when the OpenWireConnector is deleted before IOTransport::close() is called. Since IOTransport::run() tries to use the OpenWireConnector (via OpenWireCommandReader::readCommand()), a crash can occur.
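
Regarding the setTransportExceptionListener(NULL) workaround quoted above: it reportedly only
reduces the likelihood of the crash, presumably because clearing the pointer is not synchronized
with the thread that reads it. A minimal sketch of detaching a listener under a lock follows.
All names here (ExceptionListener, CountingListener, GuardedTransport, detachThenFire) are
hypothetical and invented for illustration; this is not the ActiveMQ-cpp API, only one possible
way to make "clear the listener" and "fire the callback" mutually exclusive.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical callback interface.
struct ExceptionListener {
    virtual ~ExceptionListener() {}
    virtual void onTransportException() = 0;
};

// Hypothetical listener that just counts the callbacks it receives.
struct CountingListener : ExceptionListener {
    int count = 0;
    void onTransportException() override { ++count; }
};

// Hypothetical transport whose listener pointer is guarded by a mutex.
class GuardedTransport {
public:
    // Blocks until any in-flight fire() has returned, then swaps the pointer.
    void setListener(ExceptionListener* listener) {
        std::lock_guard<std::mutex> lock(mutex_);
        listener_ = listener;
    }

    // Would be called from the reader thread. Returns false if nobody is listening.
    bool fire() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (listener_ == nullptr) {
            return false;
        }
        listener_->onTransportException();  // lock is held for the whole callback
        return true;
    }

private:
    std::mutex mutex_;
    ExceptionListener* listener_ = nullptr;
};

// Usage: detach the listener before it goes away; later fire() calls are dropped.
int detachThenFire() {
    GuardedTransport transport;
    CountingListener listener;
    transport.setListener(&listener);
    bool delivered = transport.fire();
    transport.setListener(nullptr);
    bool afterDetach = transport.fire();
    assert(delivered && !afterDetach);
    return listener.count;
}
```

Because the lock is held during the callback, the callback itself must not call setListener()
on the same transport, or it would deadlock on the non-recursive mutex. Joining the thread
before deleting anything is still the simpler option.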

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

