activemq-dev mailing list archives

From "Stirling Chow (JIRA)" <j...@apache.org>
Subject [jira] Created: (AMQ-2723) VM connection leaks during each attempt to create a network bridge to a non-existent broker.
Date Wed, 05 May 2010 22:12:36 GMT
VM connection leaks during each attempt to create a network bridge to a non-existent broker.
--------------------------------------------------------------------------------------------

                 Key: AMQ-2723
                 URL: https://issues.apache.org/activemq/browse/AMQ-2723
             Project: ActiveMQ
          Issue Type: Bug
          Components: Transport
    Affects Versions: 5.3.1
         Environment: ActiveMQ 5.3.1, Windows XP
            Reporter: Stirling Chow
         Attachments: ConnectionLeakTest.java

Symptom
=======
We deployed ActiveMQ in a network of brokers using HTTP as the broker-to-broker transport
and VM as the intra-broker transport.  Each broker uses a SimpleDiscoveryAgent with a list
of HTTP URLs to potential peer brokers -- in many cases this list contains URLs for brokers
that are inactive for long periods of time.  We performed a week-long test with three active
brokers and five inactive brokers.  After one week, the active brokers began reporting OutOfMemory
errors caused by exhausted heap space (384MB max) and they stopped functioning.

The generated heap dump revealed 100K+ instances of DurableConduitBridge and related anonymous
classes in DemandForwardingBridgeSupport.  Our expectation was that since there are only three
active brokers, there should have been at most three instances of DurableConduitBridge.  It
appeared that each attempt to create a bridge to a non-existent broker was resulting in leaking
instances of DurableConduitBridge et al.

Unit Test
=========
A JUnit test is included with this ticket to demonstrate the issue.

Cause
=====
The leaking references to DurableConduitBridge et al. were due to the accumulation of VMTransport
connections in TransportConnector#connections.  It seemed that each failed attempt to create
a network bridge resulted in an instance being added to TransportConnector#connections
that was never removed.  Here's the reason...

Each time a broker attempts to create a network bridge to another broker, a call is made to
DiscoveryNetworkConnector#onServiceAdd(DiscoveryEvent) by SimpleDiscoveryAgent.  The broker
initiating the connection creates a local and remote transport and then attempts to create
a bridge between them:

    remoteTransport = TransportFactory.connect(connectUri);
    ...
    localTransport = createLocalTransport();
    ...
    try {
        bridge.start();
        ...
    } catch (Exception e) {
        ServiceSupport.dispose(localTransport);
        ServiceSupport.dispose(remoteTransport);
        ...
    }

If the remote broker does not exist (as is the case with our environment), bridge.start()
throws an exception which triggers the disposal of the local and remote transports.

The localTransport is an instance of VMTransport, and its disposal will eventually call VMTransport#stop():

    public void stop() throws Exception {
...
                enqueueValve.turnOff();
                if (!disposed) {
                    started = false;
                    disposed = true;
...
                }
            } finally {
                stopping.set(false);
                enqueueValve.turnOn();
            }
...
            // let the peer know that we are disconnecting..
            try {
                oneway(DISCONNECT);
            } catch (Exception ignore) {
            }
        }
    }

The DISCONNECT should get processed by the VMTransport#iterate() on the peer side:

    public boolean iterate() {
...
            if( command == DISCONNECT ) {
                tl.onException(new TransportDisposedIOException(
                        "Peer (" + peer.toString() + ") disposed."));
            } else {
...

tl is a reference to the TransportListener implemented by TransportConnection, and the
onException call should result in a call to TransportConnection#doStop():

    protected void doStop() throws Exception, InterruptedException {
...
        connector.onStopped(this);

The call to connector.onStopped(this) is implemented by TransportConnector#onStopped(TransportConnection):

    public void onStopped(TransportConnection connection) {
        connections.remove(connection);
    }

This removes the connection represented by the local side of the bridge from
TransportConnector#connections.
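
The intended cleanup chain described above can be sketched in isolation.  This is a minimal
model, not the ActiveMQ source: SketchListener, SketchConnector, SketchConnection and
CleanupChainDemo are hypothetical stand-ins for TransportListener, TransportConnector and
TransportConnection, and the real onException/doStop do considerably more work.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for TransportListener (only the method used here).
interface SketchListener {
    void onException(IOException error);
}

// Hypothetical stand-in for TransportConnector: tracks live connections.
class SketchConnector {
    final List<SketchConnection> connections = new CopyOnWriteArrayList<>();

    // Register a new connection, as happens when a VM transport connects.
    SketchConnection accept() {
        SketchConnection connection = new SketchConnection(this);
        connections.add(connection);
        return connection;
    }

    // Mirrors the onStopped(TransportConnection) method quoted above.
    void onStopped(SketchConnection connection) {
        connections.remove(connection);
    }
}

// Hypothetical stand-in for TransportConnection.
class SketchConnection implements SketchListener {
    private final SketchConnector connector;

    SketchConnection(SketchConnector connector) {
        this.connector = connector;
    }

    // A DISCONNECT from the peer surfaces here as an exception.
    @Override
    public void onException(IOException error) {
        doStop();
    }

    void doStop() {
        connector.onStopped(this);
    }
}

class CleanupChainDemo {
    // Returns {size before the disconnect, size after the disconnect}.
    static int[] sizesBeforeAndAfterDisconnect() {
        SketchConnector connector = new SketchConnector();
        SketchConnection connection = connector.accept();
        int before = connector.connections.size();
        connection.onException(new IOException("Peer disposed."));
        int after = connector.connections.size();
        return new int[] { before, after };
    }
}
```

In this model the connection is deregistered only if onException is actually invoked, which
is the step the rest of this report is concerned with.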

*** HOWEVER *** in VMTransport#stop(), the disposed flag is set to true before the call to
oneway(DISCONNECT); this causes oneway(DISCONNECT) to fail because of this check in
VMTransport#oneway(Object):

    public void oneway(Object command) throws IOException {
        if (disposed) {
            throw new TransportDisposedIOException("Transport disposed.");
        }
...

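In other words, the DISCONNECT never makes it to the peer and so is never processed by
TransportConnection (as TransportListener).  As a result, connector.onStopped(this) is never
called and the connection stays in TransportConnector#connections.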
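
The ordering problem can be reproduced with a small self-contained model.  This is a sketch
under simplifying assumptions, not the real VMTransport: LeakyTransport and LeakDemo are
hypothetical names, delivery to the peer is synchronous, and the enqueueValve locking is
left out.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for VMTransport with the ordering described above.
class LeakyTransport {
    static final Object DISCONNECT = new Object();

    LeakyTransport peer;
    // Hypothetical hook standing in for the peer-side TransportListener.
    Runnable onPeerDisconnect = () -> { };
    private boolean disposed;

    void oneway(Object command) throws IOException {
        if (disposed) {
            throw new IOException("Transport disposed.");
        }
        // Simplification: deliver straight to the peer instead of queueing.
        if (command == DISCONNECT) {
            peer.onPeerDisconnect.run();
        }
    }

    void stop() {
        disposed = true;              // flag is set first ...
        try {
            oneway(DISCONNECT);       // ... so this always throws
        } catch (IOException ignore) {
            // swallowed, exactly as in the quoted stop() method
        }
    }
}

class LeakDemo {
    // Returns the number of registered connections left after the local side stops.
    static int connectionsLeftAfterStop() {
        List<LeakyTransport> connections = new CopyOnWriteArrayList<>();
        LeakyTransport local = new LeakyTransport();
        LeakyTransport server = new LeakyTransport();
        local.peer = server;
        server.peer = local;

        // The server side is registered and removes itself when told to disconnect.
        connections.add(server);
        server.onPeerDisconnect = () -> connections.remove(server);

        local.stop();
        return connections.size();    // 1: the DISCONNECT was never delivered
    }
}
```

Each failed bridge attempt corresponds to one such stop() call, so in this model the list
grows by one entry per attempt.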

Solution
========
The solution is to send the DISCONNECT before setting the disposed flag to true.  However,
care must be taken to prevent deadlock since VMTransport#stop() acquires a lock on enqueueValve
and VMTransport#oneway(Object) acquires a lock on peer.enqueueValve --- if both peers try
to stop concurrently, each may hold its own enqueueValve while waiting for the other's.  To
prevent this deadlock, it is necessary to send the DISCONNECT before the "local" enqueueValve
is acquired --- this may mean sending the DISCONNECT unnecessarily (i.e., even if the transport
is already disposed), but this is not a problem since the resulting exception is ignored.
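
A minimal sketch of the proposed ordering follows.  It is not the actual patch: FixedTransport
and FixDemo are hypothetical names, a ReentrantLock stands in for enqueueValve, and delivery
to the peer is synchronous.  The point it illustrates is that stop() never holds its own lock
while waiting for the peer's lock.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for VMTransport with the proposed ordering.
class FixedTransport {
    static final Object DISCONNECT = new Object();

    // Assumption: a plain lock stands in for enqueueValve.
    final ReentrantLock valve = new ReentrantLock();
    FixedTransport peer;
    Runnable onPeerDisconnect = () -> { };
    private volatile boolean disposed;

    void oneway(Object command) throws IOException {
        if (disposed) {
            throw new IOException("Transport disposed.");
        }
        peer.valve.lock();            // only the peer's lock is taken here
        try {
            if (command == DISCONNECT) {
                peer.onPeerDisconnect.run();
            }
        } finally {
            peer.valve.unlock();
        }
    }

    void stop() {
        // 1. Notify the peer first, while no local lock is held.
        try {
            oneway(DISCONNECT);
        } catch (IOException ignore) {
            // already disposed: the redundant send is harmless
        }
        // 2. Only then take the local lock and mark the transport disposed.
        valve.lock();
        try {
            disposed = true;
        } finally {
            valve.unlock();
        }
    }
}

class FixDemo {
    // Returns the number of registered connections left after the local side stops.
    static int connectionsLeftAfterStop() {
        List<FixedTransport> connections = new CopyOnWriteArrayList<>();
        FixedTransport local = new FixedTransport();
        FixedTransport server = new FixedTransport();
        local.peer = server;
        server.peer = local;
        connections.add(server);
        server.onPeerDisconnect = () -> connections.remove(server);
        local.stop();
        return connections.size();    // 0: the DISCONNECT reached the peer
    }

    // Stops both peers from separate threads; true if both calls return.
    static boolean concurrentStopCompletes() {
        FixedTransport a = new FixedTransport();
        FixedTransport b = new FixedTransport();
        a.peer = b;
        b.peer = a;
        Thread ta = new Thread(a::stop);
        Thread tb = new Thread(b::stop);
        ta.start();
        tb.start();
        try {
            ta.join(5000);
            tb.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !ta.isAlive() && !tb.isAlive();
    }
}
```

Had stop() called oneway(DISCONNECT) while holding its own lock, two peers stopping at the
same time could each wait on the other's lock; sending first avoids that ordering.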


