activemq-dev mailing list archives

From "Dejan Bosanac (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (AMQ-2723) VM connection leaks during each attempt to create a network bridge to a non-existent broker.
Date Mon, 10 May 2010 07:11:43 GMT

     [ https://issues.apache.org/activemq/browse/AMQ-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dejan Bosanac resolved AMQ-2723.
--------------------------------

         Assignee: Dejan Bosanac
    Fix Version/s: 5.4.0
       Resolution: Duplicate

Thanks for verifying. Resolving the issue now.

> VM connection leaks during each attempt to create a network bridge to a non-existent broker.
> --------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2723
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2723
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Transport
>    Affects Versions: 5.3.1
>         Environment: ActiveMQ 5.3.1, Windows XP
>            Reporter: Stirling Chow
>            Assignee: Dejan Bosanac
>             Fix For: 5.4.0
>
>         Attachments: ConnectionLeakTest.java, VMTransport.java, VMTransport.patch
>
>
> Symptom
> ========
> We deployed ActiveMQ in a network of brokers using HTTP as the broker-to-broker transport
> and VM as the intra-broker transport.  Each broker uses a SimpleDiscoveryAgent with a list
> of HTTP URLs to potential peer brokers -- in many cases this list contains URLs for brokers
> that are inactive for long periods of time.  We performed a week-long test with three active
> brokers and five inactive brokers.  After one week, the active brokers began reporting
> OutOfMemory exceptions related to exhausted heap space (384MB max) and they stopped functioning.
> The generated heap dump revealed 100K+ instances of DurableConduitBridge and related
> anonymous classes in DemandForwardingBridgeSupport.  Our expectation was that since there
> are only three active brokers, there should have been at most three instances of
> DurableConduitBridge.  It appeared that each attempt to create a bridge to a non-existent
> broker was leaking instances of DurableConduitBridge et al.
> Unit Test
> =======
> A JUnit test is included with this ticket to demonstrate the issue.
> Cause
> =====
> The leaking references to DurableConduitBridge et al. were due to the accumulation of
> VMTransport connections in TransportConnector#connections.  It seemed that each failed attempt
> to create a network bridge added an instance to TransportConnector#connections that was
> never removed.  Here's the reason...
> Each time a broker attempts to create a network bridge to another broker, SimpleDiscoveryAgent
> calls DiscoveryNetworkConnector#onServiceAdd(DiscoveryEvent).  The broker initiating the
> connection creates a local and a remote transport and then attempts to create a bridge
> between them:
>    remoteTransport = TransportFactory.connect(connectUri);
> ...
>    localTransport = createLocalTransport();
> ...
>    try {
>       bridge.start();
>       ...
>    } catch (Exception e) {
>       ServiceSupport.dispose(localTransport);
>       ServiceSupport.dispose(remoteTransport);
>       ...
>    }
> If the remote broker does not exist (as is the case with our environment), bridge.start()
> throws an exception which triggers the disposal of the local and remote transports.
> The localTransport is an instance of VMTransport, and its disposal will eventually call
> VMTransport#stop():
>     public void stop() throws Exception {
> ...
>                 enqueueValve.turnOff();
>                 if (!disposed) {
>                     started = false;
>                     disposed = true;
> ...
>                 }
>             } finally {
>                 stopping.set(false);
>                 enqueueValve.turnOn();
>             }
> ...
>             // let the peer know that we are disconnecting..
>             try {
>                 oneway(DISCONNECT);
>             } catch (Exception ignore) {
>             }
>         }
>     }
> The DISCONNECT should get processed by the VMTransport#iterate() on the peer side:
>     public boolean iterate() {
> ...
>             if( command == DISCONNECT ) {
>                 tl.onException(new TransportDisposedIOException("Peer (" + peer.toString() + ") disposed."));
>             } else {
> ...
> tl is a reference to the TransportListener implemented by TransportConnection; the
> onException call should result in a call to TransportConnection#doStop():
>     protected void doStop() throws Exception, InterruptedException {
> ...
>         connector.onStopped(this);
> The call to connector.onStopped(this) is implemented by TransportConnector#onStopped(TransportConnection):
>     public void onStopped(TransportConnection connection) {
>         connections.remove(connection);
>     }
> This removes the connection represented by the local side of the bridge from the connections
> collection.
> *** HOWEVER *** in VMTransport#stop(), the disposed flag is set to true before the call
> to oneway(DISCONNECT); this causes the oneway(DISCONNECT) to fail because of this code in
> VMTransport#oneway(Object):
>     public void oneway(Object command) throws IOException {
>         if (disposed) {
>             throw new TransportDisposedIOException("Transport disposed.");
>         }
> ...
> In other words, the DISCONNECT never makes it to the peer and so is never processed by
> TransportConnection (as TransportListener).
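
The ordering problem described above can be reproduced in isolation. The following is a minimal sketch, not the ActiveMQ code: DisconnectLeakSketch, Connector, ToyTransport and onPeerDisconnect are hypothetical stand-ins for TransportConnector, VMTransport and the TransportListener callback, and locking is left out entirely. It only shows that setting the disposed flag before sending DISCONNECT leaves the connection registered.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Toy model of the failure mode. These classes are simplified, hypothetical
// stand-ins, not the real ActiveMQ VMTransport / TransportConnector.
final class DisconnectLeakSketch {

    /** Stand-in for TransportConnector and its connections collection. */
    static final class Connector {
        final List<Object> connections = new ArrayList<>();

        void onStopped(Object connection) {
            connections.remove(connection);
        }
    }

    /** Stand-in for one end of a VM transport pair. */
    static final class ToyTransport {
        ToyTransport peer;
        Runnable onPeerDisconnect = () -> { };   // plays the role of the listener callback
        private boolean disposed;

        void oneway(String command) throws IOException {
            if (disposed) {
                throw new IOException("Transport disposed.");
            }
            if ("DISCONNECT".equals(command)) {
                peer.onPeerDisconnect.run();
            }
        }

        // Same ordering as in the report: set the flag first, notify afterwards.
        void stop() {
            disposed = true;
            try {
                oneway("DISCONNECT");            // always throws, because disposed is already true
            } catch (IOException ignore) {
                // swallowed, so the failure is silent
            }
        }
    }

    /** Returns how many connections are still registered after the local side stops. */
    static int connectionsLeftAfterStop() {
        Connector connector = new Connector();
        ToyTransport local = new ToyTransport();
        ToyTransport remote = new ToyTransport();   // broker-side end of the pair
        local.peer = remote;
        remote.peer = local;

        Object connection = new Object();
        connector.connections.add(connection);
        remote.onPeerDisconnect = () -> connector.onStopped(connection);

        local.stop();
        return connector.connections.size();
    }

    public static void main(String[] args) {
        System.out.println("connections left: " + connectionsLeftAfterStop());
    }
}
```

In this toy model the connection stays in the list after stop(), which corresponds to one leaked entry per failed bridge attempt.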
> Solution
> =======
> The solution is to send the DISCONNECT before setting the disposed flag to true.  However,
> care must be taken to prevent deadlock, since VMTransport#stop() acquires a lock on enqueueValve
> and VMTransport#oneway(Object) acquires a lock on peer.enqueueValve.  If both peers try
> to stop concurrently, each may hold its own enqueueValve while waiting for the other's, and
> they deadlock.  To prevent this, it is necessary to send the DISCONNECT before the "local"
> enqueueValve is acquired.  This may mean sending the DISCONNECT unnecessarily (i.e., even if
> the client is already disposed), but this is not a problem since the resulting exception is ignored.
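
The ordering proposed above can be sketched as follows. This is an illustration under stated assumptions, not the attached VMTransport.patch: DisconnectFirstSketch, ToyTransport, enqueueLock and onPeerDisconnect are hypothetical names, and a ReentrantLock stands in for enqueueValve. The point is only that stop() notifies the peer while holding no local lock and marks itself disposed afterwards.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Toy model of the proposed ordering. Hypothetical stand-ins only,
// not the real ActiveMQ classes or the attached patch.
final class DisconnectFirstSketch {

    static final class ToyTransport {
        final ReentrantLock enqueueLock = new ReentrantLock();   // stand-in for enqueueValve
        ToyTransport peer;
        Runnable onPeerDisconnect = () -> { };
        private volatile boolean disposed;

        void oneway(String command) throws IOException {
            if (disposed) {
                throw new IOException("Transport disposed.");
            }
            peer.enqueueLock.lock();             // delivery only takes the peer's lock
            try {
                if ("DISCONNECT".equals(command)) {
                    peer.onPeerDisconnect.run();
                }
            } finally {
                peer.enqueueLock.unlock();
            }
        }

        void stop() {
            // Step 1: notify the peer while holding no local lock.
            try {
                oneway("DISCONNECT");
            } catch (IOException ignore) {
                // already disposed: the DISCONNECT was unnecessary, which is harmless
            }
            // Step 2: only now take the local lock and mark this side disposed.
            enqueueLock.lock();
            try {
                disposed = true;
            } finally {
                enqueueLock.unlock();
            }
        }
    }

    /** Returns how many connections are still registered after the local side stops twice. */
    static int connectionsLeftAfterStop() {
        List<Object> connections = new ArrayList<>();
        ToyTransport local = new ToyTransport();
        ToyTransport remote = new ToyTransport();
        local.peer = remote;
        remote.peer = local;

        Object connection = new Object();
        connections.add(connection);
        remote.onPeerDisconnect = () -> connections.remove(connection);

        local.stop();
        local.stop();   // second call hits the disposed check; the exception is ignored
        return connections.size();
    }

    public static void main(String[] args) {
        System.out.println("connections left: " + connectionsLeftAfterStop());
    }
}
```

Because a stopping transport in this sketch never waits for the peer's lock while holding its own, two ends stopping at the same time cannot block each other in the way described above.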

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

