activemq-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Brett Humphreys (JIRA)" <j...@apache.org>
Subject [jira] Commented: (AMQ-1116) deadlock when shutting down client that is configured with failover=true and is presently disconnected from broker
Date Sun, 29 Mar 2009 00:56:43 GMT

    [ https://issues.apache.org/activemq/browse/AMQ-1116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=50878#action_50878
] 

Brett Humphreys commented on AMQ-1116:
--------------------------------------

I agree with Jay Repko.  I'm using AMQ 5.1.0, I've got the following JUnit test to try to
reproduce what I'm experiencing:
{noformat}
  @Test(timeout=15000)
  public void testRetryServerConnectionURLWithNoServer() throws Exception
  {
      final String url="tcp://localhost:61620";
      final String clientUrl = "failover://(" + url + ")?maxReconnectAttempts=0";
      ActiveMQConnectionFactory activeMQConnectionFactory = new ActiveMQConnectionFactory(clientUrl);
      activeMQConnectionFactory.setTransportListener(new PrintingTransportListener());
      final Connection conn = activeMQConnectionFactory.createConnection();
      ((ActiveMQConnection)conn).setCloseTimeout(5);
      Thread clientThread=new Thread("AMQTestClientThread")
      {
          public void run()
          {
              try
              {
                  conn.start();
                  Thread.sleep(2000);
                  conn.stop();
                  conn.close();
              }
              catch(Exception e)
              {
                  thrownException = e;
              }
          }
      };

      clientThread.start();
      Thread.sleep(2000);
      System.out.println("Attempting to shutdown");
      conn.close();
      System.out.println("Shutdown, joining.");
      clientThread.join();
      if(thrownException !=null)
      {
          throw thrownException;
      }
  }

{noformat} 

This test always times out after 15 seconds, setCloseTimeout doesn't seem to change any behavior.

Further, when I kill -3 this, I get the following three thread stack traces (The client thread,
the reconnecting thread & the stopping thread:

{noformat} 
"AMQTestClientThread" prio=1 tid=0x00002aaafc0069f0 nid=0x6f69 in Object.wait() [0x000000004101d000..0x000000004101dc80]
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00002aaadf2bc5f8> (a java.lang.Object)
	at org.apache.activemq.transport.failover.FailoverTransport.oneway(FailoverTransport.java:407)
	- locked <0x00002aaadf2bc5f8> (a java.lang.Object)
	at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:40)
	- locked <0x00002aaadf2bc158> (a java.lang.Object)
	at org.apache.activemq.transport.ResponseCorrelator.asyncRequest(ResponseCorrelator.java:74)
	at org.apache.activemq.transport.ResponseCorrelator.request(ResponseCorrelator.java:79)
	at org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1195)
	at org.apache.activemq.ActiveMQConnection.ensureConnectionInfoSent(ActiveMQConnection.java:1289)
	- locked <0x00002aaadf2ab250> (a org.apache.activemq.ActiveMQConnection)
	at org.apache.activemq.ActiveMQConnection.start(ActiveMQConnection.java:456)
	at com.kentrox.objectproxy.jms.test.JMSConnectionStateTesting$2.run(JMSConnectionStateTesting.java:235)

"ActiveMQ Task" daemon prio=1 tid=0x00002aaafc02ab80 nid=0x6f68 in Object.wait() [0x0000000040f1c000..0x0000000040f1ce00]
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00002aaadf2bc618> (a java.lang.Object)
	at org.apache.activemq.transport.failover.FailoverTransport.doReconnect(FailoverTransport.java:735)
	- locked <0x00002aaadf2bc618> (a java.lang.Object)
	at org.apache.activemq.transport.failover.FailoverTransport$2.iterate(FailoverTransport.java:114)
	- locked <0x00002aaadf2bc608> (a java.lang.Object)
	at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:122)
	at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:43)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
	at java.lang.Thread.run(Thread.java:595)

"pool-1-thread-1" prio=1 tid=0x00002aaafc001670 nid=0x6f67 waiting for monitor entry [0x0000000040e1b000..0x0000000040e1bd80]
	at org.apache.activemq.ActiveMQConnection.checkClosedOrFailed(ActiveMQConnection.java:1258)
	- waiting to lock <0x00002aaadf2ab250> (a org.apache.activemq.ActiveMQConnection)
	at org.apache.activemq.ActiveMQConnection.stop(ActiveMQConnection.java:497)
	at org.apache.activemq.ActiveMQConnection.close(ActiveMQConnection.java:552)
	at com.kentrox.objectproxy.jms.test.JMSConnectionStateTesting.testRetryServerConnectionURLWithNoServer(JMSConnectionStateTesting.java:250)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:585)
	at org.junit.internal.runners.TestMethod.invoke(TestMethod.java:59)
	at org.junit.internal.runners.MethodRoadie.runTestMethod(MethodRoadie.java:98)
	at org.junit.internal.runners.MethodRoadie$1$1.call(MethodRoadie.java:55)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
	at java.util.concurrent.FutureTask.run(FutureTask.java:123)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
	at java.lang.Thread.run(Thread.java:595)
{noformat} 

This may be updated in 5.2.0 but this issue suggests it is fixed in 5.1.0 and I don't believe
that to be the case.

> deadlock when shutting down client that is configured with failover=true and is presently
disconnected from broker
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-1116
>                 URL: https://issues.apache.org/activemq/browse/AMQ-1116
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Transport
>    Affects Versions: 4.0.2
>            Reporter: Chris Hofstaedter
>            Assignee: Rob Davies
>             Fix For: 5.1.0
>
>         Attachments: ConnectionPool.java, DemandForwardingBridgeSupport.java, FailoverTransport.java,
FailoverTransport.java, FailoverTransport_AMQ-1116.patch
>
>
> Sounds like a great idea to me :) Please go ahead and submit a jira with the patch and
we'll get it integrated ASAP
> On 1/4/07, Chris Hofstaedter <chrish@nmwco.com> wrote:
> > I ran into this issue as well on <= 4.0.2.  I never tested it against
> > 4.1.    The FailoverTransport doesn't shutdown if it is disconnected at
> > the time of the shutdown.
> >
> > I believe that the root cause is in the 
> > FailoverTransport.oneway(Command
> > command) function which will hold onto the command and keep trying to
> > reconnect until the message is sent or the transport is disposed.
> >
> > I applied a local patch that seems to solve the problem.  Note that 
> > I've only tested the patch against 4.0.2.
> >
> > Specifically, I early return from the oneway function if the command 
> > is ShutdownInfo and the connectedTransport is null.  This seems to 
> > solve the problem in that I get clean shutdowns on the client when 
> > failover is true and the client is presently disconnected.
> >
> > Does this seem like a reasonable patch?  Should I create a JIRA with 
> > this info?
> >
> >
> >
> > -----Original Message-----
> > From: James Strachan [mailto:james.strachan@gmail.com]
> > Sent: Tuesday, December 12, 2006 4:46 AM
> > To: activemq-users@geronimo.apache.org
> > Subject: Re: failover mode and client shutdown
> >
> > There could be a bug relating to closes with the failover transport 
> > possibly, but the ActiveMQConnection does wait up to the closeTimeout 
> > for a close to succeed before shutting down - so you could try reduce 
> > the timeout.
> >
> > http://incubator.apache.org/activemq/maven/activemq-core/apidocs/org/a
> > pa
> > che/activemq/ActiveMQConnection.html#setCloseTimeout(int)
> >
> >
> > On 12/12/06, Keith Irwin <keith.irwin@gmail.com> wrote:
> > > Folks--
> > >
> > > When we have clients running and we take down AMQ (<= 4.1.0), then 
> > > attempt to shutdown the clients with Control-C (rather than kill the 
> > > JVM with a -9), the clients won't shut down.  It's as if a "close" 
> > > on the failover connection never reaches the amq client classes.
> > >
> > > I note that in the 4.1.0 release notes, this issue is referenced, 
> > > and the advice is to set the maxReconnectAttempts (or similar) 
> > > property to something non-zero.
> > >
> > > The problem is that we don't want there to be a max number of 
> > > attempts.  Unless we specifically want to take down the client (say, 
> > > for an apt-get package upgrade), we want it to keep on trying 
> > > forever.
> > >
> > > SO, my question: Is there an architectural reason for not being able 
> > > to close a failover connection if AMQ is down?
> > >
> > > If it isn't impossible due to tradeoffs elsewhere in the code base, 
> > > we might be willing to submit a patch to fix the issue.
> > >
> > > Our only other recourse is to attempt to close the connections in 
> > > separate threads, then timeout those threads after a while and fall 
> > > out the end of the java process.
> > >
> > > For instance:
> > >
> > >   Thread th = new Thread(new Runnable() {
> > >       public void run() {
> > >          connection.close();
> > >       }
> > >    });
> > >    th.start();
> > >
> > >    // give up after 2 seconds
> > >    Thread.currentThread().join(2000);
> > >
> > > I guess this is do-able, but it seems, you know, some how, well,
> > wrong.
> > >
> > > So, is it worth investigating a patch to AMQ?
> > >
> > > Keith
> > >

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message