activemq-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Pero Atanasov (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AMQ-5508) Broker shutdown race condition leads to "IllegalStateException: Timer already cancelled"
Date Tue, 06 Jan 2015 18:12:34 GMT
Pero Atanasov created AMQ-5508:
----------------------------------

             Summary: Broker shutdown race condition leads to "IllegalStateException: Timer
already cancelled"
                 Key: AMQ-5508
                 URL: https://issues.apache.org/jira/browse/AMQ-5508
             Project: ActiveMQ
          Issue Type: Bug
          Components: Broker
    Affects Versions: 5.10.0
            Reporter: Pero Atanasov
            Priority: Minor


This issue is seen on the surface with the broker failing to remove a consumer:

2014-12-19 10:45:42,586 | WARN  | Failed to remove consumer:
ID:hack-34927-1419007505006-2:3:0:0 |
org.apache.activemq.broker.TransportConnection | ActiveMQ
BrokerService[broker0] Task-6
java.lang.IllegalStateException: Timer already cancelled.

The full stack trace [STACK_TRACE] will be specified at the end of this report.

In this case, the race condition is exposed via some of the AdvisoryBroker code while trying
to send an advisory message as part of the consumer/producer addition/removal code. Below
are excerpts of code to explain this race condition:

activemq-broker/src/main/java/org/apache/activemq/advisory/AdvisoryBroker.java

Lines 605 - 637 [in fireAdvisory(...)]
if (getBrokerService().isStarted()) {
     // code to create and populate an advisory message
     next.send(producerExchange, advisoryMessage);
     // send will trigger destination creation/addition for this advisory message topic
}

However, it can easily happen that a client (consumer or producer) connects and is added _before_
getBrokerService().isStarted() is true, so in that case, the boolean expression in the block
above will evaluate to false and hence skip the entire advisory message send and create/add
destination process. At this point, the consumer/producer destination does not have its advisory
message destination created. If no other consumer/producer connects on the same destination
until the broker is shutting down, then no advisory destination pairing will be created for
that consumer/producer. When the broker starts the shutdown process, the event that will trigger
an advisory message will be the removal of a consumer/producer as part of the TransportConnection
stopping process. The sequence of calls with line numbers can be seen within the STACK_TRACE.

Back to the above specified code block, when the broker is shutting down, the broker is well
started. Then, the boolean expression will evaluate to true and hence proceed with the advisory
message creation/sending and destination cration/addition process. However, in the meantime,
as part of the shutdown process, the Scheduler timer is being cancelled. Consider the following
blocks of code:

activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java

Lines 756 - 761 [in stop()]
// for each service
     stopper.stop(service);

activemq-client/src/main/java/org/apache/activemq/util/ServiceSupport.java

Lines 67 - 84 [in stop()]
if (stopped.compareAndSet(false, true)) {
    ... some code ...
    doStop(stopper);
    ... some code ...    
}

activemq-client/src/main/java/org/apache/activemq/thread/Scheduler.java

Lines 69 - 71 [in doStop(...)]

if (this.timer != null) {
     this.timer.cancel();
}

At this point, the timer is cancelled. Now, back to the thread removing the consumer. From
the STACK_TRACE, it can be seen that after many calls related to the advisory message destination
creation, it ends up in AbstractRegion’s addDestination method.

activemq-broker/src/main/java/org/apache/activemq/broker/region/AbstractRegion.java

Lines 132 - 145 [in addDestination(...)]
if (dest == null) {
     ... some code ...
     dest = createDestination(context, destination);
     ... some code ...
     dest.start();  
     ... some code ...
}

As it can be seen in the STACK_TRACE below, dest.start() call will trigger the need for a
timer to be scheduled on the same timer that has already been cancelled. Some debug prints
to confirm the race condition:

1. Fail to send advisory message for this consumer advisory topic because getBrokerService().isStarted()
is false:

2015-01-06 08:49:26,215 | INFO  | patanasov: from AdvisoryBroker fireAdvisory would have sent
advisoryMessage: topic ActiveMQ.Advisory.Consumer.Queue.2015.01.06-08.23.00.16401 | org.apache.activemq.advisory.AdvisoryBroker

2. Broker shutting down and cancelling timer:

2015-01-06 08:49:43,602 | INFO  | Apache ActiveMQ 5.10.0 (broker0, ID:host-40197-1420555763149-0:1)
is shutting down | org.apache.activemq.broker.BrokerService | ActiveMQ ShutdownHook
2015-01-06 08:49:43,603 | INFO  | patanasov: from ServiceSupport calling doStop(stopper) |
org.apache.activemq.util.ServiceSupport | ActiveMQ ShutdownHook
2015-01-06 08:49:43,603 | INFO  | patanasov: from Scheduler doStop(ServiceStopper stopper)
calling this.timer.cancel(): timer java.util.Timer@7df33bb0 | org.apache.activemq.thread.Scheduler
| ActiveMQ ShutdownHook

3. Adding destination and scheduling on cancelled timer (7df33bb0):

2015-01-06 08:49:43,767 | INFO  | patanasov: broker0 adding destination: qualified: topic://ActiveMQ.Advisory.Consumer.Queue.2015.01.06-08.23.00.16401,
physical: ActiveMQ.Advisory.Consumer.Queue.2015.01.06-08.23.00.16401 | org.apache.activemq.broker.region.AbstractRegion
| ActiveMQ BrokerService[broker0] Task-12
2015-01-06 08:49:43,782 | INFO  | patanasov: from schedualPeriodically: timer: java.util.Timer@7df33bb0
| org.apache.activemq.thread.Scheduler | ActiveMQ BrokerService[broker0] Task-12
2015-01-06 08:49:43,785 | INFO  | patanasov: caught an exception in schedualPeriodically:
| org.apache.activemq.thread.Scheduler | ActiveMQ BrokerService[broker0] Task-12
java.lang.IllegalStateException: Timer already cancelled.

I will provide a patch to fix this issue. However, the bigger concern here is that advisory
message destination creations fail for clients that connect while getBrokerService().isStarted()
is false. If there is only one consumer/producer on one destination and if it connects while
getBrokerService().isStarted() is false, then it will fail to create/send advisory message
for that destination, at least until the producer/consumer is removed at the end. I will continue
looking to learn more about this state of the broker and open a separate JIRA issue if needed.


STACK_TRACE:
java.lang.IllegalStateException: Timer already cancelled.
    at java.util.Timer.sched(Timer.java:354)
    at java.util.Timer.schedule(Timer.java:222)
    at
org.apache.activemq.thread.Scheduler.schedualPeriodically(Scheduler.java:53)
    at org.apache.activemq.broker.region.Topic.start(Topic.java:563)
    at
org.apache.activemq.broker.region.AbstractRegion.addDestination(AbstractRegion.java:141)
    at
org.apache.activemq.broker.region.RegionBroker.addDestination(RegionBroker.java:325)
    at
org.apache.activemq.broker.BrokerFilter.addDestination(BrokerFilter.java:167)
    at
org.apache.activemq.advisory.AdvisoryBroker.addDestination(AdvisoryBroker.java:184)
    at
org.apache.activemq.broker.BrokerFilter.addDestination(BrokerFilter.java:167)
    at
org.apache.activemq.broker.BrokerFilter.addDestination(BrokerFilter.java:167)
    at
org.apache.activemq.broker.MutableBrokerFilter.addDestination(MutableBrokerFilter.java:172)
    at
org.apache.activemq.broker.region.RegionBroker.send(RegionBroker.java:439)
    at
org.apache.activemq.broker.jmx.ManagedRegionBroker.send(ManagedRegionBroker.java:297)
    at
org.apache.activemq.advisory.AdvisoryBroker.fireAdvisory(AdvisoryBroker.java:615)
    at
org.apache.activemq.advisory.AdvisoryBroker.fireConsumerAdvisory(AdvisoryBroker.java:564)
    at
org.apache.activemq.advisory.AdvisoryBroker.fireConsumerAdvisory(AdvisoryBroker.java:550)
    at
org.apache.activemq.advisory.AdvisoryBroker.removeConsumer(AdvisoryBroker.java:271)
    at
org.apache.activemq.broker.BrokerFilter.removeConsumer(BrokerFilter.java:132)
    at
org.apache.activemq.broker.BrokerFilter.removeConsumer(BrokerFilter.java:132)
    at
org.apache.activemq.broker.MutableBrokerFilter.removeConsumer(MutableBrokerFilter.java:137)
    at
org.apache.activemq.broker.TransportConnection.processRemoveConsumer(TransportConnection.java:651)
    at
org.apache.activemq.broker.TransportConnection.processRemoveSession(TransportConnection.java:690)
    at
org.apache.activemq.broker.TransportConnection.processRemoveConnection(TransportConnection.java:802)
    at
org.apache.activemq.broker.TransportConnection.doStop(TransportConnection.java:1143)
    at
org.apache.activemq.broker.TransportConnection$4.run(TransportConnection.java:1073)
    at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
    at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:619) 




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message