activemq-issues mailing list archives

From "Carlo Dapor (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AMQ-6601) A-MQ with 2 active brokers, shutting down slave runs into dead-lock
Date Wed, 15 Feb 2017 15:41:41 GMT

     [ https://issues.apache.org/jira/browse/AMQ-6601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carlo Dapor updated AMQ-6601:
-----------------------------
    Description: 
We have two Karaf instances configured as ActiveMQ brokers, broker-amq (b1) and broker2-amq (b2).

They run on the same machine and use KahaDB with file locking.
Whichever of b1 or b2 is started first becomes the master.
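
As a rough illustration of how a shared file lock can decide which broker becomes master, the sketch below polls for an exclusive lock using `java.nio`. It is not the KahaDB or ActiveMQ implementation; the class `SharedLockSketch`, its methods, and the retry parameters are hypothetical names chosen for this example only.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch: whoever gets the exclusive file lock first acts as master. */
final class SharedLockSketch {

    /** Polls for an exclusive lock; returns it, or null after maxAttempts failures. */
    static FileLock acquire(FileChannel channel, int maxAttempts, long sleepMillis)
            throws IOException, InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                FileLock lock = channel.tryLock();
                if (lock != null) {
                    return lock; // lock obtained: this caller would be the master
                }
            } catch (OverlappingFileLockException heldInThisJvm) {
                // Within one JVM a held lock is reported by exception, not by null.
            }
            Thread.sleep(sleepMillis); // a waiting "slave" keeps sleeping and retrying
        }
        return null;
    }

    /** Returns { firstAcquired, secondAcquired } for two contenders on one temp file. */
    static boolean[] demo() {
        try {
            Path file = Files.createTempFile("lock-sketch", ".lock");
            try (FileChannel first = FileChannel.open(file, StandardOpenOption.WRITE);
                 FileChannel second = FileChannel.open(file, StandardOpenOption.WRITE)) {
                FileLock master = acquire(first, 1, 10);
                FileLock slave = acquire(second, 3, 10);
                boolean[] result = { master != null, slave != null };
                if (slave != null) {
                    slave.release();
                }
                if (master != null) {
                    master.release();
                }
                return result;
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("first acquired: " + result[0] + ", second acquired: " + result[1]);
    }
}
```

In the sketch both contenders live in one JVM, so the second attempt fails through `OverlappingFileLockException`; between two separate processes, as with b1 and b2, `tryLock()` would instead return `null` while the other process holds the lock.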

When the other one, the slave, is shut down while the master is running, it hits a dead-lock; in the end it must be killed manually with `kill -9`.

We have a classic dead-lock scenario.  I have attached a `jstack` output taken while the slave broker is shutting down.

The race is on between thread #20 and thread #17.
Thread #17 is in 
{code:java}
ActiveMQServiceFactory.destroy(ActiveMQServiceFactory.java:173)
{code}

and thread #20 is in
{code:java}
ActiveMQServiceFactory.updated(ActiveMQServiceFactory.java:140)
{code}

{code}
"CM Configuration Updater (ManagedServiceFactory Update: factoryPid=[org.apache.activemq.server])" #20 daemon prio=5 os_prio=0 tid=0x00007f793c160800 nid=0x7084 waiting on condition [0x00007f799819f000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at java.lang.Thread.sleep(Thread.java:340)
        at java.util.concurrent.TimeUnit.sleep(TimeUnit.java:386)
        at org.apache.activemq.store.SharedFileLocker.doStart(SharedFileLocker.java:83)
        at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:55)
        at org.apache.activemq.broker.LockableServiceSupport.preStart(LockableServiceSupport.java:94)
        at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:54)
        at org.apache.activemq.broker.BrokerService.doStartPersistenceAdapter(BrokerService.java:674)
        at org.apache.activemq.broker.BrokerService.startPersistenceAdapter(BrokerService.java:658)
        at org.apache.activemq.broker.BrokerService.start(BrokerService.java:622)
        at org.apache.activemq.osgi.ActiveMQServiceFactory.updated(ActiveMQServiceFactory.java:140)
        - locked <0x000000072bd74db0> (a org.apache.activemq.osgi.ActiveMQServiceFactory)
        at Proxy8890d2d1_e3a3_4b71_a7a0_88810df56856.updated(Unknown Source)
        at org.apache.felix.cm.impl.helper.ManagedServiceFactoryTracker.updated(ManagedServiceFactoryTracker.java:159)
        at org.apache.felix.cm.impl.helper.ManagedServiceFactoryTracker.provideConfiguration(ManagedServiceFactoryTracker.java:93)
        at org.apache.felix.cm.impl.ConfigurationManager$ManagedServiceFactoryUpdate.provide(ConfigurationManager.java:1597)
        at org.apache.felix.cm.impl.ConfigurationManager$ManagedServiceFactoryUpdate.run(ConfigurationManager.java:1540)
        at org.apache.felix.cm.impl.UpdateThread.run(UpdateThread.java:103)
        at java.lang.Thread.run(Thread.java:745)
 
"Thread-4" #19 daemon prio=5 os_prio=0 tid=0x00007f7940002800 nid=0x7081 runnable [0x00007f79984b4000]
   java.lang.Thread.State: RUNNABLE
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
        at java.net.ServerSocket.implAccept(ServerSocket.java:545)
        at java.net.ServerSocket.accept(ServerSocket.java:513)
        at org.apache.karaf.main.ShutdownSocketThread.run(ShutdownSocketThread.java:56)
 
"Thread-3" #18 prio=5 os_prio=0 tid=0x00007f79d0c48800 nid=0x7080 waiting on condition [0x00007f79985b5000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.karaf.main.Main.doMonitor(Main.java:299)
        at org.apache.karaf.main.Main.access$100(Main.java:65)
        at org.apache.karaf.main.Main$1.run(Main.java:275)
 
"FelixStartLevel" #17 daemon prio=5 os_prio=0 tid=0x00007f79d0c48000 nid=0x707f waiting for monitor entry [0x00007f79986b5000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.activemq.osgi.ActiveMQServiceFactory.destroy(ActiveMQServiceFactory.java:173)
        - waiting to lock <0x000000072bd74db0> (a org.apache.activemq.osgi.ActiveMQServiceFactory)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.aries.blueprint.utils.ReflectionUtils.invoke(ReflectionUtils.java:299)
        at org.apache.aries.blueprint.container.BeanRecipe.invoke(BeanRecipe.java:980)
        at org.apache.aries.blueprint.container.BeanRecipe.destroy(BeanRecipe.java:887)
        at org.apache.aries.blueprint.container.BlueprintRepository.destroy(BlueprintRepository.java:329)
        at org.apache.aries.blueprint.container.BlueprintContainerImpl.destroyComponents(BlueprintContainerImpl.java:765)
        at org.apache.aries.blueprint.container.BlueprintContainerImpl.tidyupComponents(BlueprintContainerImpl.java:964)
        at org.apache.aries.blueprint.container.BlueprintContainerImpl.destroy(BlueprintContainerImpl.java:909)
        at org.apache.aries.blueprint.container.BlueprintExtender$3.run(BlueprintExtender.java:325)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.aries.blueprint.container.BlueprintExtender.destroyContainer(BlueprintExtender.java:346)
        at org.apache.aries.blueprint.container.BlueprintExtender.modifiedBundle(BlueprintExtender.java:238)
        at org.apache.aries.util.tracker.hook.BundleHookBundleTracker$Tracked.customizerModified(BundleHookBundleTracker.java:500)
        at org.apache.aries.util.tracker.hook.BundleHookBundleTracker$Tracked.customizerModified(BundleHookBundleTracker.java:433)
        at org.apache.aries.util.tracker.hook.BundleHookBundleTracker$AbstractTracked.track(BundleHookBundleTracker.java:725)
        at org.apache.aries.util.tracker.hook.BundleHookBundleTracker$Tracked.bundleChanged(BundleHookBundleTracker.java:463)
        at org.apache.aries.util.tracker.hook.BundleHookBundleTracker$BundleEventHook.event(BundleHookBundleTracker.java:422)
        at org.apache.felix.framework.util.SecureAction.invokeBundleEventHook(SecureAction.java:1103)
        at org.apache.felix.framework.util.EventDispatcher.createWhitelistFromHooks(EventDispatcher.java:695)
        at org.apache.felix.framework.util.EventDispatcher.fireBundleEvent(EventDispatcher.java:483)
        at org.apache.felix.framework.Felix.fireBundleEvent(Felix.java:4403)
        at org.apache.felix.framework.Felix.stopBundle(Felix.java:2520)
        at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1309)
        at org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:304)
        at java.lang.Thread.run(Thread.java:745)
{code}
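
The dump above shows thread #20 holding the `ActiveMQServiceFactory` monitor (`locked <0x000000072bd74db0>`) while it sleeps inside `SharedFileLocker.doStart`, and thread #17 blocked in `destroy` waiting for that same monitor. The following minimal sketch reproduces only that pattern; `FactorySketch` and its fields are hypothetical stand-ins, not the ActiveMQ source, and the flag `lockAvailable` merely plays the role of a file lock that another process never releases.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a factory whose lifecycle methods share one monitor. */
final class FactorySketch {
    private final AtomicBoolean lockAvailable = new AtomicBoolean(false);
    private final CountDownLatch insideUpdated = new CountDownLatch(1);

    /** Mimics updated(): keeps the monitor while polling for a lock that never frees up. */
    synchronized void updated() throws InterruptedException {
        insideUpdated.countDown();
        while (!lockAvailable.get()) {
            TimeUnit.MILLISECONDS.sleep(50); // sleeping does not release the monitor
        }
    }

    /** Mimics destroy(): needs the same monitor, so it cannot run while updated() polls. */
    synchronized void destroy() {
        // nothing to do in the sketch
    }

    /** Starts both threads and reports the state the destroy() caller ends up in. */
    static Thread.State observeDestroyState() {
        FactorySketch factory = new FactorySketch();
        Thread updater = new Thread(() -> {
            try {
                factory.updated();
            } catch (InterruptedException e) {
                // interrupted: leave updated() and thereby release the monitor
            }
        }, "updater");
        Thread stopper = new Thread(factory::destroy, "stopper");
        updater.setDaemon(true);
        stopper.setDaemon(true);
        try {
            updater.start();
            factory.insideUpdated.await(); // updater now owns the monitor
            stopper.start();
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (stopper.getState() != Thread.State.BLOCKED
                    && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
            Thread.State observed = stopper.getState();
            updater.interrupt(); // break the polling loop so both threads can finish
            stopper.join(2000);
            return observed;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("destroy() caller state: " + observeDestroyState());
    }
}
```

In the sketch the polling thread can be interrupted, which releases the monitor and lets `destroy()` proceed. Whether an interrupt, a bounded wait, or a different locking scheme would be appropriate in the real factory is a separate design question that this example does not try to answer.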


  was:
We have two Karaf instances configured as ActiveMQ brokers, broker-amq (b1) and broker2-amq (b2).

They run on the same machine and use KahaDB with file locking.
Whichever of b1 or b2 is started first becomes the master.

When the other one, the slave, is shut down while the master is running, it takes forever to shut down; in the end it must be killed with `kill -9`.

We have a classic dead-lock scenario.  I have attached a `jstack` output taken while the slave broker is shutting down.

The race is on between thread #20 and thread #17.
Thread #17 is in 
{code:java}
ActiveMQServiceFactory.destroy(ActiveMQServiceFactory.java:173)
{code}

and thread #20 is in
{code:java}
ActiveMQServiceFactory.updated(ActiveMQServiceFactory.java:140)
{code}

{code}
"CM Configuration Updater (ManagedServiceFactory Update: factoryPid=[org.apache.activemq.server])" #20 daemon prio=5 os_prio=0 tid=0x00007f793c160800 nid=0x7084 waiting on condition [0x00007f799819f000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at java.lang.Thread.sleep(Thread.java:340)
        at java.util.concurrent.TimeUnit.sleep(TimeUnit.java:386)
        at org.apache.activemq.store.SharedFileLocker.doStart(SharedFileLocker.java:83)
        at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:55)
        at org.apache.activemq.broker.LockableServiceSupport.preStart(LockableServiceSupport.java:94)
        at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:54)
        at org.apache.activemq.broker.BrokerService.doStartPersistenceAdapter(BrokerService.java:674)
        at org.apache.activemq.broker.BrokerService.startPersistenceAdapter(BrokerService.java:658)
        at org.apache.activemq.broker.BrokerService.start(BrokerService.java:622)
        at org.apache.activemq.osgi.ActiveMQServiceFactory.updated(ActiveMQServiceFactory.java:140)
        - locked <0x000000072bd74db0> (a org.apache.activemq.osgi.ActiveMQServiceFactory)
        at Proxy8890d2d1_e3a3_4b71_a7a0_88810df56856.updated(Unknown Source)
        at org.apache.felix.cm.impl.helper.ManagedServiceFactoryTracker.updated(ManagedServiceFactoryTracker.java:159)
        at org.apache.felix.cm.impl.helper.ManagedServiceFactoryTracker.provideConfiguration(ManagedServiceFactoryTracker.java:93)
        at org.apache.felix.cm.impl.ConfigurationManager$ManagedServiceFactoryUpdate.provide(ConfigurationManager.java:1597)
        at org.apache.felix.cm.impl.ConfigurationManager$ManagedServiceFactoryUpdate.run(ConfigurationManager.java:1540)
        at org.apache.felix.cm.impl.UpdateThread.run(UpdateThread.java:103)
        at java.lang.Thread.run(Thread.java:745)
 
"Thread-4" #19 daemon prio=5 os_prio=0 tid=0x00007f7940002800 nid=0x7081 runnable [0x00007f79984b4000]
   java.lang.Thread.State: RUNNABLE
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
        at java.net.ServerSocket.implAccept(ServerSocket.java:545)
        at java.net.ServerSocket.accept(ServerSocket.java:513)
        at org.apache.karaf.main.ShutdownSocketThread.run(ShutdownSocketThread.java:56)
 
"Thread-3" #18 prio=5 os_prio=0 tid=0x00007f79d0c48800 nid=0x7080 waiting on condition [0x00007f79985b5000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.karaf.main.Main.doMonitor(Main.java:299)
        at org.apache.karaf.main.Main.access$100(Main.java:65)
        at org.apache.karaf.main.Main$1.run(Main.java:275)
 
"FelixStartLevel" #17 daemon prio=5 os_prio=0 tid=0x00007f79d0c48000 nid=0x707f waiting for monitor entry [0x00007f79986b5000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.activemq.osgi.ActiveMQServiceFactory.destroy(ActiveMQServiceFactory.java:173)
        - waiting to lock <0x000000072bd74db0> (a org.apache.activemq.osgi.ActiveMQServiceFactory)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.aries.blueprint.utils.ReflectionUtils.invoke(ReflectionUtils.java:299)
        at org.apache.aries.blueprint.container.BeanRecipe.invoke(BeanRecipe.java:980)
        at org.apache.aries.blueprint.container.BeanRecipe.destroy(BeanRecipe.java:887)
        at org.apache.aries.blueprint.container.BlueprintRepository.destroy(BlueprintRepository.java:329)
        at org.apache.aries.blueprint.container.BlueprintContainerImpl.destroyComponents(BlueprintContainerImpl.java:765)
        at org.apache.aries.blueprint.container.BlueprintContainerImpl.tidyupComponents(BlueprintContainerImpl.java:964)
        at org.apache.aries.blueprint.container.BlueprintContainerImpl.destroy(BlueprintContainerImpl.java:909)
        at org.apache.aries.blueprint.container.BlueprintExtender$3.run(BlueprintExtender.java:325)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.aries.blueprint.container.BlueprintExtender.destroyContainer(BlueprintExtender.java:346)
        at org.apache.aries.blueprint.container.BlueprintExtender.modifiedBundle(BlueprintExtender.java:238)
        at org.apache.aries.util.tracker.hook.BundleHookBundleTracker$Tracked.customizerModified(BundleHookBundleTracker.java:500)
        at org.apache.aries.util.tracker.hook.BundleHookBundleTracker$Tracked.customizerModified(BundleHookBundleTracker.java:433)
        at org.apache.aries.util.tracker.hook.BundleHookBundleTracker$AbstractTracked.track(BundleHookBundleTracker.java:725)
        at org.apache.aries.util.tracker.hook.BundleHookBundleTracker$Tracked.bundleChanged(BundleHookBundleTracker.java:463)
        at org.apache.aries.util.tracker.hook.BundleHookBundleTracker$BundleEventHook.event(BundleHookBundleTracker.java:422)
        at org.apache.felix.framework.util.SecureAction.invokeBundleEventHook(SecureAction.java:1103)
        at org.apache.felix.framework.util.EventDispatcher.createWhitelistFromHooks(EventDispatcher.java:695)
        at org.apache.felix.framework.util.EventDispatcher.fireBundleEvent(EventDispatcher.java:483)
        at org.apache.felix.framework.Felix.fireBundleEvent(Felix.java:4403)
        at org.apache.felix.framework.Felix.stopBundle(Felix.java:2520)
        at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1309)
        at org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:304)
        at java.lang.Thread.run(Thread.java:745)
{code}



> A-MQ with 2 active brokers, shutting down slave runs into dead-lock
> -------------------------------------------------------------------
>
>                 Key: AMQ-6601
>                 URL: https://issues.apache.org/jira/browse/AMQ-6601
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: AMQP
>    Affects Versions: 5.14.0
>            Reporter: Carlo Dapor



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
