Return-Path: X-Original-To: apmail-cxf-issues-archive@www.apache.org Delivered-To: apmail-cxf-issues-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 40A3EFF5E for ; Wed, 3 Apr 2013 08:51:17 +0000 (UTC) Received: (qmail 4040 invoked by uid 500); 3 Apr 2013 08:51:17 -0000 Delivered-To: apmail-cxf-issues-archive@cxf.apache.org Received: (qmail 3964 invoked by uid 500); 3 Apr 2013 08:51:16 -0000 Mailing-List: contact issues-help@cxf.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@cxf.apache.org Delivered-To: mailing list issues@cxf.apache.org Received: (qmail 3950 invoked by uid 99); 3 Apr 2013 08:51:16 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 03 Apr 2013 08:51:16 +0000 Date: Wed, 3 Apr 2013 08:51:15 +0000 (UTC) From: "Amichai Rothman (JIRA)" To: issues@cxf.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (DOSGI-163) Deadlock when shutting down MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/DOSGI-163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620745#comment-13620745 ] Amichai Rothman commented on DOSGI-163: --------------------------------------- Another day, another deadlock... here's another one, that also involves ImportRegistrationImpl.close() and occurs one line below the deadlock reported above - they involve a different set of locks, but since they are so adjacent, I think I'm going to retract the proposed patch, and suggest that all synchronization in this seemingly very simple class be reviewed and reworked. Unfortunately, there is no documentation there to explain what the previous authors thought should be synchronized and why (there may have been other issues that required it), but obviously the current full-method synchronization and double-locking (when accessing the external classes) are getting things wrong. In addition to the two deadlocks (and potentially another involving clientServiceFactory in the same code path), there are also accesses to some fields that are not synchronized at all and maybe should be - if someone a bit more familiar with the code and how it is supposed to be used could have a look, that would be great. "System Bundle Shutdown" daemon prio=6 tid=0x20ae5400 nid=0xfdc waiting for monitor entry [0x213cf000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.cxf.dosgi.dsw.service.RemoteServiceAdminCore.removeImportRegistration(RemoteServiceAdminCore.java:399) - waiting to lock <0x0e8160c8> (a java.util.LinkedHashMap) at org.apache.cxf.dosgi.dsw.service.ImportRegistrationImpl.close(ImportRegistrationImpl.java:101) - locked <0x0d902160> (a org.apache.cxf.dosgi.dsw.service.ImportRegistrationImpl) at org.apache.cxf.dosgi.topologymanager.importer.TopologyManagerImport.removeServiceInterest(TopologyManagerImport.java:137) at org.apache.cxf.dosgi.topologymanager.importer.ListenerHookImpl.removed(ListenerHookImpl.java:111) at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry$10.call(ServiceRegistry.java:1348) at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.notifyHookPrivileged(ServiceRegistry.java:1239) at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.notifyHooksPrivileged(ServiceRegistry.java:1222) at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.notifyListenerHooksPrivileged(ServiceRegistry.java:1342) at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.notifyListenerHooks(ServiceRegistry.java:1324) at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.removeServiceListener(ServiceRegistry.java:730) at org.eclipse.osgi.framework.internal.core.BundleContextImpl.removeServiceListener(BundleContextImpl.java:258) at org.apache.aries.blueprint.container.AbstractServiceReferenceRecipe.stop(AbstractServiceReferenceRecipe.java:155) - locked <0x0dd2ff08> (a java.lang.Object) at org.apache.aries.blueprint.container.BlueprintContainerImpl.untrackServiceReference(BlueprintContainerImpl.java:613) at org.apache.aries.blueprint.container.BlueprintContainerImpl.untrackServiceReferences(BlueprintContainerImpl.java:593) at org.apache.aries.blueprint.container.BlueprintContainerImpl.tidyupComponents(BlueprintContainerImpl.java:906) at org.apache.aries.blueprint.container.BlueprintContainerImpl.destroy(BlueprintContainerImpl.java:857) at org.apache.aries.blueprint.container.BlueprintExtender$3.run(BlueprintExtender.java:284) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at org.apache.aries.blueprint.container.BlueprintExtender.destroyContainer(BlueprintExtender.java:305) at org.apache.aries.blueprint.container.BlueprintExtender.stop(BlueprintExtender.java:156) at org.apache.aries.blueprint.container.BlueprintExtender.modifiedBundle(BlueprintExtender.java:199) at org.apache.aries.util.tracker.hook.BundleHookBundleTracker$Tracked.customizerModified(BundleHookBundleTracker.java:500) at org.apache.aries.util.tracker.hook.BundleHookBundleTracker$Tracked.customizerModified(BundleHookBundleTracker.java:433) at org.apache.aries.util.tracker.hook.BundleHookBundleTracker$AbstractTracked.track(BundleHookBundleTracker.java:725) at org.apache.aries.util.tracker.hook.BundleHookBundleTracker$Tracked.bundleChanged(BundleHookBundleTracker.java:463) at org.apache.aries.util.tracker.hook.BundleHookBundleTracker$BundleEventHook.event(BundleHookBundleTracker.java:422) at org.eclipse.osgi.framework.internal.core.Framework$10.call(Framework.java:1606) at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.notifyHookPrivileged(ServiceRegistry.java:1239) at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.notifyHooksPrivileged(ServiceRegistry.java:1222) at org.eclipse.osgi.framework.internal.core.Framework.notifyEventHooksPrivileged(Framework.java:1603) at org.eclipse.osgi.framework.internal.core.Framework.publishBundleEventPrivileged(Framework.java:1558) at org.eclipse.osgi.framework.internal.core.Framework.publishBundleEvent(Framework.java:1505) at org.eclipse.osgi.framework.internal.core.Framework.publishBundleEvent(Framework.java:1500) at org.eclipse.osgi.framework.internal.core.Framework.shutdown(Framework.java:682) - locked <0x06cb0520> (a org.eclipse.osgi.framework.internal.core.Framework) at org.eclipse.osgi.framework.internal.core.Framework.close(Framework.java:600) - locked <0x06cb0520> (a org.eclipse.osgi.framework.internal.core.Framework) at org.eclipse.osgi.framework.internal.core.InternalSystemBundle$1.run(InternalSystemBundle.java:261) at java.lang.Thread.run(Thread.java:722) "pool-17-thread-1" prio=6 tid=0x21051000 nid=0xc4c waiting for monitor entry [0x2359f000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.cxf.dosgi.dsw.service.ImportRegistrationImpl.instanceAdded(ImportRegistrationImpl.java:84) - waiting to lock <0x0d902160> (a org.apache.cxf.dosgi.dsw.service.ImportRegistrationImpl) at org.apache.cxf.dosgi.dsw.service.ImportRegistrationImpl.(ImportRegistrationImpl.java:75) at org.apache.cxf.dosgi.dsw.service.RemoteServiceAdminCore.importService(RemoteServiceAdminCore.java:264) - locked <0x0e8160c8> (a java.util.LinkedHashMap) at org.apache.cxf.dosgi.dsw.service.RemoteServiceAdminInstance$2.run(RemoteServiceAdminInstance.java:117) at org.apache.cxf.dosgi.dsw.service.RemoteServiceAdminInstance$2.run(RemoteServiceAdminInstance.java:110) at java.security.AccessController.doPrivileged(Native Method) at org.apache.cxf.dosgi.dsw.service.RemoteServiceAdminInstance.importService(RemoteServiceAdminInstance.java:110) at org.apache.cxf.dosgi.topologymanager.importer.TopologyManagerImport.importService(TopologyManagerImport.java:282) at org.apache.cxf.dosgi.topologymanager.importer.TopologyManagerImport.importServices(TopologyManagerImport.java:253) at org.apache.cxf.dosgi.topologymanager.importer.TopologyManagerImport.access$300(TopologyManagerImport.java:53) at org.apache.cxf.dosgi.topologymanager.importer.TopologyManagerImport$2.run(TopologyManagerImport.java:202) - locked <0x0d8086a0> (a java.util.HashMap) - locked <0x0d8086d0> (a java.util.HashMap) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:722) > Deadlock when shutting down > --------------------------- > > Key: DOSGI-163 > URL: https://issues.apache.org/jira/browse/DOSGI-163 > Project: CXF Distributed OSGi > Issue Type: Bug > Components: DSW > Affects Versions: 1.4.0 > Environment: Oracle JDK 1.7.0_17, Karaf 2.3.1, DOSGi 1.4.0 > Reporter: Amichai Rothman > Attachments: fix_dosgi_unregister_deadlock.diff > > > Karaf sometimes hangs while shutting down, and a thread dump shows a deadlock in DOSGi. In general, the OSGi specs recommend not to hold any locks when calling the APIs since it can result in a deadlock, as is the case here. > A proposed fix calls ServiceRegistration.unregister outside of the synchronized block in ImportRegistrationImpl.close (note that it still is synchronized when called from closeAll, which may pose a related but separate problem - but the fix does fix this specific deadlock as it was encountered here). > Found one Java-level deadlock: > ============================= > "pool-18-thread-3": > waiting to lock monitor 0x00007f0658003f40 (object 0x00000000e5b485b0, a org.eclipse.osgi.internal.serviceregistry.ServiceUse), > which is held by "Blueprint Extender: 2" > "Blueprint Extender: 2": > waiting to lock monitor 0x00007f0654002a50 (object 0x00000000f1b70240, a org.apache.cxf.dosgi.dsw.service.ImportRegistrationImpl), > which is held by "pool-18-thread-3" > Java stack information for the threads listed above: > =================================================== > "pool-18-thread-3": > at org.eclipse.osgi.internal.serviceregistry.ServiceRegistrationImpl.releaseService(ServiceRegistrationImpl.java:562) > - waiting to lock <0x00000000e5b485b0> (a org.eclipse.osgi.internal.serviceregistry.ServiceUse) > at org.eclipse.osgi.internal.serviceregistry.ServiceRegistrationImpl.unregister(ServiceRegistrationImpl.java:245) > at org.apache.cxf.dosgi.dsw.service.ImportRegistrationImpl.instanceClosed(ImportRegistrationImpl.java:118) > - locked <0x00000000f1b70240> (a org.apache.cxf.dosgi.dsw.service.ImportRegistrationImpl) > at org.apache.cxf.dosgi.dsw.service.ImportRegistrationImpl.close(ImportRegistrationImpl.java:100) > - locked <0x00000000f1b70240> (a org.apache.cxf.dosgi.dsw.service.ImportRegistrationImpl) > at org.apache.cxf.dosgi.topologymanager.importer.TopologyManagerImport.unexportNotAvailableServices(TopologyManagerImport.java:227) > at org.apache.cxf.dosgi.topologymanager.importer.TopologyManagerImport.access$200(TopologyManagerImport.java:53) > at org.apache.cxf.dosgi.topologymanager.importer.TopologyManagerImport$2.run(TopologyManagerImport.java:201) > - locked <0x00000000ea71b6c0> (a java.util.HashMap) > - locked <0x00000000ea71b6f8> (a java.util.HashMap) > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:722) > "Blueprint Extender: 2": > at org.apache.cxf.dosgi.dsw.service.ImportRegistrationImpl.closeAll(ImportRegistrationImpl.java:127) > - waiting to lock <0x00000000f1b70240> (a org.apache.cxf.dosgi.dsw.service.ImportRegistrationImpl) > at org.apache.cxf.dosgi.dsw.service.ClientServiceFactory.remove(ClientServiceFactory.java:110) > at org.apache.cxf.dosgi.dsw.service.ClientServiceFactory.ungetService(ClientServiceFactory.java:104) > - locked <0x00000000f1b6f7e0> (a org.apache.cxf.dosgi.dsw.service.ClientServiceFactory) > at org.eclipse.osgi.internal.serviceregistry.ServiceUse$2.run(ServiceUse.java:238) > at java.security.AccessController.doPrivileged(Native Method) > at org.eclipse.osgi.internal.serviceregistry.ServiceUse.ungetService(ServiceUse.java:236) > at org.eclipse.osgi.internal.serviceregistry.ServiceRegistrationImpl.ungetService(ServiceRegistrationImpl.java:518) > - locked <0x00000000e5b485b0> (a org.eclipse.osgi.internal.serviceregistry.ServiceUse) > at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.ungetService(ServiceRegistry.java:510) > at org.eclipse.osgi.framework.internal.core.BundleContextImpl.ungetService(BundleContextImpl.java:636) > at org.apache.aries.blueprint.container.ReferenceListRecipe$ServiceDispatcher.destroy(ReferenceListRecipe.java:197) > - locked <0x00000000e5bab728> (a org.apache.aries.blueprint.container.ReferenceListRecipe$ServiceDispatcher) > at org.apache.aries.blueprint.container.ReferenceListRecipe.untrack(ReferenceListRecipe.java:147) > - locked <0x00000000eaef82d0> (a java.lang.Object) > at org.apache.aries.blueprint.container.AbstractServiceReferenceRecipe.serviceRemoved(AbstractServiceReferenceRecipe.java:370) > at org.apache.aries.blueprint.container.AbstractServiceReferenceRecipe.access$200(AbstractServiceReferenceRecipe.java:72) > at org.apache.aries.blueprint.container.AbstractServiceReferenceRecipe$3.run(AbstractServiceReferenceRecipe.java:324) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) > at java.util.concurrent.FutureTask.run(FutureTask.java:166) > at org.apache.aries.blueprint.container.ExecutorServiceWrapper.run(ExecutorServiceWrapper.java:106) > at org.apache.aries.blueprint.utils.threading.impl.DiscardableRunnable.run(DiscardableRunnable.java:48) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) > at java.util.concurrent.FutureTask.run(FutureTask.java:166) > at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178) > at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292) > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:722) > Found 1 deadlock. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira