felix-dev mailing list archives

From "Rob Walker (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FELIX-3174) Startup freeze caused in acquireBundleLock
Date Thu, 20 Oct 2011 08:46:10 GMT

    [ https://issues.apache.org/jira/browse/FELIX-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131453#comment-13131453 ]

Rob Walker commented on FELIX-3174:
-----------------------------------

I can't put my finger on it, but I'm pretty sure there's a sizable timing window in the code
below:

Felix.java, around line 4832:

    void acquireBundleLock(BundleImpl bundle, int desiredStates)
        throws IllegalStateException
    {
        synchronized (m_bundleLock)
        {
            // Wait if the desired bundle is already locked by someone else
            // or if any thread has the global lock, unless the current thread
            // holds the global lock or the bundle lock already.
            while (!bundle.isLockable() ||
                ((m_globalLockThread != null)
                    && (m_globalLockThread != Thread.currentThread())))
            {
                // Check to make sure the bundle is in a desired state.
                // If so, keep waiting. If not, throw an exception.
                if ((desiredStates & bundle.getState()) == 0)
                {
                    throw new IllegalStateException("Bundle in unexpected state.");
                }
                // If the calling thread already owns the global lock, then make
                // sure no other thread is trying to promote a bundle lock to a
                // global lock. If so, interrupt the other thread to avoid deadlock.
                else if (m_globalLockThread == Thread.currentThread()
                    && (bundle.getLockingThread() != null)
                    && m_globalLockWaitersList.contains(bundle.getLockingThread()))
                {
                    bundle.getLockingThread().interrupt();
                }

                try
                {
                    m_bundleLock.wait();
                }
                catch (InterruptedException ex)
                {
                    throw new IllegalStateException("Unable to acquire bundle lock, thread interrupted.");
                }
            }



By the time we go into m_bundleLock.wait(), any of the earlier conditions could have changed,
e.g. the bundle's lockable state may no longer be what the loop test saw.

I suspect that is what is actually happening, and I will add some trace code to try and nail
it down. What is certain is that the code goes into a wait on a lock that never gets notified,
which implies that the state which caused the wait has changed by the time we enter the
wait.
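
As a rough illustration of the kind of trace code meant here, the sketch below replaces an
unbounded wait() with a timed wait that logs each wake-up and re-tests the condition. The
class TracedWait, its methods and the 100 ms timeout are all hypothetical and are not part of
Felix. The demo simulates the suspected failure mode: a flag changes without anyone calling
notify, so the waiter only recovers through the timeout.

```java
// Hypothetical diagnostic helper, not part of Felix.
final class TracedWait {
    /**
     * Waits on monitor while mustWait is true, waking at least every
     * timeoutMillis to log and re-test the condition.
     * Returns the number of wake-ups that occurred.
     */
    static int awaitWithTrace(Object monitor,
                              java.util.function.BooleanSupplier mustWait,
                              long timeoutMillis) throws InterruptedException {
        int wakeUps = 0;
        synchronized (monitor) {
            while (mustWait.getAsBoolean()) {
                monitor.wait(timeoutMillis); // timed wait: cannot hang forever
                wakeUps++;
                System.err.println("[trace] " + Thread.currentThread().getName()
                    + " woke up " + wakeUps + " time(s); still blocked: "
                    + mustWait.getAsBoolean());
            }
        }
        return wakeUps;
    }

    /**
     * Simulates a condition that changes without any notify.
     * Returns the wake-up count, or -1 if interrupted.
     */
    static int demo() {
        final Object monitor = new Object();
        final java.util.concurrent.atomic.AtomicBoolean locked =
            new java.util.concurrent.atomic.AtomicBoolean(true);
        final Thread waiter = Thread.currentThread();
        Thread changer = new Thread(() -> {
            try {
                // Wait until the caller is parked in wait(timeout), then flip
                // the flag WITHOUT notifying the monitor.
                while (waiter.getState() != Thread.State.TIMED_WAITING) {
                    Thread.sleep(1);
                }
                locked.set(false);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        changer.setDaemon(true);
        changer.start();
        try {
            return awaitWithTrace(monitor, locked::get, 100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println("wake-ups: " + demo());
    }
}
```

A timed wait like this only makes a missed notification visible and survivable; it does not
remove the underlying race.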

My issue here is that however many tests we put in ahead of the wait, if they aren't within a
sync on an appropriate lock, all we can do is narrow the timing window. bundle.lock() and
isLockable() are protected by a method-level sync lock, which has been released by the time
they return, and hence the condition may have changed.
                
> Startup freeze caused in acquireBundleLock
> ------------------------------------------
>
>                 Key: FELIX-3174
>                 URL: https://issues.apache.org/jira/browse/FELIX-3174
>             Project: Felix
>          Issue Type: Bug
>          Components: Framework
>    Affects Versions: framework-4.2.0
>            Reporter: Rob Walker
>
> This may be a sub or related case of a few others which I've linked below.
> In the latest trunk we are now seeing a startup scenario where our HTTP bundle acquires
> a lock in the process of registering a service, but the later wait for this lock
> (Felix.java:4862) never seems to get notified.
> It doesn't seem to be a traditional deadlock per se: no other thread is holding this lock.
> It just seems that the lock never gets notified, hence the HTTP bundle never completes its
> startup and service registration, so all our other bundles that depend on the HTTP service
> never start up.
> Stack trace of locked thread below:
> ====
> "Jetty HTTP Service" daemon prio=6 tid=0x0586ac00 nid=0x19dc in Object.wait() [0x05a8f000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x1f84df50> (a [Ljava.lang.Object;)
>         at java.lang.Object.wait(Object.java:485)
>         at org.apache.felix.framework.Felix.acquireBundleLock(Felix.java:4862)
>         - locked <0x1f84df50> (a [Ljava.lang.Object;)
>         at org.apache.felix.framework.Felix.registerService(Felix.java:3205)
>         at org.apache.felix.framework.BundleContextImpl.registerService(BundleContextImpl.java:346)
>         at org.apache.felix.servicebinder.InstanceManager.requestRegistration(InstanceManager.java:508)
>         at org.apache.felix.servicebinder.InstanceManager.validate(InstanceManager.java:294)
>         - locked <0x1fa2ef78> (a org.apache.felix.servicebinder.InstanceManager)
>         at org.apache.felix.servicebinder.InstanceManager$DependencyManager.serviceChanged(InstanceManager.java:948)
>         - locked <0x1fa2ef78> (a org.apache.felix.servicebinder.InstanceManager)
>         at org.apache.felix.framework.util.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:932)
>         at org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:793)
>         at org.apache.felix.framework.util.EventDispatcher.fireServiceEvent(EventDispatcher.java:543)
>         at org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:4252)
>         at org.apache.felix.framework.Felix.registerService(Felix.java:3275)
>         at org.apache.felix.framework.BundleContextImpl.registerService(BundleContextImpl.java:346)
>         at org.apache.felix.http.base.internal.HttpServiceController.register(HttpServiceController.java:135)
>         at org.apache.felix.http.base.internal.DispatcherServlet.init(DispatcherServlet.java:48)
>         at org.mortbay.jetty.servlet.ServletHolder.initServlet(ServletHolder.java:440)
>         at org.mortbay.jetty.servlet.ServletHolder.doStart(ServletHolder.java:263)
>         at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>         - locked <0x1fa2f0b0> (a java.lang.Object)
>         at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:736)
>         at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
>         at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:518)
>         at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>         - locked <0x1fa2f1c0> (a java.lang.Object)
>         at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
>         at org.mortbay.jetty.Server.doStart(Server.java:224)
>         at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>         - locked <0x1fa03e50> (a java.lang.Object)
>         at org.apache.felix.http.jetty.internal.JettyService.initializeJetty(JettyService.java:181)
>         at org.apache.felix.http.jetty.internal.JettyService.startJetty(JettyService.java:116)
>         at org.apache.felix.http.jetty.internal.JettyService.run(JettyService.java:307)
>         at java.lang.Thread.run(Thread.java:619)
> ====


        
