felix-dev mailing list archives

From "Richard S. Hall (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FELIX-5215) Deadlocks involving global lock
Date Mon, 14 Mar 2016 17:53:34 GMT

    [ https://issues.apache.org/jira/browse/FELIX-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193750#comment-15193750 ]

Richard S. Hall commented on FELIX-5215:

Framework synchronization has been the bane of my existence since 2000... the OSGi spec has
made life very difficult by requiring synchronous event delivery. As a result, I don't think
it is possible to completely eliminate the risk of deadlock. Timeouts are one way
of dealing with this reality.

Digging a little deeper, in general, there are two types of operations: those involving a
single bundle and those involving multiple bundles. The main multi-bundle operation is bundle
resolution (which can be triggered during a single-bundle operation). Relatedly, the main reason
the global lock exists is to make sure the state of the framework doesn't change while
a multi-bundle (i.e., resolve) operation is taking place.

The one thought that I've had over the years was to make the bundle resolution process optimistic
and non-locking. Effectively, just do the resolve process on a copy of the state assuming
the framework state won't change. At the end, check to see if the state used to calculate
the resolution is the same as the current framework state; if so, grab the framework state
lock and apply it; if not, re-resolve with the updated state. Rinse, repeat, as necessary.
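
To make the idea concrete, here is a minimal sketch of such an optimistic resolve loop. It is only an illustration under stated assumptions, not Felix code: the names OptimisticResolver, State, install, and resolve are hypothetical, the framework state is reduced to a list of bundle names, and "same state" is approximated by comparing a version counter that every state change increments.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

/** Hypothetical illustration; none of these names exist in Felix. */
final class OptimisticResolver {

    /** Immutable snapshot of the (simplified) framework state plus a version counter. */
    static final class State {
        final long version;
        final List<String> bundles;

        State(long version, List<String> bundles) {
            this.version = version;
            this.bundles = Collections.unmodifiableList(new ArrayList<>(bundles));
        }
    }

    private final ReentrantLock stateLock = new ReentrantLock();
    private volatile State current = new State(0, Collections.<String>emptyList());
    private volatile List<String> wiring = Collections.emptyList();

    /** Stand-in for a single-bundle operation: changes the state and bumps the version. */
    void install(String bundle) {
        stateLock.lock();
        try {
            List<String> next = new ArrayList<>(current.bundles);
            next.add(bundle);
            current = new State(current.version + 1, next);
        } finally {
            stateLock.unlock();
        }
    }

    /**
     * Runs the resolver on a snapshot without holding the lock, then takes the
     * lock only to validate the snapshot and apply the result. Returns the
     * number of attempts that were needed.
     */
    int resolve(Function<State, List<String>> resolver) {
        int attempts = 0;
        while (true) {
            attempts++;
            State snapshot = current;                        // copy of the state
            List<String> result = resolver.apply(snapshot);  // potentially slow, no lock held
            stateLock.lock();
            try {
                if (current.version == snapshot.version) {   // state unchanged: apply
                    wiring = result;
                    return attempts;
                }
            } finally {
                stateLock.unlock();
            }
            // State changed while resolving: re-resolve with the updated state.
        }
    }

    List<String> wiring() {
        return wiring;
    }
}
```

In this sketch the lock is held only for the short validate-and-apply step, never while the resolver runs. The loop has no retry limit, so a real implementation would likely need a bound or a fallback for a state that keeps changing.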

I think something like this would go a long way toward improving the situation, but even then
it won't eliminate the possibility. I think Equinox does something similar.

> Deadlocks involving global lock
> -------------------------------
>                 Key: FELIX-5215
>                 URL: https://issues.apache.org/jira/browse/FELIX-5215
>             Project: Felix
>          Issue Type: Improvement
>          Components: Framework
>    Affects Versions: framework-4.6.1, framework-5.4.0
>            Reporter: Julian Sedding
>         Attachments: deadlock-01.txt, deadlock-02.txt
> I have recently analyzed two thread dumps on a framework 4.6.1 with deadlocks involving the {{FelixFrameworkWiring}} thread calling {{Felix.refreshPackages}} and another thread.
> In both cases the {{FelixFrameworkWiring}} thread holds Felix' global lock in {{Felix.refreshPackages}}, the other thread holds a lock in {{HttpServiceImpl}} and {{ServiceRegistry}}, respectively. (Note, both {{HttpServiceImpl}} and {{ServiceRegistry}} had their synchronization removed in trunk, possibly due to similar deadlocks).
> While fixing the other players in the deadlock certainly helps, I was wondering if it would be possible to change the code inside the framework in a way that such deadlocks are no longer possible?
> I believe section 4.7.3 "Synchronization Pitfalls" in the OSGi spec talks about this situation (quoted from v5.0.0):
> {quote}
> Generally, a bundle that calls a listener should not hold any Java monitors. This means that neither the Framework nor the originator of a synchronous event should be in a monitor when a callback is initiated.
> [...]
> {quote}
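
One common way to follow the advice quoted above is to copy the listener list while holding the monitor and then invoke the callbacks after releasing it. The following is a minimal sketch of that pattern only; EventDispatcher, addListener, fire, and holdsLock are hypothetical names and do not come from Felix or the OSGi API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical illustration; not Felix or OSGi API. */
final class EventDispatcher {
    private final Object lock = new Object();
    private final List<Consumer<String>> listeners = new ArrayList<>();

    void addListener(Consumer<String> listener) {
        synchronized (lock) {
            listeners.add(listener);
        }
    }

    /** Delivers the event synchronously, but never while holding the monitor. */
    void fire(String event) {
        List<Consumer<String>> snapshot;
        synchronized (lock) {
            snapshot = new ArrayList<>(listeners); // copy under the monitor
        }
        for (Consumer<String> listener : snapshot) {
            listener.accept(event);                // callback runs with no monitor held
        }
    }

    /** Reports whether the calling thread currently holds the dispatcher's monitor. */
    boolean holdsLock() {
        return Thread.holdsLock(lock);
    }
}
```

Because the callback runs outside the monitor, a listener can call back into the dispatcher (for example, to register another listener) without holding up other threads. The trade-off is that a listener removed during delivery may still receive the event from the snapshot.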

This message was sent by Atlassian JIRA
