felix-dev mailing list archives

From "Karl Pauls" <karlpa...@gmail.com>
Subject Re: Deadlock in Felix
Date Sat, 02 Feb 2008 00:20:32 GMT
On Jan 30, 2008 11:31 AM, Stuart McCulloch <stuart.mcculloch@jayway.net> wrote:
> On 30/01/2008, Felix Meschberger <fmeschbe@gmail.com> wrote:
> >
> > Hi Niclas,
> >
> > The problem is (a) the generous synchronisation in Log4J and (b) the
> > locking used by the class loading code. In our projects we regularly
> > face issues between Log4J and our ClassLoader implementations, which
> > synchronize on ClassLoader.loadClass().
> >
> > The deadlock occurs because both parties - framework and Log4J - lock
> > "big" parts of their code and call to code outside of their scope while
> > being locked: the framework calls the LogService outside of the
> > framework and Log4J calls into class loading outside of Log4J.
> >
> > One solution I could imagine is not using Log4J, which may or may not
> > be an option. Maybe SLF4J or Logback could be an option here? [ In
> > Sling we actually use Logback as a logging backend for our LogService
> > implementation ]
> >
> > Another solution would be to enhance the framework Logger to disable the
> > use of a LogService. E.g. by defining a framework property, which when
> > set, causes the Logger to never use the LogService.
> >
> > Neither solution sounds right ...
>
>
> other possible solutions:
>
> a)  have a separate thread make the LogService call (fed from a queue)
>      although you'd have to be careful not to introduce other deadlocks

So would it be acceptable to deliver log calls asynchronously? If so I
can probably make that change quickly ...
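
Roughly what I have in mind, as a minimal sketch only: the class name
AsyncLogSketch and the Sink interface below are made up for illustration and
are not the Felix Logger or the OSGi LogService API. The caller only enqueues
an entry and returns, so it never holds a framework lock while waiting on a
lock inside the logging backend. A single worker thread drains the queue and
makes the real call.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Sketch: hand log calls to a worker thread instead of calling the backend directly. */
final class AsyncLogSketch {

    /** Hypothetical stand-in for the real backend (e.g. a LogService wrapper). */
    interface Sink {
        void log(int level, String message);
    }

    private static final class Entry {
        final int level;
        final String message;

        Entry(int level, String message) {
            this.level = level;
            this.message = message;
        }
    }

    /** Marker entry that tells the worker to stop. */
    private static final Entry STOP = new Entry(-1, null);

    private final BlockingQueue<Entry> queue = new LinkedBlockingQueue<Entry>();
    private final Thread worker;

    AsyncLogSketch(final Sink sink) {
        worker = new Thread(new Runnable() {
            public void run() {
                try {
                    // Only this thread ever calls into the backend.
                    for (Entry e = queue.take(); e != STOP; e = queue.take()) {
                        sink.log(e.level, e.message);
                    }
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "AsyncLogSketch");
        worker.setDaemon(true);
        worker.start();
    }

    /** Called by framework code; never blocks on the backend. */
    void log(int level, String message) {
        queue.offer(new Entry(level, message));
    }

    /** Delivers everything queued so far, then stops the worker. */
    void close() {
        queue.offer(STOP);
        try {
            worker.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The queue is unbounded here to keep the sketch short. Messages are delivered
in the order they were queued, but later than the events they describe, and
anything still queued is lost if the VM exits before close() is called.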

regards,

Karl

> b)  delay sending log messages from critical sections of the framework
>      ie. log to a buffer, then send the messages when it's safe to do so
>
>
> > Regards
> > Felix
> >
> > On Wednesday, 30.01.2008, at 13:51 +0800, Niclas Hedhman wrote:
> > > On Tuesday 29 January 2008 16:55, Karl Pauls wrote:
> > > > Could you have them retry using Felix 1.0.3? This might be related to
> > > > some of the concurrency things we fixed.
> > > >
> > > > In case they can not be bothered retrying on Felix 1.0.3 then maybe
> > > > they can provide a minimal config file that only uses publicly
> > > > available bundles and has this issue (then I can look into it).
> > >
> > > I am looking at the code in the trunk, and it appears that the locking
> > > that triggers this is in place.
> > >
> > > As always, it is a bit tricky to set up threading problems. So, I would
> > > like to run a "head exercise" first.
> > >
> > > 1. The Starter thread "FelixStartLevel" locks the ModuleFactoryImpl
> > > instance in R4SearchPolicyCore.resolve().
> > >
> > > 2. The Configuration Admin thread "Configuration Updater" calls the
> > > LogService with the Configuration instance, which leads to Log4J locking
> > > its own RootLogger and calling ClassLoader.loadClass() on something
> > > found in the configuration. This leads to trying to acquire the
> > > ModuleFactoryImpl lock, either in R4SearchPolicyCore.resolve() or, as in
> > > the provided stack trace, in R4SearchPolicyCore.getInUseCandidates() due
> > > to a ClassNotFound in the previous step.
> > >
> > > 3. The "FelixStartLevel" thread reaches
> > > m_logger.log(Logger.LOG_DEBUG, "WIRE: " + wires[wireIdx]);
> > > in R4SearchPolicyCore.createWires(), and the log() method will try to
> > > acquire the RootLogger lock.
> > >
> > > 4. DEADLOCK.
> > >
> > >
> > > I agree this is very special to the LogService, since Felix binds to it
> > > and uses it for its internal use, and the responsibility is spread
> > > across two different systems. Suggestions are welcome.
> > >
> > >
> > > Cheers
> >
> >
>
>
> --
> Cheers, Stuart
>



-- 
Karl Pauls
karlpauls@gmail.com
