felix-dev mailing list archives

From Pierre De Rop <pierre.de...@gmail.com>
Subject Re: Service Registry refactor using Java-5 concurrency libraries (Was Re: [Framework] ServiceRegistry.getService() endless loop with lock?)
Date Thu, 14 May 2015 17:54:08 GMT
Thanks David; I just gave it a try, and indeed the parallel test passed. I
observed a gain of around 7-10%. The tool is described in [1].

But I only have 4 cores on my laptop, so I will run more tests in my lab
at work (next week), where we have some servers with 32 or even 128
processors. This will give a better idea of the gain: the more processors
you have, the more costly synchronization becomes, so I could possibly
observe a better performance gain.

Now, I'm sorry but I think that there is still a problem (I don't know
where): when using more threads, the parallel test does not complete and
stops with a timeout message, indicating that the expected number of
components was not created within the timeout delay of 1 minute.

So, I just committed a modified version of the tool in the sandbox, which
can now take a -Dthreads option in order to configure the number of
threads. With -Dthreads=4, it's OK. But with -Dthreads=10, the test does
not complete and ends with a timeout:

$ java -Dthreads=10 -server -jar bin/felix.jar

g! Starting benchmarks (each tested bundle will add/remove 630 components during bundle activation).

        [Starting benchmarks with no processing done in components start methods]

Benchmarking bundle: org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
.................................................Could not start components timely: current start latch=2, stop latch=630
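
For readers who do not know the tool, here is a minimal sketch of the general
pattern behind such a timeout message. It is not the benchmark's actual code:
the class name, the startAll method and the no-op "component start" are
assumptions made for illustration only. It reads the pool size from a
"threads" system property, starts the components on a fork join pool, and
waits on a start latch with a timeout:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;

public class LatchBenchmarkSketch {

    /**
     * Hypothetical helper: schedules `components` no-op "component starts" on a
     * pool of `threads` workers and reports whether all of them ran in time.
     */
    static boolean startAll(int components, int threads, long timeoutMillis)
            throws InterruptedException {
        CountDownLatch startLatch = new CountDownLatch(components);
        ForkJoinPool pool = new ForkJoinPool(threads);
        try {
            for (int i = 0; i < components; i++) {
                // A real component would wait for its dependencies before
                // counting down; here every task counts down immediately.
                pool.execute(startLatch::countDown);
            }
            boolean started = startLatch.await(timeoutMillis, TimeUnit.MILLISECONDS);
            if (!started) {
                // A non-zero count means some components never started.
                System.out.println("Could not start components timely: current start latch="
                        + startLatch.getCount());
            }
            return started;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Same idea as -Dthreads=10 on the command line; defaults to 4.
        int threads = Integer.getInteger("threads", 4);
        System.out.println(startAll(630, threads, 60_000)
                ? "all components started" : "timeout");
    }
}
```

With this pattern, a start latch that stays above zero only says that some
components never reached their start callback; it does not say why.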

My current understanding of this is that some components are still waiting
for unsatisfied service dependencies, as if a service tracker had missed a
service registration.
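
To make the "missed registration" idea concrete, here is a small sketch of how
a tracker can miss a service in general. It is only an illustration, not a
diagnosis of the framework: ToyRegistry and both tracking methods are
hypothetical and are not Felix or dependencymanager code. The registration
that would normally happen on another thread is simulated by a Runnable that
runs in the gap between the two steps:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

public class MissedRegistrationSketch {

    /** Toy registry used only for this illustration (NOT the Felix ServiceRegistry). */
    static final class ToyRegistry {
        private final List<String> services = new CopyOnWriteArrayList<>();
        private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();

        void register(String service) {
            services.add(service);
            listeners.forEach(listener -> listener.accept(service));
        }

        List<String> snapshot() {
            return new ArrayList<>(services);
        }

        void addListener(Consumer<String> listener) {
            listeners.add(listener);
        }
    }

    /** Racy ordering: a registration between the two steps is never seen. */
    static Set<String> snapshotThenListen(ToyRegistry registry, Runnable registrationInGap) {
        Set<String> tracked = ConcurrentHashMap.newKeySet();
        tracked.addAll(registry.snapshot());
        registrationInGap.run(); // stands in for a registration from another thread
        registry.addListener(tracked::add);
        return tracked;
    }

    /** Safer ordering: listen first, then take the snapshot; the set removes duplicates. */
    static Set<String> listenThenSnapshot(ToyRegistry registry, Runnable registrationInGap) {
        Set<String> tracked = ConcurrentHashMap.newKeySet();
        registry.addListener(tracked::add);
        registrationInGap.run();
        tracked.addAll(registry.snapshot());
        return tracked;
    }

    public static void main(String[] args) {
        ToyRegistry racy = new ToyRegistry();
        racy.register("A");
        System.out.println(snapshotThenListen(racy, () -> racy.register("B")).contains("B"));

        ToyRegistry safe = new ToyRegistry();
        safe.register("A");
        System.out.println(listenThenSnapshot(safe, () -> safe.register("B")).contains("B"));
    }
}
```

A component depending on "B" would stay unsatisfied forever with the first
ordering, which is the kind of symptom the start latch reports.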

I ran the same test for two hours with the previous framework version,
and did not observe any problems.

I wonder whether someone else has another tool to perform a different
kind of load test, just to see if the same problems are also observed.

-> From my side, I will do the following: in the past, the benchmark tool
supported not only dependencymanager, but also Felix SCR and iPojo. So, I
will reintroduce Felix SCR in the benchmark and check whether I also
observe the problem (with -Dthreads=10).

I will let you know.

cheers;
/Pierre

[1]
http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/README

On Thu, May 14, 2015 at 3:41 PM, David Bosschaert <david.bosschaert@gmail.com> wrote:

> I've fixed this now in
> svn.apache.org/viewvc?view=revision&revision=1679367
>
> Pierre, your loadtest now runs to completion - thanks for reporting
> this issue! I can see that the results for the parallel tests are a
> little bit different than before, but I'm not sure how to read them so
> I'll leave the interpretation of that to you :)
>
> Cheers,
>
> David
>
> On 14 May 2015 at 14:38, David Bosschaert <david.bosschaert@gmail.com>
> wrote:
> > I think I know what this is. I had some additional changes exactly in
> > this area that I simply forgot to apply this morning. I should have it
> > fixed sometime today.
> >
> > Cheers,
> >
> > David
> >
> > On 14 May 2015 at 14:03, David Bosschaert <david.bosschaert@gmail.com>
> wrote:
> >> Hi Pierre,
> >>
> >> I'll take a look today.
> >>
> >> Cheers,
> >>
> >> David
> >>
> >> On 14 May 2015 at 14:00, Pierre De Rop <pierre.derop@gmail.com> wrote:
> >>> I just committed the benchmark tool in
> >>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/, if you can
> >>> take a look.
> >>>
> >>> To run the scenario:
> >>>
> >>> - install jdk8:
> >>>
> >>> [nxuser@nx0012 pderop]$ java -version
> >>> java version "1.8.0_40"
> >>> Java(TM) SE Runtime Environment (build 1.8.0_40-b26)
> >>> Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)
> >>>
> >>> - checkout the loadtest from
> >>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/
> >>>
> >>> - go to the "loadtest" directory and start the test, just like this:
> >>>
> >>> $ java -server -jar bin/felix.jar
> >>> Welcome to Apache Felix Gogo
> >>>
> >>> g! Starting benchmarks (each tested bundle will add/remove 630 components
> >>> during bundle activation).
> >>>
> >>>         [Starting benchmarks with no processing done in components start
> >>> methods]
> >>>
> >>> Benchmarking bundle:
> >>> org.apache.felix.dependencymanager.benchmark.dependencymanager
> >>> ..................................................
> >>> -> results in nanos: [139,129,744 | 143,957,687 | 152,157,581 | 319,631,722
> >>> | 919,838,078]
> >>>
> >>> Benchmarking bundle:
> >>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel .
> >>>
> >>>
> >>> Here, the first
> >>> "org.apache.felix.dependencymanager.benchmark.dependencymanager" test
> >>> (single-threaded) passes OK. But the next one hangs
> >>> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel).
> >>> it uses a fork join pool with size=4.
> >>>
> >>> and when typing "log warn", we see:
> >>>
> >>> "log warn"
> >>>
> >>> 2015.05.14 13:56:10 ERROR - Bundle:
> >>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
> >>> [ForkJoinPool-1-worker-3] Error processing tasks -
> >>> java.util.ConcurrentModificationException
> >>>         at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
> >>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
> >>>         at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> >>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
> >>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
> >>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
> >>>         at org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
> >>>         at org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
> >>>         at org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
> >>>         at org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
> >>>         at org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
> >>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
> >>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
> >>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
> >>>         at org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
> >>>         at org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
> >>>         at org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
> >>>         at org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
> >>>         at org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
> >>>         at org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
> >>>         at org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
> >>>         at org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
> >>>         at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
> >>>         at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> >>>         at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> >>>         at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
> >>>         at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
> >>>
> >>>
> >>> (I will also investigate my own code to check whether the problem comes
> >>> from me.)
> >>>
> >>> cheers;
> >>> /Pierre
> >>>
> >>>
> >>> On Thu, May 14, 2015 at 1:47 PM, Pierre De Rop <pierre.derop@gmail.com>
> >>> wrote:
> >>>
> >>>> Hi David,
> >>>>
> >>>> I don't know if it's me (a bug in my benchmark tool) or if there is a
> >>>> regression somewhere in the framework, but my parallel test does not pass
> >>>> anymore.
> >>>>
> >>>> The test first starts with a single-threaded scenario, which passes OK
> >>>> (org.apache.felix.dependencymanager.benchmark.dependencymanager), then when
> >>>> the parallel test starts
> >>>> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel)
> >>>> it suddenly hangs, and when I type "log warn" under the gogo shell, I see
> >>>> the following exception:
> >>>>
> >>>> (I'm using java8):
> >>>>
> >>>> $ java -server -Xmx4g -Xms4g -jar bin/felix.jar
> >>>> ____________________________
> >>>> Welcome to Apache Felix Gogo
> >>>>
> >>>> Benchmarking bundle:
> >>>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel .
> >>>>
> >>>> (here, the dependencymanager.parallel test hangs and when I type "log
> >>>> warn", I see this:)
> >>>>
> >>>> g! log warn
> >>>> 2015.05.14 13:31:03 ERROR - Bundle:
> >>>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
> >>>> [ForkJoinPool-1-worker-3] Error processing tasks -
> >>>> java.util.ConcurrentModificationException
> >>>>         at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
> >>>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
> >>>>         at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> >>>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
> >>>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
> >>>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
> >>>>         at org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
> >>>>         at org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
> >>>>         at org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
> >>>>         at org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
> >>>>         at org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
> >>>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
> >>>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
> >>>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
> >>>>         at org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
> >>>>         at org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
> >>>>         at org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
> >>>>         at org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
> >>>>         at org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
> >>>>         at org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
> >>>>         at org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
> >>>>         at org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
> >>>>         at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
> >>>>         at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> >>>>         at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> >>>>         at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
> >>>>         at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
> >>>>
> >>>> (If I configure my threadpool to 1, I have no problems, but with
> >>>> threadpool=4, then I have the problem)
> >>>>
> >>>> I will investigate, but ideally it would be helpful if you could
> >>>> also run the test yourself, so I will soon commit something to reproduce
> >>>> the problem in my sandbox.
> >>>>
> >>>> cheers;
> >>>> /Pierre
> >>>>
> >>>> On Thu, May 14, 2015 at 11:11 AM, David Bosschaert <
> >>>> david.bosschaert@gmail.com> wrote:
> >>>>
> >>>>> I've committed this now in
> >>>>> http://svn.apache.org/viewvc?view=revision&revision=1679327
> >>>>>
> >>>>> Curious to see what others are measuring. My tests were focused on
> >>>>> multiple bundles/threads obtaining the same service, as that's where I
> >>>>> saw a bit of contention.
> >>>>>
> >>>>> Cheers,
> >>>>>
> >>>>> David
> >>>>>
> >>>>> On 13 May 2015 at 15:10, Pierre De Rop <pierre.derop@gmail.com> wrote:
> >>>>> > Hi David,
> >>>>> >
> >>>>> > I'm looking forward to testing your improvements using the
> >>>>> > dependencymanager benchmark tool ([1]).
> >>>>> >
> >>>>> >
> >>>>> > [1]
> >>>>> > http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/
> >>>>> >
> >>>>> > /Pierre
> >>>>> >
> >>>>> > On Wed, May 13, 2015 at 3:02 PM, David Bosschaert <
> >>>>> > david.bosschaert@gmail.com> wrote:
> >>>>> >
> >>>>> >> I have implemented the performance improvements that I was thinking of
> >>>>> >> using Java 5 concurrency tools; they can be viewed at [1].
> >>>>> >>
> >>>>> >> I wrote a little performance test suite [2] that tests multithreaded
> >>>>> >> service registry performance (10 threads) from single / multiple
> >>>>> >> bundles with either singleton services or Prototype Service Factory
> >>>>> >> services, and the results are quite impressive. I'm getting performance
> >>>>> >> improvements compared to the current trunk from 8 times better than
> >>>>> >> the original (800%) to more than 30 times better (3000%).
> >>>>> >>
> >>>>> >> Carsten has already reviewed the code (thanks Carsten!) and I'm
> >>>>> >> planning to commit it to Felix tomorrow if nobody objects.
> >>>>> >>
> >>>>> >> Cheers,
> >>>>> >>
> >>>>> >> David
> >>>>> >>
> >>>>> >> [1]
> >>>>> >> https://github.com/bosschaert/felix/commit/e6a1b06c6e66d9c98e6d81b91ef7003c8e725450
> >>>>> >> [2]
> >>>>> >> https://github.com/bosschaert/coderthoughts/tree/master/service-registry-perftest/srperf
> >>>>> >>
> >>>>> >> On 23 March 2015 at 15:39, Richard S. Hall <heavy@ungoverned.org> wrote:
> >>>>> >> > On 3/23/15 10:17 , David Bosschaert wrote:
> >>>>> >> >>
> >>>>> >> >> On 23 March 2015 at 13:39, Richard S. Hall <heavy@ungoverned.org> wrote:
> >>>>> >> >>>
> >>>>> >> >>> On 3/23/15 03:55 , Guillaume Nodet wrote:
> >>>>> >> >>>>
> >>>>> >> >>>> There's a call to interrupt() in Felix#acquireBundleLock(), not sure
> >>>>> >> >>>> if it can be the culprit though.
> >>>>> >> >>>> Interrupts could also be caused by a bundle being shut down while one
> >>>>> >> >>>> of its threads is waiting for a service, which should be a valid use
> >>>>> >> >>>> case imho.
> >>>>> >> >>>> Anyway, I think sanely reacting to a thread being interrupted would be
> >>>>> >> >>>> good.
> >>>>> >> >>>
> >>>>> >> >>>
> >>>>> >> >>> Yes, threads can be interrupted if they are holding a bundle lock and
> >>>>> >> >>> the global lock holder needs the bundle lock.
> >>>>> >> >>>
> >>>>> >> >>> I admit that I do not recall why we ignore the interrupt here, but
> >>>>> >> >>> didn't we implement service lookup so that a bundle lock wasn't
> >>>>> >> >>> necessary? I thought we just checked for the validity of the bundle
> >>>>> >> >>> context before returning or something. Perhaps we felt there was no
> >>>>> >> >>> reason to be interrupted in that case. I really don't know.
> >>>>> >> >>
> >>>>> >> >> I think that the Service Registry could be rewritten to be completely
> >>>>> >> >> free of synchronized blocks using the Java 5 concurrency libraries,
> >>>>> >> >
> >>>>> >> >
> >>>>> >> > Well, that just moves the sync blocks to the library, but yeah sure.
> >>>>> >> >
> >>>>> >> >> which I think would really be a better approach. There is too much
> >>>>> >> >> locking going on in the current SR implementation IMHO.
> >>>>> >> >
> >>>>> >> >
> >>>>> >> > I don't really think there is too much, but it is complicated.
> >>>>> >> > Unfortunately, it is complicated to make sure that locks aren't held
> >>>>> >> > while doing service lookups, and this is complicated because you can
> >>>>> >> > run into cycles, etc.
> >>>>> >> >
> >>>>> >> > But feel free to try to simplify it.
> >>>>> >> >
> >>>>> >> >>
> >>>>> >> >> This brings the question: can we move to Java 5 (or Java 6) for the
> >>>>> >> >> Framework codebase? AFAIK we're currently still JDK 1.4 compatible but
> >>>>> >> >> I would be surprised if there is anyone who still needs a JDK that
> >>>>> >> >> went end-of-life 7 years ago.
> >>>>> >> >
> >>>>> >> >
> >>>>> >> > At this point, it doesn't really matter to me.
> >>>>> >> >
> >>>>> >> > -> richard
> >>>>> >> >
> >>>>> >> >>
> >>>>> >> >> Best regards,
> >>>>> >> >>
> >>>>> >> >> David
> >>>>> >> >
> >>>>> >> >
> >>>>> >>
> >>>>>
> >>>>
> >>>>
>
