Mailing-List: contact dev-help@felix.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@felix.apache.org
MIME-Version: 1.0
In-Reply-To: 
 <CAMit8Sqz6+ME0qYYdHzeMaWh+U5bWVRxFFuiNXPiw9mmjgW4Pg@mail.gmail.com>
References: 
 <CAA66TpoS74VSZUQc9LcjtJGR9Pxb21aSPEaWeUfwuw3nZT7hJQ@mail.gmail.com>
 <558D5674.7070209@nanthrax.net>
 <CAMit8Sqz6+ME0qYYdHzeMaWh+U5bWVRxFFuiNXPiw9mmjgW4Pg@mail.gmail.com>
From: Guillaume Nodet <gnodet@apache.org>
Date: Fri, 26 Jun 2015 17:41:13 +0200
Message-ID: 
 <CAA66Tpq9FtAYQ9M6LaBLTRU5sMDswLk8QNw0J+S-AB84Ao3LRw@mail.gmail.com>
Subject: Re: [HEADS UP] Resolver optimizations
To: dev@felix.apache.org
Content-Type: multipart/alternative; boundary=20cf303f64ca0b190405196d93b1

--20cf303f64ca0b190405196d93b1
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Yeah, I agree this is a valid concern.

>From a performance point of view, we'd have to gather a few different big
test cases, but the previous changes lead to a factor 3 improvements. In
this very specific case (the BigResolutionTest), the improvement for
depth-first was an additional factory of 30.
I've done some testing with Karaf and everything seems to work well.  In
particular, installing features really looks much faster.

I think this is a big mitigated by the fact that the resolver does not try
random changes.  Usually, when a permutation is selected, it's because it
causes a conflict, so the permutation is usually a good idea.   DFS could
be a problem if there was no heuristic in place in the resolver, but given
the resolver only selects permutations that will improve the result, I'm
not sure it's really a problem.

Anyway, I've just compute the difference between the new and the old
resolver for the BigResolutionTest.
The wiring contains a few differences (this is the difference between the
new wiring minus the old wiring) :

osgi.identity; {osgi.identity=3Dio.hawt.hawtio-karaf-terminal,
type=3Dosgi.bundle, version=3D1.4.21}
Removed: osgi.wiring.package;
{filter=3D(&(osgi.wiring.package=3Djline)(version>=3D2.11.0)(!(version>=3D3=
.0.0)))}
-> osgi.wiring.package;
{bundle-symbolic-name=3Dorg.apache.karaf.shell.console, bundle-version=3D2.=
4.0,
osgi.wiring.package=3Djline, version=3D2.12.0} [osgi.identity;
{osgi.identity=3Dorg.apache.karaf.shell.console, type=3Dosgi.bundle,
version=3D2.4.0}]
Removed: osgi.wiring.package;
{filter=3D(&(osgi.wiring.package=3Dorg.apache.karaf.shell.console.jline)(ve=
rsion>=3D2.2.0)(!(version>=3D4.0.0))),
resolution=3Doptional} -> osgi.wiring.package;
{bundle-symbolic-name=3Dorg.apache.karaf.shell.console, bundle-version=3D2.=
4.0,
osgi.wiring.package=3Dorg.apache.karaf.shell.console.jline, version=3D2.4.0=
}
[osgi.identity; {osgi.identity=3Dorg.apache.karaf.shell.console,
type=3Dosgi.bundle, version=3D2.4.0}]
Added: osgi.wiring.package;
{filter=3D(&(osgi.wiring.package=3Djline)(version>=3D2.11.0)(!(version>=3D3=
.0.0)))}
-> osgi.wiring.package; {bundle-symbolic-name=3Djline, bundle-version=3D2.1=
2.0,
osgi.wiring.package=3Djline, version=3D2.12.0} [osgi.identity;
{osgi.identity=3Djline, type=3Dosgi.bundle, version=3D2.12.0}]
osgi.identity; {osgi.identity=3Dio.fabric8.fabric-core, type=3Dosgi.bundle,
version=3D1.2.0.SNAPSHOT}
Added: osgi.service;
{filter=3D(objectClass=3Dio.fabric8.service.child.ProcessControllerFactory)=
,
effective=3Dactive, resolution=3Doptional} -> osgi.service;
{objectClass=3Dio.fabric8.service.child.ProcessControllerFactory,
monitorPollTime=3D1500} [osgi.identity;
{osgi.identity=3Dio.fabric8.fabric-process-container, type=3Dosgi.bundle,
version=3D1.2.0.SNAPSHOT}]
osgi.identity; {osgi.identity=3Dbiz.aQute.bndlib, type=3Dosgi.bundle,
version=3D2.1.0.20130426-122213}
Added: osgi.wiring.package;
{filter=3D(&(osgi.wiring.package=3DaQute.bnd.annotation.metatype)(version>=
=3D1.43.0)(!(version>=3D2.0.0)))}
-> osgi.wiring.package; {bundle-symbolic-name=3Dbiz.aQute.bndlib,
bundle-version=3D2.2.0.20130927-173417,
osgi.wiring.package=3DaQute.bnd.annotation.metatype, version=3D1.44.0}
[osgi.identity; {osgi.identity=3Dbiz.aQute.bndlib, type=3Dosgi.bundle,
version=3D2.2.0.20130927-173417}]
Added: osgi.wiring.package;
{filter=3D(&(osgi.wiring.package=3Dorg.osgi.resource)(version>=3D1.0.0)(!(v=
ersion>=3D2.0.0))),
resolution=3Doptional} -> osgi.wiring.package;
{bundle-symbolic-name=3D[Ljava.lang.String;@36f8e32d, bundle-version=3D4.4.=
1,
osgi.wiring.package=3Dorg.osgi.resource, version=3D1.0.0} [osgi.identity;
{osgi.identity=3Dorg.apache.felix.framework, description=3DThis bundle is
system specific; it implements various system services., type=3Dosgi.bundle=
,
version=3D4.4.1}]
osgi.identity; {osgi.identity=3Dio.fabric8.fabric-project-deployer,
type=3Dosgi.bundle, version=3D1.2.0.SNAPSHOT}
Removed: osgi.wiring.package;
{filter=3D(&(osgi.wiring.package=3Djline)(version>=3D2.12.0)(!(version>=3D3=
.0.0)))}
-> osgi.wiring.package;
{bundle-symbolic-name=3Dorg.apache.karaf.shell.console, bundle-version=3D2.=
4.0,
osgi.wiring.package=3Djline, version=3D2.12.0} [osgi.identity;
{osgi.identity=3Dorg.apache.karaf.shell.console, type=3Dosgi.bundle,
version=3D2.4.0}]
Removed: osgi.wiring.package;
{filter=3D(&(osgi.wiring.package=3Djline.console)(version>=3D2.12.0)(!(vers=
ion>=3D3.0.0)))}
-> osgi.wiring.package;
{bundle-symbolic-name=3Dorg.apache.karaf.shell.console, bundle-version=3D2.=
4.0,
osgi.wiring.package=3Djline.console, version=3D2.12.0} [osgi.identity;
{osgi.identity=3Dorg.apache.karaf.shell.console, type=3Dosgi.bundle,
version=3D2.4.0}]
Added: osgi.wiring.package;
{filter=3D(&(osgi.wiring.package=3Djline.console)(version>=3D2.12.0)(!(vers=
ion>=3D3.0.0)))}
-> osgi.wiring.package; {bundle-symbolic-name=3Djline, bundle-version=3D2.1=
2.0,
osgi.wiring.package=3Djline.console, version=3D2.12.0} [osgi.identity;
{osgi.identity=3Djline, type=3Dosgi.bundle, version=3D2.12.0}]
Added: osgi.wiring.package;
{filter=3D(&(osgi.wiring.package=3Djline)(version>=3D2.12.0)(!(version>=3D3=
.0.0)))}
-> osgi.wiring.package; {bundle-symbolic-name=3Djline, bundle-version=3D2.1=
2.0,
osgi.wiring.package=3Djline, version=3D2.12.0} [osgi.identity;
{osgi.identity=3Djline, type=3Dosgi.bundle, version=3D2.12.0}]
osgi.identity; {osgi.identity=3Dorg.apache.cxf.cxf-rt-databinding-jaxb,
type=3Dosgi.bundle, version=3D2.7.11}
Added: osgi.wiring.package;
{filter=3D(osgi.wiring.package=3Dcom.sun.tools.xjc.reader.internalizer),
resolution=3Doptional} -> osgi.wiring.package;
{bundle-symbolic-name=3Dorg.apache.servicemix.bundles.jaxb-xjc,
bundle-version=3D2.2.1.1_2,
osgi.wiring.package=3Dcom.sun.tools.xjc.reader.internalizer, version=3D0.0.=
0}
[osgi.identity; {osgi.identity=3Dorg.apache.servicemix.bundles.jaxb-xjc,
type=3Dosgi.bundle, version=3D2.2.1.1_2}]

The results are consistent for both resolvers, so the diff is consistent
too.

The point is that the resolver does not guarantee to provide the "best"
solution, especially for optional requirements.  Neither the old or the new
resolver explores all the possible wirings, it just finds the first one
that solve all the mandatory requirements.   Depending on the choices made
to lead to that solution, the number of optional requirements that are
satisfied may vary.


2015-06-26 16:15 GMT+02:00 David Bosschaert <david.bosschaert@gmail.com>:

> Looks like a great piece of work, Guillaume. I left some minor
> thoughts in a few places on github.
>
> One potential concern is that the change to depth-first search will
> produce different results than the previous approach. Does it still
> aim to minimise the number of class-spaces?
>
> This might cause different results for bundles that import packages
> with underspecified version ranges. On the other hand, that might then
> be a good reason for people to ensure correct import ranges.
>
> Cheers,
>
> David
>
> On 26 June 2015 at 14:41, Jean-Baptiste Onofr=C3=A9 <jb@nanthrax.net> wro=
te:
> > Hi Guillaume,
> >
> > What a work you did !!
> >
> > It looks really great !
> >
> > I cloned your repo to test and take a deeper look.
> >
> > Thanks !
> > Regards
> > JB
> >
> >
> > On 06/26/2015 03:06 PM, Guillaume Nodet wrote:
> >>
> >> I've spent the last two weeks working on optimising the resolver.
> >> The results are available in the following github fork
> >>     https://github.com/gnodet/felix/commits/FELIX-4942
> >>
> >> The commits are split so that they can be reviewed more easily.
> >> Most of them do not really change the algorithm in any way, so I won't
> >> comment them, but I want to highlight a bit those that introduce more
> >> important changes.
> >>
> >> * Avoid throwing exceptions during resolution
> >>
> >>
> >>
> https://github.com/gnodet/felix/commit/fa012fe9d3760f6b80c03eecf0b7f03218=
ceae6f
> >>    This commit contains two major ideas: try/catch are slow and the
> >> computation of the resolution messages are usually expensive and
> useless.
> >>    So instead of exceptions, the resolver creates a small object
> >> containing
> >> all the information needed to create the exceptions later if needed
> (this
> >> usually happen only once at the end).
> >> When the resolver is doing some recursive work, it usually
> >>
> >> * Compute all package sources for a given resource at the same time
> >>
> >>
> >>
> https://github.com/gnodet/felix/commit/a602f43ccd76983026d6c0c76958bc3445=
c0f4c0
> >>    Time can be saved by computing all the package sources for a given
> >> resource at the same time, especially when a resource exports the same
> >> package with multiple versions (the computation is still recursive, bu=
t
> >> this will be removed in a following commit).
> >>
> >> * Compute package spaces by stages
> >>
> >>
> >>
> https://github.com/gnodet/felix/commit/71cd667e89d41d580c8f2b88605694cc36=
1bf134
> >>   Instead of doing things lazily, most of the computation can be done =
by
> >> stages.
> >> So we first compute the list of wire candidates, then the exported
> >> packages, then the imported and required package lists, then the packa=
ge
> >> sources and last, the use constraints.
> >>
> >>   * Parallelism
> >>
> >>
> >>
> https://github.com/gnodet/felix/commit/b1e88923fb9a2f056f3b6b4bc96682d620=
652fd5
> >>
> >>
> >>
> https://github.com/gnodet/felix/commit/68650153aa6bdde96b4c357ee8faa88fa4=
e90f6e
> >> Following the previous point, things can now be computed in parallel,
> >> given
> >> they are only using read-only objects from a previous stage and the
> >> computation are done for a given resource with no recursion.  This all=
ow
> >> using multiple threads and start the work for each resource, gather
> >> everything at the end of a stage, start the next stage.
> >> I haven't really investigated parallelism in
> checkPackageSpaceConsistency,
> >> and this is always done sequentially, however, most of the work is
> >> computing the package spaces (and creating Candidates copies).
> >> I've also experimented a bit with starting resolving several Candidate=
s
> in
> >> parallel, but with really poor success (usually, I end up with OOM
> >> errors).
> >>
> >>   * Remove recursion in Candidates.populate()
> >>
> >>
> >>
> https://github.com/gnodet/felix/commit/3a82b2342c87450edcd1aa6fa5c514c2b0=
4ae346
> >> The algorithm is exactly the same, but we use a list instead of using
> >> recursion.  This is faster and do not impose memory constraints (some
> >> platforms have a small stack by default).
> >>
> >>   * Use depth-first algorithm for permutations
> >>
> >>
> >>
> https://github.com/gnodet/felix/commit/e4f479b5ec2bf4351f755909aa33ea5f0c=
6af2dc
> >> When new permutations are computed, they were always added at the end =
of
> >> one of the lists.  This lead to a width breadth first algorithm.
> However,
> >> the algorithm is smart enough to find the causes of an error, create a
> >> substitution based on this error which should go one step in the right
> >> direction.  By adding the new substitutions to the beginning of the
> lists,
> >> we always go in the right direction, leading to much fewer iterations.
> >>
> >>
> >> Overall results
> >> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> >> On my computer, the BigResolutionTest takes roughly 11s (after the
> removal
> >> of GenericCapabiliy#equals method).
> >> With the head of this branch, the time is down to 100 ms
> >>
> >> I'm planning to test the new resolver a bit more in the following days=
,
> >> making sure I don't see any weird behaviour, but the tests looks reall=
y
> >> good so far.
> >>
> >> All comments welcome !
> >>
> >> Cheers,
> >> Guillaume Nodet
> >>
> >
> > --
> > Jean-Baptiste Onofr=C3=A9
> > jbonofre@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
>

--20cf303f64ca0b190405196d93b1--