harmony-dev mailing list archives

From "Alexey Varlamov" <alexey.v.varla...@gmail.com>
Subject Re: [drlvm] Class unloading support - tested one approach
Date Thu, 09 Nov 2006 07:42:59 GMT
Uhm, Etienne beat me to it with his earlier posts.
It seems we are beginning to converge on a design.

2006/11/9, Alexey Varlamov <alexey.v.varlamov@gmail.com>:
> 2006/11/8, Robin Garner <robin.garner@anu.edu.au>:
> > Robin Garner wrote:
> > > Aleksey Ignatenko wrote:
> > >> Robin.
> > >>
> > >>> OK, well how about keeping a weak reference to the j.l.ClassLoader
> > >>> object instead of a strong one.  When the reference becomes (strong)ly
> > >>> unreachable, invoke the class-unloading phase.
> > >>
> > >>
> > >> If you have a weak reference to j.l.ClassLoader, GC will collect it
> > >> (with all its j.l.Class instances) as soon as there are no references
> > >> to the j.l.ClassLoader and its classes. But it is possible that there
> > >> are still live objects of those classes and no references to the
> > >> j.l.ClassLoader and j.l.Class instances. This will lead to
> > >> unpredictable consequences (crash, etc).
> > >>
> > >>
> > >>
> > >> I want to remind you that there are 3 mandatory conditions of class
> > >> unloading:
> > >>
> > >> 1. The j.l.ClassLoader instance is unreachable.
> > >>
> > >> 2. The corresponding j.l.Class instances are unreachable.
> > >>
> > >> 3. No object of any class loaded by that class loader exists.
> > >
> > > Let me repeat.  I offer an efficient solution to (3).  I don't purport
> > > to have a solution to (1) and (2).
> >
> > Let me just add:  This is because I don't think (1) or (2) are
> > particularly difficult from a performance point of view, although I'm
> > happy to accept that there may still be some subtle engineering challenges.
>
> Robin,
>
> While your idea for (3) looks brilliant and quite convincing, it only
> covers part of the whole mission. We really need to derive a complete
> design (like Etienne did), and I feel the voting started in
> the neighboring thread is a bit premature.
> Some of the considerations below are beyond my understanding, could you
> please clarify them (inlined)?
>
> And yet, it would be nice to have confirmation that the notion of an
> "epoch of full-heap-collection" does not impose strict limitations on
> GC algorithms. Maybe this is obvious to people with a more solid GC
> background than mine?
>
> >
> > Now this is just off the top of my head, but what about this for a design:
> > - A j.l.ClassLoader maintains a collection of each of the classes it has
> > loaded
> > - A j.l.Class contains a pointer to its j.l.ClassLoader
> > - A j.l.Class maintains a collection of its vtable(s) (or a pointer if 1:1).
> > The point of this is that a class loader and its classes are a 'self
> > sustaining' data structure - if one element in it is reachable the whole
> > thing is reachable.
> Right. The special case is system classes, which are always in the VM
> root set and so are never reclaimed.
>
> > The VM maintains a weak reference to all its j.l.ClassLoader instances,
> > and maintains a ReferenceQueue for weakly-reachable classloaders.
> > ClassLoaders are placed on the ReferenceQueue if and only if they are
> > unreachable from the heap (including via their j.l.Class objects).
> Here: should it actually read as "WeakReference instances for
> weakly-reachable classloaders are placed on the ReferenceQueue"?
> Otherwise this sentence completely escapes me, sorry.
> If the former, then how could the VM obtain and rescue the referent CL
> objects (plus their j.l.Class instances) after a GC pass? AFAIU,
> references are cleared automatically before enqueuing. I suppose we are
> not going to introduce inter-phase communication between VM and GC...
>
> > Note this is an irreversible condition: objects that are unreachable can
> > never become reachable again, except through very specific methods.
> >
> > When it sweeps the ReferenceQueue for unreachable classloaders, the VM
> > places the unreachable classloaders in a queue of classloaders that are
> > candidates for unloading.  This queue is part of the root set of the VM.
> Strongly referenced now I suppose.
>
> >  A classloader in this queue is unreachable from the heap, and can be
> > unloaded when there are no objects of any class it has loaded.
> So if the VM decides it is time to try unloading, it should:
> 1) Check if the full epoch has passed;
> 2) for each unloadable CL, scan corresponding vtables;
> 3) if none of the vtables were marked reachable, drop the CL from the
> root set completely and clean up the corresponding native structures;
> Java instances will be reclaimed at the next GC iteration;
> 4) Reset "epoch marker" and vtable words.
>
> Do I get it right?
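
The four steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: VTableWord, CandidateLoader and UnloadQueue are hypothetical names (not DRLVM or MMTk APIs), the `reachable` flag stands in for the vtable word the GC would mark, and `fullEpochPassed` stands in for the "epoch marker".

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a per-class vtable mark word.
class VTableWord {
    boolean reachable; // assumed to be set by the GC when it sees a live instance
}

// Hypothetical record for a classloader already unreachable from the heap.
class CandidateLoader {
    final String name;
    final List<VTableWord> vtables = new ArrayList<>();

    CandidateLoader(String name) {
        this.name = name;
    }
}

class UnloadQueue {
    // Candidates are kept alive here, i.e. this list models part of the VM root set.
    final List<CandidateLoader> candidates = new ArrayList<>();
    // Assumed to be set once a full-heap-collection epoch has completed.
    boolean fullEpochPassed;

    // Returns the loaders dropped from the root set in this attempt.
    List<CandidateLoader> tryUnload() {
        List<CandidateLoader> unloaded = new ArrayList<>();
        if (!fullEpochPassed) {
            return unloaded; // step 1: marks are incomplete, nothing can be concluded
        }
        Iterator<CandidateLoader> it = candidates.iterator();
        while (it.hasNext()) {
            CandidateLoader cl = it.next();
            boolean anyReachable = false;
            for (VTableWord vt : cl.vtables) {
                anyReachable |= vt.reachable; // step 2: scan the loader's vtables
            }
            if (!anyReachable) {
                it.remove(); // step 3: drop from the root set (native cleanup would go here)
                unloaded.add(cl);
            }
        }
        // Step 4: reset the epoch marker and the vtable words of the survivors.
        fullEpochPassed = false;
        for (CandidateLoader cl : candidates) {
            for (VTableWord vt : cl.vtables) {
                vt.reachable = false;
            }
        }
        return unloaded;
    }
}
```

Calling tryUnload() before the epoch flag is set returns an empty list and leaves all marks untouched, which is the point of step 1.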
>
>
> >
> > This is where my mechanism comes into play.
> >
> > If an object executes getClass() then its classloader is removed from
> > the unloadable classloader queue, its weak reference gets recreated  and
> > we're back at the initial state.  My guess is that this is a pretty
> > infrequent method call.
> >
> > I think this stage of the algorithm is easy in performance terms -
> > difficult in terms of proving correctness, but if you have an efficient
> > reachability mechanism for classes I think the building blocks are
> > there, and the subtleties are nothing that a talented engineer can't solve.
>
> Yes, a bit complicated. Taking into account the issues with
> ReferenceQueue above, I'd rather suggest the following:
>
> 1) The j.l.Class and its defining CL have mutual strong references, as
> said above.
> 2) Normally, the VM reports all CLs as strong roots, thus preserving
> them from premature reclamation;
> 3) When the VM decides (by whatever heuristic) it is time to perform
> unloading, it checks the epoch invariant and scans all vtables for all
> CLs;
> 4) if a CL has no "reachable" vtables, it is moved to the
> unloading-candidates collection and reported as a weak root; otherwise
> it remains in the strong root set.
> 5) If the next GC clears some of the weak references above, do the
> corresponding native cleanup and return the surviving CLs to the normal
> root set.
> 6) Reset all data: epoch/vtables/etc and return to 2).
>
> I believe this is less disruptive to component interfaces and requires
> less support on GC side.
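
A minimal sketch of the six steps above, under loud assumptions: ClassVTable, LoaderEntry and LoaderRoots are hypothetical names, and the GC's decision to clear a weak root is modelled by a caller-supplied predicate rather than by a real collector.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical vtable with the mark the GC is assumed to set for live instances.
class ClassVTable {
    boolean reachable;
}

// Hypothetical VM-side record for one classloader and the vtables of its classes.
class LoaderEntry {
    final String name;
    final List<ClassVTable> vtables = new ArrayList<>();

    LoaderEntry(String name) {
        this.name = name;
    }

    boolean hasReachableVTable() {
        for (ClassVTable vt : vtables) {
            if (vt.reachable) {
                return true;
            }
        }
        return false;
    }
}

class LoaderRoots {
    final Set<LoaderEntry> strongRoots = new LinkedHashSet<>(); // step 2: normal state
    final Set<LoaderEntry> weakRoots = new LinkedHashSet<>();   // unloading candidates

    // Steps 3-4: after a complete epoch, demote loaders with no marked vtables.
    void selectCandidates(boolean epochComplete) {
        if (!epochComplete) {
            return;
        }
        for (LoaderEntry cl : new ArrayList<>(strongRoots)) {
            if (!cl.hasReachableVTable()) {
                strongRoots.remove(cl);
                weakRoots.add(cl);
            }
        }
    }

    // Steps 5-6: clearedByGc models which weak roots the next GC cleared.
    List<LoaderEntry> finishCycle(Predicate<LoaderEntry> clearedByGc) {
        List<LoaderEntry> unloaded = new ArrayList<>();
        for (LoaderEntry cl : weakRoots) {
            if (clearedByGc.test(cl)) {
                unloaded.add(cl); // native cleanup would happen here
            } else {
                strongRoots.add(cl); // survivor returns to the normal root set
            }
        }
        weakRoots.clear();
        for (LoaderEntry cl : strongRoots) {
            for (ClassVTable vt : cl.vtables) {
                vt.reachable = false; // reset marks for the next epoch
            }
        }
        return unloaded;
    }
}
```

A loader that is demoted to a weak root but still referenced from the heap (for example through a live j.l.Class) is simply not cleared by the GC and goes back to the strong set in finishCycle().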
>
> >
> >
> > I'm not 100% sure what your counter-proposal is: I recall 2 approaches
> > from the mailing list:
> > 1) Each object has an additional word in its header that points back to
> >    its j.l.Class object, and we proceed from here.
> >
> > Given that the mean object size is ~28 bytes, this proposal adds 14% to
> > each object size.  This increases the frequency of GC by 14% and incurs
> > a 14% slowdown.  Of course this is an oversimplification but a 14%
> > slowdown is a pretty lousy starting point to argue from.
> >
> > 2) The existing pointer in the GC header is traced during GC time.
> >
> > The average number of pointers per object (excluding the vtable) is
> > between 1.5 and 2 for the majority of benchmarks I have looked at
> > (footnote: if you know something different, drop me a line) (geometric
> > mean 1.78 for {specJVM, pseudoJBB and DaCapo 20051009}).  Tracing one
> > additional reference per object will therefore increase the cost of GC
> > by ~60% on average.  Again oversimplification but indicative.  If we
> > assume that GC accounts for 10% of runtime (more or less depending on
> > heap size), this is a runtime overhead of 6%.
> Looks reasonable as an upper estimate; it would be nice to look at
> live data, though. Aleksey?
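
The back-of-envelope figures in the two estimates quoted above can be reproduced as follows. The 28-byte mean object size, the 1.78 mean pointers per object and the 10% GC share come from the quoted text; the 4-byte header word is an assumption (it is what makes the 14% figure work out), and the class and method names are hypothetical.

```java
class OverheadEstimate {
    // Relative growth in object size from adding one header word.
    static double extraHeaderWord(double wordBytes, double meanObjectBytes) {
        return wordBytes / meanObjectBytes;
    }

    // Relative growth in tracing work from one extra traced reference per object.
    static double extraTracedReference(double meanPointersPerObject) {
        return 1.0 / meanPointersPerObject;
    }

    // Overall runtime overhead, given the share of runtime spent in GC.
    static double runtimeOverhead(double gcCostIncrease, double gcShareOfRuntime) {
        return gcCostIncrease * gcShareOfRuntime;
    }

    public static void main(String[] args) {
        double header = extraHeaderWord(4, 28);     // assumes a 4-byte word
        double trace = extraTracedReference(1.78);  // geometric mean from the quoted text
        double runtime = runtimeOverhead(trace, 0.10);
        System.out.printf("extra header word:      %.1f%%%n", 100 * header);
        System.out.printf("extra traced reference: %.1f%%%n", 100 * trace);
        System.out.printf("runtime overhead:       %.1f%%%n", 100 * runtime);
    }
}
```

With these inputs the ratios come to roughly 14%, 56% and 5.6%, which the quoted text rounds to 14%, ~60% and 6%.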
>
> > My proposal has been measured at ~1% overhead in GC time, or 0.1% in
> > execution time (caveats as above).  If there is some complexity in
> > establishing classloader reachability from this basis, I would assume it
> > can easily be absorbed.
> >
> > Therefore I think my proposal, while not complete, can form the basis of
> > an efficient complete system for class unloading.
>
> The nice thing about the "automatic" approach is that it does not imply
> the slightest limitation on GC policy and adapts to any future algorithm
> improvements. It's a pity the same wasn't (can't be?) said about the
> voted idea.
> Actually some tuning of the "automatic" approach is possible, like
> keeping all j.l.Class & VT instances in a special space which is
> collected only periodically, so the GC does not need to trace VTs all
> the time.
>
> --
> Regards,
> Alexey
>
> >
> > (PS: I'd *love* to be proven wrong)
> >
> > cheers,
> > Robin
> >
> > > Regards,
> > > Robin
> > >
> > >>
> > >>
> > >> Aleksey.
> > >>
> > >>
> > >> On 11/8/06, Robin Garner <robin.garner@anu.edu.au> wrote:
> > >>>
> > >>> Pavel Pervov wrote:
> > >>> > Robin,
> > >>> >
> > >>> > The kind of model I had in mind was along the lines of:
> > >>> >> - VM maintains a linked list (or other collection type) of the
> > >>> >> currently loaded classloaders, each of which in turn maintains the
> > >>> >> collection of classes loaded by that type.  The sweep of classloaders
> > >>> >> goes something like:
> > >>> >>
> > >>> >> for (ClassLoader cl : classLoaders)
> > >>> >>   for (Class c : cl.classes)
> > >>> >>     cl.reachable |= c.vtable.reachable
> > >>> >
> > >>> >
> > >>> > This is not enough. There may be live j/l/Class'es and
> > >>> > j/l/ClassLoader's in the heap. Even though no objects of any classes
> > >>> > loaded by a particular class loader are available in the heap, if we
> > >>> > have a live reference to j/l/ClassLoader itself, it just can't be
> > >>> > unloaded.
> > >>>
> > >>> OK, well how about keeping a weak reference to the j.l.ClassLoader
> > >>> object instead of a strong one.  When the reference becomes (strong)ly
> > >>> unreachable, invoke the class-unloading phase.
> > >>>
> > >>> To me the key issue from a performance POV is the reachability of
> > >>> classes from objects in the heap.  I don't pretend to have an answer to
> > >>> the other questions---the performance critical one is the one I have
> > >>> addressed, and I accept there may be many solutions to this part of the
> > >>> question.
> > >>>
> > >>> > I believe that a separate heap trace pass, different from the standard
> > >>> >> GC, that visited vtables and reachable resources from there would also
> > >>> >> be a viable solution.  As mentioned in an earlier post, writing this in
> > >>> >> MMTk (where a heap trace operation is a class that you can easily
> > >>> >> subtype to do this) would be easy.
> > >>> >>
> > >>> >> One of the advantages of my other proposal is that it can be
> > >>> >> implemented in the VM independent of the GC to some extent.  This
> > >>> >> additional mark/scan phase may or may not be easy to implement,
> > >>> >> depending on the structure of DRLVM GCs, which is something I haven't
> > >>> >> explored.
> > >>> >
> > >>> >
> > >>> > DRLVM may work with (potentially) any number of GCs. Designing class
> > >>> > unloading in a way that would require mark&scan cooperation from the
> > >>> > GC is not generally a good idea (from my HPOV).
> > >>>
> > >>> That's what I gathered.  Hence my proposal.
> > >>>
> > >>> cheers
> > >>>
> > >>> --
> > >>> Robin Garner
> > >>> Dept. of Computer Science
> > >>> Australian National University
> > >>>
> > >>
> > >
> > >
> >
> >
> > --
> > Robin Garner
> > Dept. of Computer Science
> > Australian National University
> >
>
