harmony-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Robin Garner <robin.gar...@anu.edu.au>
Subject Re: [drlvm] Class unloading support - tested one approach
Date Thu, 02 Nov 2006 03:05:31 GMT
Rana Dasgupta wrote:
> Robin,
>    The basic difference of this with Etienne's method is that the flag is
> on the vtable, instead of the CL instance. Do I understand correctly ? The
> GC perf impact is therefore reduced because you need to lookup
> object->vtable instead of object->class->CLinstancewhen tracing the heap.
> The space overhead is correspondingly slightly higher. Also the GC impact
> may look lower because there are a couple of pseudo GC cycles to reset the
> vtables and sweep the vtables.
> Thanks,
> Rana

The relevant part of Etienne's design I believe is this:

> 7- Each class loader structure maintains a set of boolean flags, one
>  flag per "non-nursery" garbage collected area (even when thread-local
>  heaps are used).  The flag is set when an instance of a class loaded by
>  this class leader is moved into the related GC-area.  The flag is unset
>  when the GC-area is emptied, or (optionally) when it can be determined
>  that no instance of a class loaded by this class loader remains in the
>  GC-area.  This is best implemented as follows: a) use an unconditional
>  write of "true" in the flag every time an object is moved into the
>  GC-area by the garbage collector, b) unset the related flag in "all"
>  class loader structures just before collecting a GC-area, then setting
>  the flag back when an object survives in the area.

My design differs in several key ways from this:
1. There is no requirement for a per non-nursery area flag
2. The mark byte/word is set unconditionally whenever an object is 
visited by the GC, not when an object is moved into a particular mature 
space.  This may be the same for some GCs, but not all.
3. The mark byte/word is an unconditional write - Etienne's proposal 
would use a load/mask/write sequence.  This is performance critical.
4. My memory of x86 assembler is a little rusty, but I believe a 
constant store can be done without requiring a register for the value to 
be written, where as or-ing a bit value into a word requires a temporary 
register or two.
5. In a parallel GC, setting bits in a mask requires a synchronized 
update.  My design doesn't.

The point is an unconditional store to a structure you are already 
accessing is very cheap, whereas register spills, loads and synchronized 
updates are expensive.


> On 10/31/06, Robin Garner <robin.garner@anu.edu.au> wrote:
>> Actually, just thinking about how I would implement this in JikesRVM, I
>> would use the reachability based algorithm, but piggyback on the
>> existing GC mechanisms:
>> - Allocate a byte (or word) in each vtable for the purpose of tracking
>> class reachability.
>> - Periodically, at a time when no GC is running (even the most
>> aggressive concurrent GC algorithms have these, I believe), zero this
>> flag across all vtables.  This is the beginning of a class-unloading
>> epoch.
>> - During each GC, when the GC is fetching the GC map for an object,
>> *unconditionally* write a value to the class reachability byte.  It may
>> make sense for this byte to be in either the first cache-line of the
>> vtable, or the cache line that points to the GC map - just make sure the
>> mark operation doesn't in general fetch an additional cache line.
>> - At a point in the sufficiently far future, when all reachable objects
>> are known to have been traced by the GC, sweep the vtables and check the
>> reachability of the classloaders.
>> The features of this approach are:
>> - Minimal additional work at GC time.  The additional write will cause
>> some additional memory traffic, but a) it's to memory that is already
>> guaranteed to be in L1 cache, and b) it's an unconditional independent
>> write, and c) multiple writes to common classes will be absorbed by the
>> write buffer.
>> - Space cost of at most 1 word per vtable.
>> - This works whether vtables are objects or VM structures
>> - If the relationship between a class and a vtable is not 1:1, this only
>> need affect the periodic sweep process, which should be infrequent and
>> small compared to a GC.
>> - shouldn't need a stop-the-world at any point.
>> I've implemented and tested the GC-relevant part of this in JikesRVM,
>> and the GC time overhead appears to be just under 1% in the MMTk
>> MarkSweep collector.
>> cheers,
>> Robin

View raw message