harmony-dev mailing list archives

From boot...@earthlink.net
Subject Re: Questions about GC implementations
Date Thu, 01 Jan 1970 00:00:00 GMT


> [Original Message]
> From: Robin Garner <robin.garner@anu.edu.au>
> To: Apache Harmony Bootstrap JVM <bootjvm@earthlink.net>
> Cc: <harmony-dev@incubator.apache.org>
> Date: 10/21/05 5:04:55 PM
> Subject: Re: Questions about GC implementations
>
> >
> > Robin,
> >
> > Over the last few days as we have all been discussing
> > heap and GC, I have been growing more curious by the
> > hour about what you see as a design critique of the GC
> > hooks I defined.  Perhaps I am taking it a bit too simplistically,
> > but all I did was define a GC hook anywhere an object or
> > array reference or a class load or unload happened.  How
> > does your concept differ?  And what sort of approach should
> > be used instead.  GC is a completely new area to me and I
> > am very interested in learning about what could be done.
> > One of these days I'm going to read the standard (Jones?)
> > book on JVM garbage collection, but not today.
> >
> > Even if you don't have your proposal ready now, I'm still
> > interested in hearing how you approach the foundation of
> > GC design and how you would approach it on this JVM.
> >
> >
> > Dan Lydick
> 
> In the garbage collectors I've worked with, the essential design is:
> 
> - 'new' allocates space on the heap.
> - The header of each object contains a pointer (or equiv) to a per-class
> data structure, that points to the GC map for the object (and virtual
> dispatch tables etc etc)
> - Reference fields in objects contain pointers directly to the heap
> objects they reference.
> - Pointer loads and stores are (optionally) performed via barriers - the
> terminology is a little confusing: these are not synchronization barriers,
> but opportunities for the GC to intercept the load/store and do some
> additional processing.  Write barriers are common, read barriers less so.

This is also the approach I have taken, so I think we are
on the same page and are just saying the same thing in
different words.  When an object is allocated, its GC pointer
is set by the allocator.  This pointer is 'robject.pgarbage'
and is found in 'jvm/src/object.h'.  The reason you did not
find one is that I have only provided a stub GC implementation.
The 'new' operation is performed by object_instance_new() in
'jvm/src/object.c' and includes a call to GC_OBJECT_NEW().
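
As a rough illustration of the pattern (not the bootJVM code itself),
the sketch below shows an allocator that hands each new object slot to
a GC hook, which attaches its own private bookkeeping through a
'pgarbage'-style pointer.  Every name here (demo_object, demo_gc_info,
demo_gc_object_new(), demo_object_new(), the fixed-size table) is a
hypothetical stand-in, and the real GC_OBJECT_NEW() macro may take
different arguments.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical GC-private bookkeeping; NOT bootJVM's actual layout. */
typedef struct {
    unsigned refcount;          /* references currently recorded */
} demo_gc_info;

/* Hypothetical object table slot with an opaque GC pointer. */
typedef struct {
    int           in_use;       /* nonzero once the slot is allocated */
    demo_gc_info *pgarbage;     /* owned by the collector, opaque to the VM */
} demo_object;

#define DEMO_MAX_OBJECTS 8
demo_object demo_table[DEMO_MAX_OBJECTS];

/* Stand-in for a GC "object new" hook: attach zeroed bookkeeping. */
int demo_gc_object_new(demo_object *obj)
{
    assert(obj != NULL);
    obj->pgarbage = calloc(1, sizeof *obj->pgarbage);
    return obj->pgarbage != NULL;
}

/* Stand-in for the VM's 'new': find a free slot, then call the hook.
 * Returns the object hash (table index) or -1 on failure. */
int demo_object_new(void)
{
    for (int i = 0; i < DEMO_MAX_OBJECTS; i++) {
        if (!demo_table[i].in_use) {
            if (!demo_gc_object_new(&demo_table[i])) {
                return -1;
            }
            demo_table[i].in_use = 1;
            return i;
        }
    }
    return -1;                  /* table full */
}
```

The point of the indirection is that the VM never looks inside the
GC pointer, so a different collector can store whatever it needs there.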

Any time a reference to that object is made, its object
hash, of type 'jvm_object_hash', has a reference recorded
by GC_OBJECT_MKREF_FROM_OBJECT() or GC_OBJECT_FIELD_MKREF()
or GC_OBJECT_MKREF_FROM_JVM().  These are for internal
JVM references, references from Java object reference variables,
and references from Java local method reference variables,
respectively.  When the reference is no longer needed, such as
when an object is destroyed or when a local method returns,
then GC_OBJECT_RMREF_FROM_OBJECT() or GC_OBJECT_FIELD_RMREF()
or GC_OBJECT_RMREF_FROM_JVM() is called, respectively.  When
an object is no longer used, GC_OBJECT_DELETE() is called.

When a local method is called, GC_STACK_NEW() is invoked to set
up GC for that stack frame.  Adding and removing objects is done
per above.  Adding and removing references to objects is done
with GC_STACK_MKREF_FROM_JVM() and GC_STACK_RMREF_FROM_JVM().
When it returns, GC_STACK_DELETE() is called.
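
To make the stack frame life cycle concrete, here is a minimal sketch
of per-frame bookkeeping that a collector could keep: a table of which
object hash each local variable slot currently refers to.  The type
frame_roots and the frame_gc_xxx() functions are hypothetical and are
not the GC_STACK_xxx() macros themselves, whose arguments may differ.

```c
#include <assert.h>
#include <stdlib.h>

#define FRAME_NO_OBJECT (-1)    /* slot holds no object reference */

/* Hypothetical per-frame GC record. */
typedef struct {
    int *locals;    /* object hash held by each local slot */
    int  nlocals;
} frame_roots;

/* Set up GC bookkeeping when a method is entered. */
frame_roots *frame_gc_new(int nlocals)
{
    assert(nlocals > 0);
    frame_roots *f = malloc(sizeof *f);
    if (f == NULL) {
        return NULL;
    }
    f->locals = malloc((size_t)nlocals * sizeof *f->locals);
    if (f->locals == NULL) {
        free(f);
        return NULL;
    }
    for (int i = 0; i < nlocals; i++) {
        f->locals[i] = FRAME_NO_OBJECT;
    }
    f->nlocals = nlocals;
    return f;
}

/* Record that a local slot now refers to an object. */
void frame_gc_mkref(frame_roots *f, int slot, int objhash)
{
    assert(f != NULL && slot >= 0 && slot < f->nlocals);
    f->locals[slot] = objhash;
}

/* Record that a local slot no longer refers to any object. */
void frame_gc_rmref(frame_roots *f, int slot)
{
    assert(f != NULL && slot >= 0 && slot < f->nlocals);
    f->locals[slot] = FRAME_NO_OBJECT;
}

/* Count the slots that currently act as roots. */
int frame_gc_count_roots(const frame_roots *f)
{
    int n = 0;
    for (int i = 0; i < f->nlocals; i++) {
        if (f->locals[i] != FRAME_NO_OBJECT) {
            n++;
        }
    }
    return n;
}

/* Tear down the bookkeeping when the method returns. */
void frame_gc_delete(frame_roots *f)
{
    if (f != NULL) {
        free(f->locals);
        free(f);
    }
}
```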

> - There are many styles of collector, but the most common class uses
> tracing, in which a root set of pointers is used to determine an initial
> set of live objects, and the collector performs a transitive closure over
> this set to establish the set of all live objects.  The root set is
> commonly the thread stacks and the static pointer fields.
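
For readers new to the idea, the transitive closure described above
can be sketched as a small mark phase over a toy heap.  This is a
generic textbook-style illustration, not code from either VM; the
tr_object layout, the fixed field count, and tr_mark() are all
assumptions made for the example.

```c
#include <assert.h>

#define TR_MAX_OBJECTS 8
#define TR_MAX_FIELDS  2
#define TR_NULL        (-1)     /* a null reference */

/* Toy heap object: reference fields are indices into the heap array. */
typedef struct {
    int fields[TR_MAX_FIELDS];
    int marked;
} tr_object;

/* Mark every object reachable from the root set.
 * Returns the number of live objects found. */
int tr_mark(tr_object *heap, int nobjects, const int *roots, int nroots)
{
    int worklist[TR_MAX_OBJECTS];
    int top = 0;
    int live = 0;

    assert(nobjects >= 0 && nobjects <= TR_MAX_OBJECTS);

    for (int i = 0; i < nobjects; i++) {
        heap[i].marked = 0;
    }

    /* Seed the worklist with the roots. */
    for (int i = 0; i < nroots; i++) {
        int r = roots[i];
        if (r != TR_NULL && !heap[r].marked) {
            heap[r].marked = 1;
            worklist[top++] = r;
        }
    }

    /* Follow reference fields until no unmarked object is reachable.
     * Each object is pushed at most once, so the worklist cannot
     * overflow. */
    while (top > 0) {
        int cur = worklist[--top];
        live++;
        for (int f = 0; f < TR_MAX_FIELDS; f++) {
            int child = heap[cur].fields[f];
            if (child != TR_NULL && !heap[child].marked) {
                heap[child].marked = 1;
                worklist[top++] = child;
            }
        }
    }
    return live;
}
```

Anything left unmarked afterwards is unreachable from the roots and
can be reclaimed, including objects that only reference each other.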

The macro GC_RUN() refers to the collector.  The stub implementation
shows what it is supposed to do.  The OBJECT_STATUS_GCREQ status bit
controls when an object needs collecting.
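
As a sketch of how a collector pass might use such a status bit, the
code below walks an object table and reclaims every slot that is both
in use and flagged for collection.  The SW_STATUS_xxx values, the
sw_object type, and sw_gc_run() are hypothetical; they are not the
definitions behind OBJECT_STATUS_GCREQ or GC_RUN().

```c
#include <assert.h>
#include <stddef.h>

#define SW_STATUS_INUSE 0x1u    /* slot holds an allocated object */
#define SW_STATUS_GCREQ 0x2u    /* object has been flagged for collection */

/* Hypothetical object table slot carrying only its status bits. */
typedef struct {
    unsigned status;
} sw_object;

/* One collector pass: free every in-use slot that is flagged.
 * Returns the number of slots reclaimed. */
int sw_gc_run(sw_object *table, int n)
{
    const unsigned wanted = SW_STATUS_INUSE | SW_STATUS_GCREQ;
    int reclaimed = 0;

    assert(table != NULL && n >= 0);

    for (int i = 0; i < n; i++) {
        if ((table[i].status & wanted) == wanted) {
            table[i].status = 0;    /* slot becomes free for reuse */
            reclaimed++;
        }
    }
    return reclaimed;
}
```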

All of the above setting up and tearing down of objects and references
to objects have equivalents for classes using the same rationale.
(The class GC pointer is 'rclass.pgarbage' in 'jvm/src/class.h'.
The local method GC pointer is 'JVMREG_STACK_GC_OFFSET' in
'jvm/src/jvmreg.h' and is manipulated by GC_STACK_NEW() and
GC_STACK_DELETE().)

> - The above is also complicated by
>   . Reference types
>   . Finalization
>   . Locks
>   . Address-based hashing
>   The solutions to these are all pretty well known, but complicate the
>   design
> 
> This is pretty much it - the rest (45 years of research) is optimizing the
> way this is all done.
> 
> >           Perhaps I am taking it a bit too simplistically,
> > but all I did was define a GC hook anywhere an object or
> > array reference or a class load or unload happened.  How
> > does your concept differ?  And what sort of approach should
> > be used instead.
> 
> This I think brings us back to my initial question, asking what these
> hooks were supposed to do.  I guess you're saying you had a vague idea
> that the GC might need to know about these events, so put hooks in for
> them.
> 
> When I was looking around the code trying to find out where to start
> hooking in a managed heap, I looked for the 'new' or 'alloc' operation,
> and couldn't seem to find it.
> 

'new' --  see object_instance_new() for objects,
          class_static_new() for classes.

'delete' -- see object_instance_delete() for objects,
            class_static_delete() for classes.

All four of these functions perform a number of GC activities,
especially calling the GC_xxx() macros listed above.

Notice that, throughout the body of the code, _any_ time a reference
to an object (or class) is created or destroyed, one or more related
GC functions are called.

What is _done_ with these events is up to the implementor
of the GC algorithm itself.

> Interface design for memory managers is an interesting research question. 
> Weldon has posted the ORP interface, which I think is probably pretty
> close to a good design.  I would make some additional operations explicit
> in the interface, and abstract over some of the features that are a
> 'shared understanding' between GC and VM in the ORP design, but I think a
> final design would have a lot in common with it.  The MMTk interface is
> the one I know best, and while it isn't perfect in itself, my improvements
> to the ORP design would probably involve taking features from the MMTk
> interface and adapting them to the environment.
> 
> > be used instead.  GC is a completely new area to me and I
> > am very interested in learning about what could be done.
> > One of these days I'm going to read the standard (Jones?)
> > book on JVM garbage collection, but not today.
> 
> Yep - Jones and Lins is the standard reference.  If you're looking for a
> good brief introduction, you could do much worse than the wikipedia entry,
> which seems to cover most of the important parts.
> 
> > Even if you don't have your proposal ready now, I'm still
> > interested in hearing how you approach the foundation of
> > GC design and how you would approach it on this JVM.
> 
> I think it's important that the initial interface contains all the features
> that have ramifications for the core design ideas of the rest of the
> system.  Hope to have a draft out next week.
> 

I'm looking forward to it!


> cheers,
> Robin
> 

Dan Lydick



