harmony-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Simon Zhou <simon.harm...@gmail.com>
Subject Re: [GSoC] Proposal of GC-1: implement WeakReference support in Harmony concurrent GC
Date Wed, 15 Apr 2009 05:27:56 GMT
Thank you Xiao-feng It is very helpful!

I am thinking of take 1~5 as my step one.

In step 2, I will implement a STW version of finalizer processing.
IMHO, maybe I need implement a new interface of VMCore for suspending all
java threads, now it only has a interface for suspending and enumerating
rootset, but I do not need rootset in this case, maybe a new interface just
for suspending is more efficient, I guess.
then I will do some experimental study on the finalization to get the impact
on the pause time.

Thanks
Simon

2009/4/14 Xiao-Feng Li <xiaofeng.li@gmail.com>

> Simon, here is my understanding.
>
> GC has following things with weakref in SATB:
> 1. to stop tracing when meeting a weakref object, and remember it in a
> GC-weakref-queue for later processing;
> 2. to catch the referent of a weakref get(), and remember it as a
> dirty object (or a normal reachable object);
> 3. During marking, the remembered dirty objects should be traced as roots.
>
> 4. When marking finishes, GC need to scan the GC-weakref-queue to find
> unreachable referents, etc. Those weakref's referent field is cleared
> and passed to VM-weakref-queue for further processing. (VM processes
> them by executing enqueue() method of them.)
> 5. to sweep.
>
> Step 4 must be executed after marking finishes, but step 4 can be
> executed concurrently with mutators, and the write barrier can be
> turned off.
>
> Note, the above doesn't include finalizer processing. Finalization
> process needs to trace the finalizable objects as normal scanning,
> which might have race condition with mutators. So write barrier should
> be kept for it.
>
> So in my opinion, step 4 can be conducted in STW minor with finalizer
> processing. Then it will be very similar to what Harmony GC already
> has.
>
> For step 1, check scan_weak_reference()'s call sites and implementation;
> For step 4, check collector_identify_finref()'s call sites and
> implementation.
>
> Thanks,
> xiaofeng
>
> On Tue, Apr 14, 2009 at 10:33 AM, Simon Zhou <simon.harmony@gmail.com>
> wrote:
> > After some investigations, I am wondering when to processing WeakRef for
> > concurrent GC. my understanding is as follows
> >
> > IMHO, we can process them, also concurrently with mutators, when the
> > concurrent marking finishes (for 2 SATB alogorithms). the advantage is
> that,
> > the data structure is easy to implement, I guess.
> >
> > Another choice is to process them along with the marking phase, that is,
> we
> > should construct sync queues for 3 kinds of reference object, the read
> > barrier will enqueue and GC threads will dequeue. The WeakRef processing
> > will finishes with the concurrent marking phase.
> > For mostly concurrent algorithm, we can just process it in the short STW
> > phase.
> > Would you like to give some comments on this? Thank you very much!
> >
> > Thanks
> > Simon
> >
> > 2009/4/1 Xiao-Feng Li <xiaofeng.li@gmail.com>
> >
> >> Simon, I've got some time reviewing your proposal. It looks very
> >> feasible. Probably you'd better read my blog entry on WeakRef and
> >> Finalizer processing of Harmony [1].
> >>
> >> One question, will you implement weakref support for all the three
> >> concurrent GC algorithm?
> >>
> >> Btw, please go ahead to submit/update your proposal to the GSoC
> >> application web.
> >>
> >> [1]
> >>
> http://xiao-feng.blogspot.com/2007/05/weak-reference-processing-in-apache.html
> >>
> >> Thanks,
> >> xiaofeng
> >>
> >> On Tue, Mar 31, 2009 at 6:06 PM, Simon Zhou <simon.harmony@gmail.com>
> >> wrote:
> >>  > Hi All,
> >> > Here is my proposal, any suggestion is welcome!
> >> >
> >> > *Title/Summary:* GC-1: Implement WeakReference support in Harmony
> >> Concurent
> >> > GC
> >> >
> >> > *Student:* Simon.Zhou
> >> >
> >> > *Student e-mail:* simon.harmony@gmail.com
> >> >
> >> > *Student Major:* Computer Science
> >> >
> >> > *Student Degree:* Master
> >> >
> >> > *Student Graduation:* 2009.7
> >> >
> >> > *Organization:* Fudan University
> >> >
> >> > *Assigned Mentor:*
> >> >
> >> > *Abstract:*
> >> >
> >> > weak reference allows a program to maintain special reference to
> objects,
> >> > which is a useful API for the program interacting with the garbage
> >> > collector to some extent. Apache Harmony has a concurrent garbage
> >> collector
> >> > (code name: Tick) which can perform the garbage collection without
> >> > suspending the application threads except for root set enumeration.
> Tick
> >> do
> >> > not have weak reference supporting now, it treats all the references
> >> object
> >> > just as  strong references, which may cause inefficiency for
> applications
> >> > using weak references. In this project, I will add the weak reference
> >> > support for Tick, which would be different from the implementation in
> >> > generational GC, because the consistency should be maintained for the
> >> heap
> >> > objects carefully. Read barrier of the get() method of reference
> object
> >> will
> >> > be used for this implementation, and the performance issue will be
> >> seriously
> >> > considered. before implement phantom reference, I will also implement
> the
> >> > finalizer feature for Tick, although it is not recommanded to be used
> >> except
> >> > it is necessary. Besides this, I am trying to reduce the floating
> garbage
> >> > for Tick’s concurrent algorithms.
> >> >
> >> > *Detailed Description:*
> >> >
> >> > Tick is a concurrent GC of Harmony DRLVM, it can do the garbage
> >> collection
> >> > without suspending application threads. For concurrent GC, the most
> >> > important issue that should be dealt with is the inconsistency
> problem,
> >> that
> >> > is if we trace the heap object graph while the mutator updating the
> same
> >> > graph, inconsistency problem occurs, some live objects may be mistaken
> >> > relaimed as garbage. In order to keep the consistency of objects,
> write
> >> > barriers is inserted when concurrent tracing starts, which intercept
> all
> >> the
> >> > write operations of object reference and record useful data for
> >> rescaning.
> >> > Now there are 3 different concurrent collection algorithms in Tick:
> DLG,
> >> > Sliding-View, Mostly-Concurrent; and 3 different write barriers are
> used
> >> > repectively. see more information [1].
> >> >
> >> > Experiment result shows that Tick can significantly reduce or even
> >> eliminate
> >> > the pause time caused by garbage collection. However, one drawback of
> >> Tick
> >> > is that, it do not support weak reference mechanism, which is a
> important
> >> > API of Java programming language.
> >> >
> >> > Weak reference is an useful tool for whom are much care about the
> >> > application memory usage, e.g. it can help developers to cache some
> >> > temporary large objects for later reusing; also, to do some cleanup
> >> > operations just when a object is acturally relaimed. There are 3 kinds
> of
> >> > weak reference class in Java language (the weakness is decreasingly):
> >> soft
> >> > reference, weak reference, phantom reference. Soft reference is
> usually
> >> used
> >> > to maintain better cache for large object, such as pictures load from
> >> web;
> >> > weak reference can help to reduce the frequency of creating short-live
> >> > objects; phantom reference can usully help programmer to do cleanup
> >> > operations just when an object is physically relaimed. (from now on, I
> >> will
> >> > use ‘weakref’ to represent the set of all the three kinds of weak
> >> > references) All the three kinds of reference classes have a get()
> method,
> >> if
> >> > the object has not been actually relaimed, the get() method will
> return a
> >> > reference of the referent object (referent object is the object which
> >> > reference object point to) except phantom reference (phantom
> reference’s
> >> > get() method always return null, since it is just used for cleanup
> >> > operations when object is actually relaimed). see more information
> [2,3]
> >> >
> >> > Weakrefs are always processed in GC. For generational GC, weakref
> feature
> >> > can be implemented easily: when minor collection occurs, weak
> reference
> >> > objects are equeued and their referents are reclaimed. when major
> >> collection
> >> > occurs, soft reference objects are enqueued and their referents are
> >> > reclaimed. when finalizer are processed, phantom reference objects are
> >> > enqueued and their referents are reclaimed.
> >> >
> >> > Unlike generational GC, Tick just simply treats all the references in
> the
> >> > heap as strong references, which make memory pressure become bigger
> for
> >> the
> >> > applications using much data structure based on weakrefs, such as
> >> > WeakHashMap. In this project, my job is to implement this weakref
> feature
> >> > for Tick, and this work should be independent to the concurrent
> >> algorithms
> >> > used in Tick.
> >> >
> >> > For concurrent GC, weakref processing is different with generational
> GC,
> >> > because the tracing thread is running concurrently with application
> >> threads.
> >> > Application threads may invoke the get() method to get the reference
> of
> >> > referent object, it may:
> >> > 1, put it to the thread local stack as local var or calling parameter,
> >> such
> >> > as
> >> >    Object localvar = weakref.get();
> >> >    functionA(weakref.get());
> >> > 2, assign it to another object slot, such as
> >> >    obj.field = weakref.get();
> >> > these operations will change the reachability of referent object,
> which
> >> may
> >> > cause the inconsistency issues in concurrent GC. In order to deal with
> >> it,
> >> > what we should do is to intercept these operations while application
> >> thread
> >> > is running. Because all these operations use the get() method to get
> the
> >> > reference of referent object, inserting the read barrier in the get()
> is
> >> a
> >> > simple and effective approach. The main job of the read barrier is to
> add
> >> > the pointer of referent object to a remembered set, then the garbage
> >> > collection thread will pick it up and trace from it.
> >> >
> >> > I divided the work into 3 major steps:
> >> >
> >> > The first step is to implement the read barrier for get() method of
> >> > reference object. My approch is implementing the barrier using a
> VMHelper
> >> > method, which is a VMMagic based helper function writen in Java. That
> is,
> >> > using Java to write some hot funtions in JVM, then these Java methods
> can
> >> be
> >> > compilered and even inlined into the native code produced by JIT
> compiler
> >> > [4]. This is very useful approach to reduce the read barrier’s
> overhead.
> >> In
> >> > this step, I will write a Java version read barrier and add some code
> to
> >> the
> >> > JIT compiler (Jitrino and JET) to make it compile the barrier
> correctly
> >> and
> >> > effiently.
> >> >
> >> > The second step is to implement the weakref processing module in Tick.
> >> This
> >> > module will collect all the weakrefs while concurrent marking and put
> >> them
> >> > to different queues depending on its reference type. When concurrent
> >> marking
> >> > finishes, GC threads pick up the reference object in the queues to see
> if
> >> > the referent object is marked, if so, the referent object is live and
> >> should
> >> > not be processed, otherwise, process the referent object according the
> >> type
> >> > of the reference object respectively. moreover, if the reference
> object
> >> is
> >> > intercepted by the read barrier in the get() method, its referent
> object
> >> > should be marked while tracing, so the consistency is well maintained.
> >> >
> >> > After this step, I will test Tick with benchmark using weakrefs, or
> using
> >> > data structure based on the weakrefs (such as WeakHashMap). The
> >> experiment
> >> > result will be analyzed and compared to Tick without weakrefs
> supporting
> >> and
> >> > generational GC with weakrefs supporting. Then I will form the result
> >> data
> >> > to a document report.
> >> >
> >> > The last step is trying to reduce the floating garabge [5] for
> concurrent
> >> > GC. This work is not related to weakrefs feature, but its effect may
> be
> >> > similar to it, since the major goal of reducing floating garbage is to
> >> ease
> >> > the memory pressure while using concurrent GC. Floating garbage always
> >> exist
> >> > and may not be totally eliminated in concurrent GC, my job is trying
> to
> >> > optimize its impact as much as possible.
> >> >
> >> > *Additional Information:*
> >> >
> >> > Personal Information:
> >> >  I am a master canidate of Computer Science in Parallel Processing
> >> > Institution, Fudan University. My interested area includes Compiler,
> >> > Computer Architecure, Managed Runtime etc. I started to follow Harmony
> >> with
> >> > interests 2 year ago, and I used to contribute to DRLVM and have
> submited
> >> > some patches to the community. As an intern, I improved Harmony
> >> concurrent
> >> > GC by using state-based phase design and implementing an efficient
> >> scheduler
> >> > in the summer of 2008, the patch of this work has been merged to the
> >> source
> >> > trunk [6].
> >> >
> >> > Schedule:
> >> >
> >> >  I can work at least 3 days per week for this work, my schedule this
> >> > project as follow:
> >> >  April 10 ~ April 26 Communicate with the mentor and design the
> interface
> >> > of the implementation, such as, the native interface of get() method,
> >> weak
> >> > reference processing interfaces in GC module. Write down the design
> >> > document.
> >> >
> >> >  April 27 ~ May 24 implementing read barrier for get() method of
> >> reference
> >> > object.
> >> >
> >> >  May 25 ~ June 30 implementing the weak and soft reference processing
> >> > features in GC module, add finalizer feature to Tick
> >> >
> >> >  July 1 ~ July 15 implementing phantom reference processing features.
> >> >
> >> >  July 16 ~ Aug 1 test and analyze Tick with weak reference related
> >> > benchmark. write document for implementing and testing.
> >> >
> >> >  Aug 1 ~ Aug 15 try to improve Tick by reducing floating garbage.
> >> >
> >> > *References:*
> >> >
> >> > [1]
> >> >
> >>
> http://xiao-feng.blogspot.com/2007/03/comparison-between-snapshot-based-gc.html
> >> > [2] http://www.realjenius.com/node/377
> >> > [3] http://www.pawlan.com/Monica/refobjs/
> >> > [4]
> >> >
> >>
> http://mail-archives.apache.org/mod_mbox/harmony-dev/200505.mbox/%3C4293E6BD.1060303@anu.edu.au%3E
> >> > [5] http://www.memorymanagement.org/glossary/f.html#floating.garbage
> >> > [6] http://issues.apache.org/jira/browse/HARMONY-5989
> >> >
> >> > --
> >> > From : Simon.Zhou@PPI, Fudan University
> >> >
> >>
> >>
> >>
> >>  --
> >> http://people.apache.org/~xli <http://people.apache.org/%7Exli>
> >>
> >
> >
> >
> > --
> > From : Simon.Zhou@PPI, Fudan University
> >
>
>
>
> --
> http://people.apache.org/~xli <http://people.apache.org/%7Exli>
>



-- 
>From : Simon.Zhou@PPI, Fudan University

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message