harmony-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Xiao-Feng Li <xiaofeng...@gmail.com>
Subject Re: [GSoC] Proposal of GC-1: implement WeakReference support in Harmony concurrent GC
Date Wed, 01 Apr 2009 12:20:59 GMT
Simon, I've got some time reviewing your proposal. It looks very
feasible. Probably you'd better read my blog entry on WeakRef and
Finalizer processing of Harmony [1].

One question, will you implement weakref support for all the three
concurrent GC algorithm?

Btw, please go ahead to submit/update your proposal to the GSoC
application web.

[1] http://xiao-feng.blogspot.com/2007/05/weak-reference-processing-in-apache.html

Thanks,
xiaofeng

On Tue, Mar 31, 2009 at 6:06 PM, Simon Zhou <simon.harmony@gmail.com> wrote:
> Hi All,
> Here is my proposal, any suggestion is welcome!
>
> *Title/Summary:* GC-1: Implement WeakReference support in Harmony Concurent
> GC
>
> *Student:* Simon.Zhou
>
> *Student e-mail:* simon.harmony@gmail.com
>
> *Student Major:* Computer Science
>
> *Student Degree:* Master
>
> *Student Graduation:* 2009.7
>
> *Organization:* Fudan University
>
> *Assigned Mentor:*
>
> *Abstract:*
>
> weak reference allows a program to maintain special reference to objects,
> which is a useful API for the program interacting with the garbage
> collector to some extent. Apache Harmony has a concurrent garbage collector
> (code name: Tick) which can perform the garbage collection without
> suspending the application threads except for root set enumeration. Tick do
> not have weak reference supporting now, it treats all the references object
> just as  strong references, which may cause inefficiency for applications
> using weak references. In this project, I will add the weak reference
> support for Tick, which would be different from the implementation in
> generational GC, because the consistency should be maintained for the heap
> objects carefully. Read barrier of the get() method of reference object will
> be used for this implementation, and the performance issue will be seriously
> considered. before implement phantom reference, I will also implement the
> finalizer feature for Tick, although it is not recommanded to be used except
> it is necessary. Besides this, I am trying to reduce the floating garbage
> for Tick’s concurrent algorithms.
>
> *Detailed Description:*
>
> Tick is a concurrent GC of Harmony DRLVM, it can do the garbage collection
> without suspending application threads. For concurrent GC, the most
> important issue that should be dealt with is the inconsistency problem, that
> is if we trace the heap object graph while the mutator updating the same
> graph, inconsistency problem occurs, some live objects may be mistaken
> relaimed as garbage. In order to keep the consistency of objects, write
> barriers is inserted when concurrent tracing starts, which intercept all the
> write operations of object reference and record useful data for rescaning.
> Now there are 3 different concurrent collection algorithms in Tick: DLG,
> Sliding-View, Mostly-Concurrent; and 3 different write barriers are used
> repectively. see more information [1].
>
> Experiment result shows that Tick can significantly reduce or even eliminate
> the pause time caused by garbage collection. However, one drawback of Tick
> is that, it do not support weak reference mechanism, which is a important
> API of Java programming language.
>
> Weak reference is an useful tool for whom are much care about the
> application memory usage, e.g. it can help developers to cache some
> temporary large objects for later reusing; also, to do some cleanup
> operations just when a object is acturally relaimed. There are 3 kinds of
> weak reference class in Java language (the weakness is decreasingly): soft
> reference, weak reference, phantom reference. Soft reference is usually used
> to maintain better cache for large object, such as pictures load from web;
> weak reference can help to reduce the frequency of creating short-live
> objects; phantom reference can usully help programmer to do cleanup
> operations just when an object is physically relaimed. (from now on, I will
> use ‘weakref’ to represent the set of all the three kinds of weak
> references) All the three kinds of reference classes have a get() method, if
> the object has not been actually relaimed, the get() method will return a
> reference of the referent object (referent object is the object which
> reference object point to) except phantom reference (phantom reference’s
> get() method always return null, since it is just used for cleanup
> operations when object is actually relaimed). see more information [2,3]
>
> Weakrefs are always processed in GC. For generational GC, weakref feature
> can be implemented easily: when minor collection occurs, weak reference
> objects are equeued and their referents are reclaimed. when major collection
> occurs, soft reference objects are enqueued and their referents are
> reclaimed. when finalizer are processed, phantom reference objects are
> enqueued and their referents are reclaimed.
>
> Unlike generational GC, Tick just simply treats all the references in the
> heap as strong references, which make memory pressure become bigger for the
> applications using much data structure based on weakrefs, such as
> WeakHashMap. In this project, my job is to implement this weakref feature
> for Tick, and this work should be independent to the concurrent algorithms
> used in Tick.
>
> For concurrent GC, weakref processing is different with generational GC,
> because the tracing thread is running concurrently with application threads.
> Application threads may invoke the get() method to get the reference of
> referent object, it may:
> 1, put it to the thread local stack as local var or calling parameter, such
> as
>    Object localvar = weakref.get();
>    functionA(weakref.get());
> 2, assign it to another object slot, such as
>    obj.field = weakref.get();
> these operations will change the reachability of referent object, which may
> cause the inconsistency issues in concurrent GC. In order to deal with it,
> what we should do is to intercept these operations while application thread
> is running. Because all these operations use the get() method to get the
> reference of referent object, inserting the read barrier in the get() is a
> simple and effective approach. The main job of the read barrier is to add
> the pointer of referent object to a remembered set, then the garbage
> collection thread will pick it up and trace from it.
>
> I divided the work into 3 major steps:
>
> The first step is to implement the read barrier for get() method of
> reference object. My approch is implementing the barrier using a VMHelper
> method, which is a VMMagic based helper function writen in Java. That is,
> using Java to write some hot funtions in JVM, then these Java methods can be
> compilered and even inlined into the native code produced by JIT compiler
> [4]. This is very useful approach to reduce the read barrier’s overhead. In
> this step, I will write a Java version read barrier and add some code to the
> JIT compiler (Jitrino and JET) to make it compile the barrier correctly and
> effiently.
>
> The second step is to implement the weakref processing module in Tick. This
> module will collect all the weakrefs while concurrent marking and put them
> to different queues depending on its reference type. When concurrent marking
> finishes, GC threads pick up the reference object in the queues to see if
> the referent object is marked, if so, the referent object is live and should
> not be processed, otherwise, process the referent object according the type
> of the reference object respectively. moreover, if the reference object is
> intercepted by the read barrier in the get() method, its referent object
> should be marked while tracing, so the consistency is well maintained.
>
> After this step, I will test Tick with benchmark using weakrefs, or using
> data structure based on the weakrefs (such as WeakHashMap). The experiment
> result will be analyzed and compared to Tick without weakrefs supporting and
> generational GC with weakrefs supporting. Then I will form the result data
> to a document report.
>
> The last step is trying to reduce the floating garabge [5] for concurrent
> GC. This work is not related to weakrefs feature, but its effect may be
> similar to it, since the major goal of reducing floating garbage is to ease
> the memory pressure while using concurrent GC. Floating garbage always exist
> and may not be totally eliminated in concurrent GC, my job is trying to
> optimize its impact as much as possible.
>
> *Additional Information:*
>
> Personal Information:
>  I am a master canidate of Computer Science in Parallel Processing
> Institution, Fudan University. My interested area includes Compiler,
> Computer Architecure, Managed Runtime etc. I started to follow Harmony with
> interests 2 year ago, and I used to contribute to DRLVM and have submited
> some patches to the community. As an intern, I improved Harmony concurrent
> GC by using state-based phase design and implementing an efficient scheduler
> in the summer of 2008, the patch of this work has been merged to the source
> trunk [6].
>
> Schedule:
>
>  I can work at least 3 days per week for this work, my schedule this
> project as follow:
>  April 10 ~ April 26 Communicate with the mentor and design the interface
> of the implementation, such as, the native interface of get() method, weak
> reference processing interfaces in GC module. Write down the design
> document.
>
>  April 27 ~ May 24 implementing read barrier for get() method of reference
> object.
>
>  May 25 ~ June 30 implementing the weak and soft reference processing
> features in GC module, add finalizer feature to Tick
>
>  July 1 ~ July 15 implementing phantom reference processing features.
>
>  July 16 ~ Aug 1 test and analyze Tick with weak reference related
> benchmark. write document for implementing and testing.
>
>  Aug 1 ~ Aug 15 try to improve Tick by reducing floating garbage.
>
> *References:*
>
> [1]
> http://xiao-feng.blogspot.com/2007/03/comparison-between-snapshot-based-gc.html
> [2] http://www.realjenius.com/node/377
> [3] http://www.pawlan.com/Monica/refobjs/
> [4]
> http://mail-archives.apache.org/mod_mbox/harmony-dev/200505.mbox/%3C4293E6BD.1060303@anu.edu.au%3E
> [5] http://www.memorymanagement.org/glossary/f.html#floating.garbage
> [6] http://issues.apache.org/jira/browse/HARMONY-5989
>
> --
> From : Simon.Zhou@PPI, Fudan University
>



-- 
http://people.apache.org/~xli

Mime
View raw message