harmony-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Senaka Fernando" <senaka...@gmail.com>
Subject Re: [harmony-gc-5]updates on Parrot GC [was Re: Comparison of Harmony GC_Gen vs. Parrot]
Date Sun, 13 Apr 2008 12:11:53 GMT
Hi Xiao-Feng,

Here is a detailed answer to your questions. I have copy-pasted some info
from Parrot documentation on GC.

Objects on the heap are laid out as PMCs (PObjects), and buffers. Allocation
is pool based. The Arenas structure holds function pointers for the core
defined interface of the currently active GC subsystem: "init_pool",
"do_gc_mark",        "finalize_gc_system". It holds various accounting
information for the GC subsystem, including how many GC runs have been
completed, amount of memory allocated since the last run, and total memory
allocated. This accounting information is updated by the GC system. The
current block level for GC mark and sweep phases is stored in the Arenas
structure.

The Memory_Pool structure is a simple memory pool. It contains a pointer to
the top block of the allocated pool, the total allocated size of the pool,
the block size, and some details on the reclamation characteristics of the
pool.

The Small_Object_Pool structure is a richer memory pool for object
allocation.  It tracks details like the number of allocated and free objects
in the pool, a list of free objects, and for the generational GC
implementation maintains linked lists of white, black, and gray PMCs. It
contains a pointer to a simple Memory_Pool (the base storage of the pool).
It holds function pointers for adding and retrieving free objects in the
pool, and for allocating objects.

Each PMC/Buffer will contain a set of flags that will govern the behavior
and state of it in the presence of the GC.
Ex:- PObj_active_destroy_FLAG, PObj_data_is_PMC_array_FLAG, PObj_live_FLAG
etc.

Each PObject has a header which is of type UnionVal, a union of various
fields, in addition to flags. A PMC has a Vtable in it. Thus, each allocated
object will have header info within it.

Each GC core defines 4 function pointers stored in the Small_Object_Pool
structures. One to allocate new objects, another to add a freed object to
the free list, another to retrieve a free object from the free list and one
to reallocate for additional objects. If a Small_Object_Pool is full (no
free objects) a new one will needed to be created. Thus, each object on a
small object pool is like a place holder for a new instance.

Heap is laid out as arenas, having two memory pools and six small object
pools.

There are two marking phases, for PMCs

1. Initial Marking
Each PMC has a "flags" member which, among other things, facilitates garbage
collection. At the beginning of the mark phase, the "PObj_is_live_FLAG" and
"PObj_is_fully_marked_FLAG" are both unset, which flags the PMC as presumed
dead (white). The initial mark phase of the collection cycle goes through
each PMC in the root set and sets the Obj_is_live_FLAG" bit in the "flags"
member (the PMC is gray).  It does not set the "PObj_is_fully_marked_FLAG"
bit (changing the PMC to black), because in the initial mark, the PMCs or
buffers contained by a PMC are not marked. It also appends the PMC to the
end of a list used for further marking. However, if the PMC has already been
marked as black, the current end of list is returned (instead of appending
the already processed PMC) to prevent endless looping.

2. Incremental Marking
After the root set of PMCs have been marked, a series of incremental mark
runs are performed. These may be performed frequently, between other
operations.  The incremental mark runs work to move gray PMCs to black. They
take a PMC from the list for further marking, mark any PMCs or buffers it
contains as gray (the "PObj_is_live_FLAG" is set and the
"PObj_is_fully_marked_FLAG" is left unset), and add the contained PMCs or
buffers to the list for further marking.  If the PMC has a custom mark
function in its vtable, it is called at this point.

For Buffers, no incremental marking is involved.
The initial marking phase also marks the root set of buffers. Because
buffers cannot contain other buffers, they are immediately marked as black
and not added to the list for further marking. Because PMCs may contain
buffers, the buffer collection phase can't run until the incremental marking
of PMCs is completed.

When the list for further marking is empty (all gray PMCs have changed to
black), the collection stage is started. First, PMCs are collected, followed
by buffers. In both cases (PMC and buffer), the "live" and "fully_marked"
flags are reset after examination for reclamation.

To collect PMCs, each PMC arena is examined from the most recently created
backwards.  Each PMC is examined to see if it is live, already on the free
list, or constant.  If it is not, then it is added to the free list and
marked as being on the free list with the "PObj_on_free_list_FLAG".

To collect buffers, each Buffer arena is examined from the most recently
created backwards.  If the buffer is not live, not already on the free list
and it is not a constant or copy on write, then it is added to the free pool
for reuse and marked with the "PObj_on_free_list_FLAG".

Thus, the objects are scanned during the mark phase and then identified as
live, and collected. The collection process is triggered after the marking
is complete.

Allocation of objects is handled by pool structures.

I can relate the necessary source code portions to this discussion if
required. However, some of the above mentioned features are not fully
implemented. But, according to the Parrot community, this will be their
future direction.

I also have fixed most build errors with Parrot & C++ and they are really
happy about it. Now am in the process of resolving some linking conflicts on
Parrot's C++ build.

Regards,
Senaka

On Mon, Apr 7, 2008 at 10:25 AM, Senaka Fernando <senakafdo@gmail.com>
wrote:

> Hi Xiao-Feng,
>
> Thanks for these questions. I believe that they'd be really helpful in
> understanding VM <-> GC assumptions on the Parrot end.
>
> Will work on answering these, and perhaps a comparison with Harmony.
>
> Regards,
> Senaka
>
>
> On Mon, Apr 7, 2008 at 5:11 AM, Xiao-Feng Li <xiaofeng.li@gmail.com>
> wrote:
>
> > Senaka, thanks for the page. I think the most important things are
> > related to the VM <-> GC protocol. Some questions that may help you:
> > 1. How Parrot layout/encode an object/array? fields, size, object
> > header info, etc.
>
>
> > 2. How Parrot layout/arrange the heap? free list? pages?
> > 3. What's the process of an object creation? When and how?
> > 4. How is a collection process triggered?
> > 5. How does Parrot GC trace live objects and collect them?
> >
> > Some of the questions above might be GC internals, so it's more
> > desirable if you can understand the Parrot VM's assumptions on GC.
> > I.e., does it assume the heap is laid out in certain way, does it
> > assume the objects are encoded in certain way, does it assume the
> > roots are enumerated in certain way, etc.? Depending on your progress,
> > more details might be needed later on.
> >
> > For this project, you might have to understand the Parrot current
> > status for the above questions. It helps you and us to identify the
> > key issues to resolve, and the main efforts to be focused on. For GC
> > porting over different VMs, it's not like porting an application over
> > different OSes, because of the implicit assumptions between VM and GC.
> > I would expect some redesign work required for GC porting, hence you
> > have to understand the Parrot design in certain depth.
> >
> > Thanks,
> > xiaofeng
> >
> > On Mon, Apr 7, 2008 at 4:26 AM, Alexei Fedotov <alexei.fedotov@gmail.com>
> > wrote:
> > > Good job, Senaka.
> > >
> > >  The general perception was that internal and external GC interfaces
> > >  were mixed, which maked this document less usable for harmony-gc-5
> > >  project than it could be. For example, sweeping, marking and
> > >  reclaiming are internal interfaces while allocation, stack
> > enumeration
> > >  (please take a look at vm_enumerate_root_set_all_threads) and gc
> > >  invocation are external interfaces. We should pay more attention to
> > >  external interfaces for harmony-gc-5 project.
> > >
> > >  Thanks.
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >  On Sun, Apr 6, 2008 at 8:35 PM, Senaka Fernando <senakafdo@gmail.com>
> > wrote:
> > >  >
> > >  > Hi all,
> > >  >
> > >  >  I have almost finished comparing the two interfaces of Harmony and
> > Parrot.
> > >  >  However, I'm not 100% sure on whether I got everything right, but
> > I believe
> > >  >  that most of it is. Therefore, it would be really great if you
> > could review
> > >  >  the wiki page and let me know whether it is correct and precise.
> > >  >
> > >  >  Sections marked as TBD are not yet finalized. And, I have omitted
> > some
> > >  >  instances of where Harmony supports something and Parrot doesn't
> > for
> > >  >  simplicity.
> > >  >
> > >  >  The wiki page is found at [1]
> > >  >
> > >  >  [1]
> > http://wiki.apache.org/harmony/gc_comparison/gc_gen_harmony_vs_parrot
> > >  >
> > >  >  Regards,
> > >  >  Senaka
> > >  >
> > >
> > >
> > >
> > >  --
> > >  With best regards,
> > >  Alexei
> > >
> >
> >
> >
> > --
> > http://xiao-feng.blogspot.com
> >
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message