harmony-dev mailing list archives

From Robin Garner <robin.gar...@anu.edu.au>
Subject Re: [drlvm] Class unloading support - tested one approach
Date Fri, 10 Nov 2006 06:22:44 GMT
Ivan Volosyuk wrote:
> On 11/9/06, Etienne Gagnon <egagnon@sablevm.org> wrote:
>> Ivan Volosyuk wrote:
>> > We will get rid of false sharing. That's true. But it still be quite
>> > expensive to write those '1' values, because of ping-ponging of the
>> > cache line between processors. I see only one solution to this: use
>> > separate mark bits in vtable per GC thread which should reside in
>> > different cache lines and different from that word containing gcmap
>> > pointer.
>> The only thing that a GC thread does is write "1" in this slot; it never
>> writes "0".  So, it is not very important in what order (or even "when")
>> this word is finally commited to main memory.  As long as there is some
>> barrier before the "end of epoch collection" insuring that all
>> processors cache write buffers are commited to memory before tracing
>> vtables (or gc maps).
>> You don't need memory coherency on write-without-read. :-)
> I don't speak about memory coherency, I speak about bus load with
> useless memory traffic between processors and poor CPU cache usage.
Surely this wouldn't happen in a sufficiently weak memory model?  Let's
just not support x64 :-)

But I think this false sharing may be what kills this particular idea.
The next cheapest option should be to use a side array of bytes - as 
long as calculating the address of the mark byte can be done without any 
loads or register spills, it should still be cheaper than a full 
test-and-mark operation on the vtable.  I guess there are cache policies 
where this may still be slow on an SMP machine.

Side metadata is easiest to do when the objects are in a specific space and
have coarse alignment.  Any ideas what the typical size of a DRLVM vtable
is?  Would 256 bytes be an excessive alignment boundary?
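
A minimal sketch of the side-array idea in C, not DRLVM code. It assumes
all vtables live in one dedicated, 256-byte-aligned space, so the mark
byte's index is just (vtable - base) >> 8. The names vtable_space,
mark_bytes, mark_vtable and the 1 MiB space size are hypothetical, chosen
only for illustration. Because the space is a static array here, its base
address is a link-time constant, so marking should compile to a subtract,
a shift and one byte store, with no read of the mark.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: every vtable starts on a 256-byte boundary. */
#define VTABLE_ALIGN_LOG2  8u
/* Assumption: a hypothetical 1 MiB space reserved for vtables only. */
#define VTABLE_SPACE_BYTES (1u << 20)
#define MARK_COUNT         (VTABLE_SPACE_BYTES >> VTABLE_ALIGN_LOG2)

/* Stand-in for the dedicated vtable space (C11 _Alignas). */
static _Alignas(256) unsigned char vtable_space[VTABLE_SPACE_BYTES];

/* Side array: one mark byte per possible vtable slot. */
static uint8_t mark_bytes[MARK_COUNT];

/* Index of the mark byte for a vtable inside vtable_space. */
static inline size_t mark_index(const void *vtable)
{
    return (size_t)((const unsigned char *)vtable - vtable_space)
           >> VTABLE_ALIGN_LOG2;
}

/* Unconditional store of 1: GC threads never write 0 and never
 * read the byte first, so there is no test-and-mark. */
static inline void mark_vtable(const void *vtable)
{
    mark_bytes[mark_index(vtable)] = 1;
}

/* Read side, intended for after the end-of-epoch barrier. */
static inline int vtable_is_marked(const void *vtable)
{
    return mark_bytes[mark_index(vtable)] != 0;
}

/* Reset all marks before the next unloading epoch. */
static void clear_marks(void)
{
    memset(mark_bytes, 0, sizeof mark_bytes);
}
```

Note that adjacent mark bytes still share a cache line (64 vtables per
64-byte line in this layout), so this does not remove the cache-line
ping-pong between GC threads; it only makes the marking instruction
sequence shorter.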

I'll try it out in the next day or so.  Unfortunately I don't have 
access to anything with more parallelism than a Pentium D, so it's not 
likely to be conclusive.

Robin Garner
Dept. of Computer Science
Australian National University
