harmony-dev mailing list archives

From "Robin Garner" <robin.gar...@anu.edu.au>
Subject Re: [DRLVM][JIT] write barrier broken by new jit opts?
Date Tue, 09 Jan 2007 15:31:53 GMT
> On 09 Jan 2007 16:54:03 +0600, Egor Pasko <egor.pasko@gmail.com> wrote:
>>
>> On the 0x258 day of Apache Harmony Weldon Washburn wrote:
>> > It looks like multiple interesting design topics.  My comments inlined
>> > below.
>> >
>> [..snip..]
>> >
>> > > Hi, I found write barrier in DRLVM can't catch all reference fields
>> > > updates, and the problem is identified to be caused by new jit opts
>> > > that do not observe the write barrier invariant for fields updates.
>> > > For example, JIT may generate code to copy consecutive fields of an
>> > > object without invoking write barrier. In order to revive the write
>> > > barrier functionality while not sacrificing the opt performance,
>> >
>> > Yes.  But first how much performance is being sacrificed?  0.2%?  8%?
>>
>> JIT magic arraycopy gives up to 30% boost on a microbenchmark (see
>> HARMONY-2247). I did not measure boosts on specific benchmarks, but I
>> think users would expect System.arraycopy() to be considerably faster
>> than manual array copying. We should care about performance numbers as
>> such.
>
>
> Let me re-phrase the question.  Arraycopy performance is important and
> deserves the special case treatment it has always gotten.  Setting aside
> arraycopy, how much performance gain can be expected by optimizing
> consecutive writes to fields of an object for the benchmarks we care
> about?
> What about simply marking the consecutive writes regions as
> "Uninterruptible"?  This would eliminate yet another API between the GC
> and JIT.  I think this is basically the same as Robin's suggestion.
>
> Regarding arraycopy, is there a problem with making the entire arraycopy
> loop "Uninterruptible"?  This will impact GC latency but is the impact a
> big deal for workloads we care about?  If it is, why not have the compiler
> unroll the loop a bunch and put WBs every, say, 10th write.  The body of
> 10 writes would be Uninterruptible.

With arraycopy, much of the saving is in barrier costs themselves.  Apart
from the overhead on the write, there's a reduction in remset entries, and
the cost of scanning the object at GC time is minimal for a reference
array.
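
To make the remset point concrete, here is a toy Java model. It is only
a sketch: the class and method names are made up for illustration and are
not DRLVM or MMTk code, and the "remset" is just an in-memory set. It
compares how many entries a per-slot barrier and a per-object barrier
would record for the same reference-array copy.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Set;

/** Toy model (hypothetical names): counts remset entries for two barrier styles. */
final class RemsetSketch {

    /** Slot-remembering barrier: every written slot is recorded. */
    static int copyWithSlotBarrier(Object[] src, Object[] dst) {
        Set<Integer> rememberedSlots = new HashSet<>();
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
            rememberedSlots.add(i);      // one entry per pointer write
        }
        return rememberedSlots.size();
    }

    /** Object-remembering barrier: the destination is recorded once. */
    static int copyWithObjectBarrier(Object[] src, Object[] dst) {
        Set<Object> rememberedObjects =
                Collections.newSetFromMap(new IdentityHashMap<>());
        rememberedObjects.add(dst);      // single entry for the whole array
        System.arraycopy(src, 0, dst, 0, src.length);
        return rememberedObjects.size();
    }

    public static void main(String[] args) {
        Object[] src = new Object[1000];
        Arrays.fill(src, "x");
        System.out.println(copyWithSlotBarrier(src, new Object[1000]));   // 1000
        System.out.println(copyWithObjectBarrier(src, new Object[1000])); // 1
    }
}
```

The single entry means the GC has to scan the whole destination array
later, which is the trade-off mentioned above: cheap for a reference
array, where every slot is a pointer anyway.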

The original motivating benchmark was jess iirc.

Off the top of my head, if the barrier is called before any data is copied
I think the arraycopy code is GC-safe, provided another barrier call is
made after the GC and before the next pointer write.  It's probably better
to make the arraycopy code uninterruptible.
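
A rough way to see the ordering requirement, again as a toy Java model
with made-up names rather than anything from DRLVM: the destination is
remembered before the copy starts, a simulated GC empties the remembered
set partway through, and the copy then either re-issues the barrier or
does not. The assumption baked in here is that a GC consumes and clears
the remset.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

/** Toy model (hypothetical names): object barrier around an interrupted copy. */
final class InterruptedCopySketch {

    /**
     * Copies src into dst, simulating a GC just before the write to index gcAt.
     * Returns true if dst is still in the remembered set when the copy ends.
     */
    static boolean copy(Object[] src, Object[] dst, int gcAt, boolean rebarrierAfterGc) {
        Set<Object> remset = Collections.newSetFromMap(new IdentityHashMap<>());
        remset.add(dst);                 // barrier before any data is copied
        for (int i = 0; i < src.length; i++) {
            if (i == gcAt) {
                remset.clear();          // simulated GC: processes and empties the remset
                if (rebarrierAfterGc) {
                    remset.add(dst);     // second barrier, before the next pointer write
                }
            }
            dst[i] = src[i];
        }
        return remset.contains(dst);
    }

    public static void main(String[] args) {
        Object[] src = {"a", "b", "c", "d"};
        System.out.println(copy(src, new Object[4], 2, false));  // false: later writes are missed
        System.out.println(copy(src, new Object[4], 2, true));   // true
        System.out.println(copy(src, new Object[4], -1, false)); // true: no GC during the copy
    }
}
```

Making the copy uninterruptible corresponds to the third case: no GC can
land inside the loop, so the single up-front barrier is enough.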

My gut feel is that scalars (non-array objects) don't generally have enough
pointers to make the object remembering barrier worthwhile.

>> > > I'd suggest to introduce an object remember write barrier which will
>> > > be invoked after the object is copied. So the JIT doesn't need to
>> > > insert barrier for each field store.
>> >
>> > hmm.... what happens if some other app thread causes a GC to happen in
>> > the middle of writing a bunch of fields of a given object?  If the
>> > gc_heap_wrote_object() is called before JITed code scribbles on slots,
>> > then the early slots will be scanned and handled properly by the GC.
>> > But how do we handle the slots written after the GC completes?  One
>> > approach would be for the JIT to mark such regions of emitted code
>> > "Uninterruptible".  Another approach would be to emit a WB both before
>> > and after a region of multiple ref field scribbles.  In any case, it
>> > looks like we need to patch up the holes in the contract between jit
>> > and gc.  However, as I said above, is there anything wrong with a real
>> > simple dumb contract for now?  That is, each ref write has a matching
>> > WB with no intervening instructions.
>> >
>> > > GC has an interface
>> > > gc_heap_wrote_object(p_obj) for this case.  I think it's ok to
>> > > insert only the runtime native call at first. Then later we can
>> > > consider to inline the object remembering barrier as well as the
>> > > slot remembering barrier.
>> > >
>> > > Thanks,
>> > > xiaofeng
>> > >
>> >
>> >
>> >
>> >
>> >
>> >
>> > --
>> > Weldon Washburn
>> > Intel Enterprise Solutions Software Division
>>
>> --
>> Egor Pasko
>>
>>
>
>
> --
> Weldon Washburn
> Intel Enterprise Solutions Software Division
>


