harmony-dev mailing list archives

From Gregory Shimansky <gshiman...@gmail.com>
Subject Re: [drlvm][reliability tests] Harmony-2986, Dekker's algorithm -- is this a valid test for modern SMP hardware?
Date Wed, 28 Feb 2007 21:38:48 GMT
On Wednesday 28 February 2007 23:28 Weldon Washburn wrote:
> On 2/28/07, Gregory Shimansky <gshimansky@gmail.com> wrote:
> > Weldon Washburn wrote:
> > > On second thought, the only way I know to implement volatile long
> > > (64-bit) Java variables on ia32 is:
> > >
> > > grab critical section
> > > mov [ecx], low32bits;   // to do a write, the code for doing a read is similar
> > > mov [ecx+4], hi32bits;
> > > release critical section
> >
> > Is it possible for 64-bit atomic load stores to use double load/stores
>
> hmm... can you tell us the specific instructions you are suggesting?  I see
> quad loads/stores but can't find the double load/store version.  I also
> tried to find the guarantees on bus transactions.  Somewhere I recall it is
> documented that 4-byte aligned loads/stores are guaranteed to be atomic.
> Maybe there are some new guarantees on 64-bit writes.  In any case, we
> would still have to be compatible with existing Pentium III hardware and
> probably have to go with some sort of critical section approach.

Yes, this is true. I hoped someone would point out exactly whether there are
any 64-bit atomic operations that work with doubles. It seems there aren't,
because the comments on Ivan's patch in HARMONY-2092 say that it is enough to
change the GC and class loader to align objects on a 64-bit boundary for
64-bit load/stores, provided memory fence instructions are also added in the
interpreter.
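
To make the critical-section idea quoted above concrete, here is a minimal
Java-level sketch. It is only an illustration of the approach, not DRLVM code:
the class name LockedLong and its methods are hypothetical, and a real VM
would take the lock inside the runtime rather than in Java source.

```java
/** Hypothetical illustration: a 64-bit cell whose reads and writes share one lock. */
final class LockedLong {
    private final Object lock = new Object(); // stands in for the VM's critical section
    private long value;                       // plain (non-volatile) 64-bit field

    long get() {
        synchronized (lock) {   // grab critical section
            return value;       // both 32-bit halves are read while the lock is held
        }                       // release critical section
    }

    void set(long newValue) {
        synchronized (lock) {   // grab critical section
            value = newValue;   // both 32-bit halves are written while the lock is held
        }                       // release critical section
    }
}
```

Because every access goes through the same lock, a reader can never see the
low half of one write combined with the high half of another, at the cost of
a lock acquisition on each access.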

> > or SSE4 on the processors that have it?
>
> Good point.  I recall old versions were really only focused on multimedia.
> And writing multimedia bits to memory is not sensitive to order or
> atomicity.  In other words, if you are writing to a frame buffer, speed of
> writes is important but the order the bits hit the buffer is not.  Again, I
> looked but could not find the latest info on SSE4 and atomicity.

Actually, it should have been SSE2; I pressed the wrong digit. I just meant
the quad load/stores when I mentioned it.

> > > Some observations:
> > > 1)
> > > Fixing the "volatile long" bug (Harmony-2092) by using critical section
> > > as above should, as a side-effect, allow DekkerTest.java to run.
> > > 2)
> > > Using volatile long sort of, kind of defeats a major reason to use
> > > Dekker algorithm in the first place.  Why bother if the performance is
> > > the same as using critical sections?
> > > 3)
> > > Using "volatile int" in DekkerTest.java probably still fails because
> > > reads can pass writes.  One way to fix this might be to make the JIT
> > > emit r/w memory fence whenever reading/writing the volatile int.  While
> > > memory fences are often cheaper than HW locks, they are not free.
> > > 4)
> > > My guess is that there are no old legacy Java apps that use Dekker
> > > algorithm.  In other words, nobody is dependent on Dekker algorithm
> > > working.  My guess is that they are, however, dependent on volatile
> > > long and volatile int working properly. (which has the side effect of
> > > making Dekker algo work.)
> > >
> > > On 2/21/07, Weldon Washburn <weldonwjw@gmail.com> wrote:
> > >> On 2/21/07, Gregory Shimansky <gshimansky@gmail.com> wrote:
> > >> > On Wednesday 21 February 2007 21:47 Rana Dasgupta wrote:
> > >> > > Weldon,
> > >> > >   But I am not sure why the behavior would be different from J9 on
> > >> > > the same hardware. Do we jit volatiles differently?
> > >>
> > >> The differences in behavior can be caused by lots of things that are
> > >> not related to memory model.  For example the JIT might actually emit
> > >> slightly different code.  Slightly different code can easily open/close
> > >> race conditions.  The important concept is that both J9 and drlvm fail.
> > >> And the failure appears to be because modern hardware is most likely
> > >> not designed to run Dekker's algo without memory fences.
> > >>
> > >> > There is a bug on DRLVM about volatile variables HARMONY-2092. It is
> > >> > about long and double type variables assignments. Is it the same as
> > >> > in Dekker's algorithm?
> > >>
> > >>  DekkerTest.java uses "long" variables.  Yes, this could change the
> > >> rate of failure but not eliminate failures completely.
> > >>
> > >> > On 2/20/07, Weldon Washburn <weldonwjw@gmail.com> wrote:
> > >> > > > It seems Dekker's algorithm is not expected to work on SPARC or
> > >> > > > IA32 SMP boxes unless memory fences are used.  DekkerTest.java in
> > >> > > > Harmony-2986 does not contain memory fences.  The volatile keyword
> > >> > > > guarantees the compiler will write a given variable to memory.
> > >> > > > However, the HW may actually have a write buffer and allow reads
> > >> > > > to pass writes.  As far as I know, the Java language does not
> > >> > > > provide a means to invoke a memory fence.  Thus there is no way to
> > >> > > > fix up DekkerTest.java.  I may be misunderstanding something here.
> > >> > > > Does anyone have comment?
> > >> > > >
> > >> > > > An excellent description of the issues involved is in a David Dice
> > >> > > > presentation at:
> > >> > > >
> > >> > > > http://blogs.sun.com/dave/resource/synchronization-public2.pdf
> > >> > > >
> > >> > > > --
> > >> > > > Weldon Washburn
> > >> > > > Intel Enterprise Solutions Software Division
> > >> >
> > >> > --
> > >> > Gregory
> >
> > --
> > Gregory
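
For reference, observation 3 quoted above (fencing every volatile access) can
be tried out with a small Java sketch of Dekker's algorithm. This is not the
DekkerTest.java from Harmony-2986; the names DekkerLock and DekkerDemo are
hypothetical. It assumes a JVM that implements the Java 5 memory model, under
which volatile reads and writes are totally ordered, so the VM has to emit
whatever fences the hardware needs. Whether a given VM actually does so is
exactly the question in this thread.

```java
/** Hypothetical sketch of Dekker's algorithm for two threads with ids 0 and 1. */
final class DekkerLock {
    // Each flag is its own volatile field; elements of an array would not be volatile.
    private volatile boolean wants0;
    private volatile boolean wants1;
    private volatile int turn; // id of the thread that has priority on contention

    private boolean wants(int id) { return id == 0 ? wants0 : wants1; }

    private void setWants(int id, boolean value) {
        if (id == 0) { wants0 = value; } else { wants1 = value; }
    }

    void lock(int id) {
        int other = 1 - id;
        setWants(id, true);                 // volatile write, then ...
        while (wants(other)) {              // ... volatile read that must not pass it
            if (turn != id) {
                setWants(id, false);        // back off while the other thread has priority
                while (turn != id) { Thread.yield(); }
                setWants(id, true);
            } else {
                Thread.yield();             // we have priority; wait for the other to back off
            }
        }
    }

    void unlock(int id) {
        turn = 1 - id;                      // hand priority to the other thread
        setWants(id, false);
    }
}

final class DekkerDemo {
    private static long counter;            // deliberately plain; guarded only by DekkerLock

    /** Runs two threads that each increment the counter; returns the final value. */
    static long run(int iterations) {
        DekkerLock lock = new DekkerLock();
        counter = 0;
        Thread[] threads = new Thread[2];
        for (int i = 0; i < 2; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                for (int n = 0; n < iterations; n++) {
                    lock.lock(id);
                    counter++;              // critical section
                    lock.unlock(id);
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) { t.join(); }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter;
    }
}
```

If the VM orders the volatile accesses correctly, DekkerDemo.run(n) should
return 2 * n; a smaller value would mean both threads were inside the critical
section at once and an update was lost.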

-- 
Gregory
