harmony-dev mailing list archives

From "Evgueni Brevnov" <evgueni.brev...@gmail.com>
Subject Re: [jira] Updated: (HARMONY-2092) [drlvm][jit] Harmony works with volatile variables incorrectly
Date Fri, 01 Jun 2007 12:04:24 GMT
AFAIU, the spec requires that two writes to different volatile variables
appear in program order. To guarantee that, we need to use at least
read/write barriers. I'm not sure whether the spec makes a stronger
requirement that would force us to lock the system bus. Anyway, using
CMPXCHG8B with the lock prefix seems worth trying, at least from a
program correctness point of view.
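
To make the idea concrete, here is a minimal Java-level sketch (not the
JIT code itself) of an atomic 64-bit store built from a compare-and-swap
retry loop, which is roughly the shape a LOCK CMPXCHG8B sequence would
take. AtomicLong only stands in for the volatile long field so that a CAS
primitive is available, and the names CasStoreSketch, store and load are
made up for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only: models a 64-bit volatile store as a CAS loop.
final class CasStoreSketch {
    // Stands in for the 64-bit volatile field.
    private final AtomicLong cell = new AtomicLong();

    // Stores newValue atomically and returns the number of CAS attempts.
    // With CMPXCHG8B, EDX:EAX would hold the expected value and
    // ECX:EBX the new value; ZF reports whether the swap happened.
    int store(long newValue) {
        int attempts = 0;
        while (true) {
            attempts++;
            long expected = cell.get();
            if (cell.compareAndSet(expected, newValue)) {
                return attempts;
            }
            // Another thread changed the cell between get() and the CAS: retry.
        }
    }

    // Atomic 64-bit read of the whole value.
    long load() {
        return cell.get();
    }

    public static void main(String[] args) {
        CasStoreSketch s = new CasStoreSketch();
        int attempts = s.store(0x0123456789ABCDEFL);
        System.out.println(Long.toHexString(s.load()) + " after " + attempts + " attempt(s)");
    }
}
```

Without contention the loop succeeds on the first attempt; under
contention it retries instead of putting the thread to sleep.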

Thanks
Evgueni

On 6/1/07, Xiao-Feng Li <xiaofeng.li@gmail.com> wrote:
> On 6/1/07, Mikhail Fursov <mike.fursov@gmail.com> wrote:
> > Do we plan to align objects on 8-byte boundaries in Q3?
> > AFAIU this is the only way to avoid the lock prefix and the performance
> > degradation, and it does not require big changes in GC: every object size
> > must be a multiple of 8 and every memory area allocated by GC must be
> > 8-byte aligned. Am I missing something here?
> >
> > It can be less work than making temporary workarounds in the JIT instead of
> > the simple XMM moves we already have.
>
> Mikhail, to align all the objects at 8-byte boundary is indeed an easy
> solution, but it may cause some space overhead (compared to 4-byte
> boundary alignment). The space overhead in turn may lead to
> performance degradation. We can run a quick experiment to see whether
> it causes a visible performance drop on representative workloads and
> benchmarks.
>
> The other solution is to align only certain classes' instances at
> 8-byte boundary, for example, those with volatile long fields. But
> this is not a small change in GC; it would need more time and thorough
> testing.
>
> I can probably try 8-byte alignment for all objects first, to help
> us make a final decision.
>
> Thanks,
> xiaofeng
>
> > On 6/1/07, Pavel Ozhdikhin <pavel.ozhdikhin@gmail.com> wrote:
> > >
> > > On 6/1/07, Weldon Washburn <weldonwjw@gmail.com> wrote:
> > > > On 31 May 2007 00:52:00 +0400, Egor Pasko <egor.pasko@gmail.com> wrote:
> > > > >
> > > > > On the 0x2E6 day of Apache Harmony Xiao-Feng Li wrote:
> > > > > > On 5/30/07, George Timoshenko <george.timoshenko@gmail.com> wrote:
> > > > > > >
> > > > > > > > I had a question in the JIRA about this issue: why don't we use
> > > > > > > > the "lock" prefix for the atomic access?
> > > > > > >
> > > > > > > well...
> > > > > > >
> > > > > > > Originally we split every 64-bit memory access into two 32-bit
> > > > > > > accesses. It makes no sense to set the #LOCK prefix on them
> > > > > > > (there is a gap between the two).
> > > > > > >
> > > > > > > We can only set #LOCK on an instruction that reads/writes the whole
> > > > > > > 64 bits.
> > > > > > >
> > > > > > > The bad thing is that the only such instruction (according to the
> > > > > > > IA32 spec) that accepts #LOCK is CMPXCHG8B (MOVQ, MOVSD and the
> > > > > > > others cannot be used with #LOCK).
> > > > > > >
> > > > > > > This monster (CMPXCHG8B) requires 4 registers:
> > > > > > >
> > > > > > > EAX
> > > > > > > EBX
> > > > > > > ECX
> > > > > > > EDX
> > > > > > >
> > > > > > > and (FLAGS) also.
> > > > > > >
> > > > > > > I am not sure that using CMPXCHG8B will be faster than making
> > > > > > > volatile fields always (artificially) synchronized.
> > > > > >
> > > > > > George, I believe it should be much faster than a synchronized
> > > > > > block, since it is non-blocking under contention. To use cmpxchg,
> > > > > > you need a loop that checks the result and retries until it
> > > > > > succeeds. With a synchronized block, the thread goes to sleep until
> > > > > > it is woken up by the releasing thread.
> > > > >
> > > > > hm, if I am not mistaken most of the time that would be a spin lock
> > > > > with the current thread manager. So, I cannot bet which way is
> > > > > faster. Maybe some expert in TM can tell for sure?
> > > >
> > > >
> > > > This kind of stuff is always empirical. The task is to build, measure,
> > > > and post the results. The wild cards are the workload and the hardware.
> > > > Different combos will lead to different conclusions.
> > > >
> > > > Having said the above, my hunch is to go with CMPXCHG8B for right now.
> > > > The main motivation is that this decouples register assignment from the
> > > > JVM thread subsystem and thus makes things easier to debug. This is
> > > > goodness. Also, running exhaustive studies of different workloads and
> > > > different platforms is not something of high value for a JVM at such an
> > > > early stage of development. In other words, do this analysis once we get
> > > > real workloads like specjappserver running. As already noted, it should
> > > > be easy to re-implement when the time is right.
> > > >
> > > > Interesting background material --- From Jeremy Manson's "The Java Memory
> > > > Model", POPL 2005, section 2.3 it says, "In order to allow for non-blocking
> > > > techniques that communicate between threads, we also want to allow the use
> > > > of _volatile_ variables to synchronize information between threads.  The
> > > > properties of volatile variables arose from the need to provide a way to
> > > > communicate between threads without the overhead of ensuring mutual
> > > > exclusion."  While this does not dictate a solution, it sort of suggests
> > > > using opcodes (lockxxx) instead of bytecodes (monitorenter/monitorexit).
> > >
> > > Adding a monitorenter/monitorexit pair in a place where the author of the
> > > code did not intend to put one may lead to deadlock. So, I'm +1 for
> > > prototyping with CMPXCHG8B first.
> > >
> > > Thanks,
> > > Pavel
> > >
> > > >
> > > >
> > > > > Anyway, both implementations do not seem to be very hard, we could try
> > > > > both ways...
> > > > >
> > > > > --
> > > > > Egor Pasko
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Weldon Washburn
> > > >
> > >
> >
> >
> >
> > --
> > Mikhail Fursov
> >
>
>
> --
> http://xiao-feng.blogspot.com
>
