harmony-dev mailing list archives

From "Mikhail Fursov" <mike.fur...@gmail.com>
Subject Re: [jira] Updated: (HARMONY-2092) [drlvm][jit] Harmony works with volatile variables incorrectly
Date Fri, 01 Jun 2007 08:49:28 GMT
Do we plan to make objects 8-byte aligned in Q3?
AFAIU this is the only way to avoid the lock prefix and the performance
degradation, and it does not require big changes in the GC: every object size
must be a multiple of 8, and every memory area allocated by the GC must be
8-byte aligned. Am I missing something here?
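
For illustration, the two conditions above (sizes that are a multiple of 8, and allocation areas that start on an 8-byte boundary) come down to simple rounding arithmetic. The following is a minimal sketch only, not DRLVM GC code: the class name Align8 and the helpers alignUp and isAligned are hypothetical, and addresses are modelled as plain long values.

```java
// Hypothetical sketch: rounding sizes and addresses up to 8 bytes.
// Not taken from the DRLVM GC; names are invented for illustration.
final class Align8 {
    static final long ALIGNMENT = 8; // must be a power of two

    // Round value up to the next multiple of ALIGNMENT.
    static long alignUp(long value) {
        return (value + ALIGNMENT - 1) & ~(ALIGNMENT - 1);
    }

    // True if the address sits on an ALIGNMENT-byte boundary.
    static boolean isAligned(long address) {
        return (address & (ALIGNMENT - 1)) == 0;
    }

    public static void main(String[] args) {
        long areaStart = alignUp(0x1004L); // start of an allocation area
        long objectSize = alignUp(12);     // a 12-byte object is padded
        long nextObject = areaStart + objectSize;

        System.out.println("area start aligned:  " + isAligned(areaStart));
        System.out.println("padded object size:  " + objectSize);
        System.out.println("next object aligned: " + isAligned(nextObject));
    }
}
```

If the area start and every object size are both multiples of 8, every object start is too, so a 64-bit field at an 8-byte-aligned offset never straddles an 8-byte boundary.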

It could be less work than adding temporary workarounds to the JIT in place of
the simple XMM moves we already have.
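
For comparison, the workaround discussed in the quoted thread below is a compare-and-swap retry loop. Here is a rough Java-level sketch of that loop shape, assuming java.util.concurrent.atomic.AtomicLong as a stand-in for the 64-bit volatile field. The class CasStore and the method atomicStore are hypothetical names, and this is not what the JIT would emit. It only shows the "retry until the compare-and-swap succeeds" pattern.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a 64-bit store built from a compare-and-swap loop.
// AtomicLong stands in for the volatile long field; on IA-32 the
// compareAndSet step is the part a LOCK CMPXCHG8B would perform.
final class CasStore {
    // Atomically replace the cell's value and return the value it replaced.
    static long atomicStore(AtomicLong cell, long newValue) {
        long observed;
        do {
            observed = cell.get(); // current contents, used as the expected value
            // Retry if another thread changed the cell in between.
        } while (!cell.compareAndSet(observed, newValue));
        return observed;
    }

    public static void main(String[] args) {
        AtomicLong field = new AtomicLong(1L);
        long previous = atomicStore(field, 0x1122334455667788L);
        System.out.println("previous = " + previous);
        System.out.println("current  = 0x" + Long.toHexString(field.get()));
    }
}
```

The loop never blocks, but each store costs at least one locked read-modify-write, which is the overhead the aligned XMM moves would avoid.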

On 6/1/07, Pavel Ozhdikhin <pavel.ozhdikhin@gmail.com> wrote:
>
> On 6/1/07, Weldon Washburn <weldonwjw@gmail.com> wrote:
> > On 31 May 2007 00:52:00 +0400, Egor Pasko <egor.pasko@gmail.com> wrote:
> > >
> > > On the 0x2E6 day of Apache Harmony Xiao-Feng Li wrote:
> > > > On 5/30/07, George Timoshenko <george.timoshenko@gmail.com> wrote:
> > > > >
> > > > > > I had a question in the JIRA about this issue: why don't we
> > > > > > use "lock" prefix for the atomic access?
> > > > >
> > > > > well...
> > > > >
> > > > > Originally we split every 64-bit memory access into two 32-bit
> > > > > ones. It does not make sense to set the #LOCK prefix on them
> > > > > (there is a gap between the two).
> > > > >
> > > > > We can only set #LOCK on an instruction that reads/writes the
> > > > > whole 64 bits.
> > > > >
> > > > > The bad thing is that the only instruction (according to the IA32
> > > > > spec) we can set #LOCK on is CMPXCHG8B (MOVQ, MOVSD and any others
> > > > > cannot be used with #LOCK).
> > > > >
> > > > > This monster (CMPXCHG8B) requires 4 registers:
> > > > >
> > > > > EAX
> > > > > EBX
> > > > > ECX
> > > > > EDX
> > > > >
> > > > > and (FLAGS) also.
> > > > >
> > > > > I am not sure CMPXCHG8B usage will be faster than making volatile
> > > > > fields always synchronized (artificially)
> > > >
> > > > George, I believe it should be much faster than a synchronized block,
> > > > since it is non-blocking even under contention. To use cmpxchg, you
> > > > need a loop that checks the result and retries until it succeeds.
> > > > With a synchronized block, the thread will go to sleep until it is
> > > > woken up by the releasing thread.
> > >
> > > hm, if I am not mistaken most of the time that would be a spin lock
> > > with the current thread manager. So, I cannot bet which way is
> > > faster. Maybe some expert in TM can tell for sure?
> >
> >
> > This kind of stuff is always empirical. The task is to build, measure,
> > and post the results.  The wild cards are the workload and the
> > hardware.  Different combos will lead to different conclusions.
> >
> > Having said the above, my hunch is to go with CMPXCHG8B for right
> > now.  The main motivation is that this decouples register assignment
> > from the jvm thread subsystem and thus makes things easier to debug.
> > This is goodness.  Also, running exhaustive studies of different
> > workloads and different platforms is not something of high value for a
> > JVM at such an early stage of development.  In other words, do this
> > analysis once we get real workloads like SPECjAppServer running.  As
> > already noted, it should be easy to re-implement when the time is right.
> >
> > Interesting background material: Jeremy Manson's "The Java Memory
> > Model", POPL 2005, section 2.3 says, "In order to allow for
> > non-blocking techniques that communicate between threads, we also want
> > to allow the use of _volatile_ variables to synchronize information
> > between threads.  The properties of volatile variables arose from the
> > need to provide a way to communicate between threads without the
> > overhead of ensuring mutual exclusion."  While this does not dictate a
> > solution, it sort of suggests using opcodes (lockxxx) instead of
> > bytecodes (monitorenter/monitorexit).
>
> Adding a monitorenter/monitorexit pair in a place where the author of
> the code did not intend to put one may lead to deadlock. So, I'm +1 for
> prototyping with CMPXCHG8B first.
>
> Thanks,
> Pavel
>
> >
> >
> > > Anyway, both implementations do not seem to be very hard, we could try
> > > both ways...
> > >
> > > --
> > > Egor Pasko
> > >
> > >
> >
> >
> > --
> > Weldon Washburn
> >
>



-- 
Mikhail Fursov
