harmony-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Egor Pasko <egor.pa...@gmail.com>
Subject Re: [jira] Created: (HARMONY-6404) possible data-reordering in some hashCode-methods (e.g. String or URL)
Date Fri, 11 Dec 2009 14:32:50 GMT
On the 0x684 day of Apache Harmony Tim Ellison wrote:
> On 11/Dec/2009 04:09, Vijay Menon wrote:
>> Perhaps I'm missing some context, but I don't see any problem.  I don't
>> believe that this:
>> 
>>         if (hashCode == 0) {
>>             // calculate hash
>>             hashCode = hash;
>>         }
>>         return hashCode;
>> 
>> can ever return 0 (assuming hash is non-zero), regardless of memory fences.
>>  The JMM won't allow visible reordering in a single threaded program.
>
> I agree.  In the multi-threaded case there can be a data race on the
> hashCode, with the effect that the same hashCode value is unnecessarily,
> but harmlessly, recalculated.

Vijay, Tim, you are not 100% correct here.

1. there should be an extra restriction that the part "calculate hash"
   does not touch the hashCode field. Without that restriction more
   trivial races can happen as discussed in LANG-481.

So we should assume this code:

if (this.hashCode == 0) {
  int hash;
  if (this.hashCode == 0) {
    // Calculate using 'hash' only, not this.hashCode.
    this.hashCode = hash;
  }
  return this.hashCode;
}

where 'this.*' is always a memory reference while 'hash' is a thread private var.

2. DRLVM JIT indeed does not privatize field access to threads. It
   always loads fields from memory in original order. Hence this
   potential bug does not affect DRLVM .. now. But potentially it can
   start optimizing things this way because current behavior prevents
   a bunch of high level optimizations.

3. Jeremy Manson, being an expert in Java Memory Model claims [1] that
   such reordering is allowed theoretically. I.e. like this:

int hash = this.hashCode;
if (this.hashCode == 0) {
  hash = this.hashCode = // calculate hash
}
return hash;

This is a correct single-threaded code. What happened here is a
lengthy thread privatization of this.hashCode (again, does not happen
in DRLVM). That is incorrect in multithreaded environment and needs
extra synchronization/volatile/etc.

4. I do not know why a JIT would want to do that, i.e. two sequential
   reads from the same memory location. Sounds like a bit synthetic example.


[1] http://jeremymanson.blogspot.com/2008/12/benign-data-races-in-java.html

-- 
Egor Pasko


Mime
View raw message