harmony-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bob Lee" <crazy...@crazybob.org>
Subject Re: My first submission! java.lang.ThreadLocal
Date Fri, 06 Jun 2008 18:36:15 GMT
On Fri, Jun 6, 2008 at 10:09 AM, Aleksey Shipilev <
aleksey.shipilev@gmail.com> wrote:

> I'm not saying that your implementation is bad. On contrary, your
> implementation is much faster than original Harmony one. What I'm
> saying is, ThreadLocalBench is still 3x slower on Harmony/DRLVM than
> on Sun 1.6.0_05. That might be connected with poor codegeneration on
> Harmony JIT, some disabled optimizations there, etc. I don't know what
> the real cause is. That's the question to answer.

It sounds like you're comparing DRLVM's JIT compiler to Sun's. I'm not at
all surprised by the outcome--Sun has quite a head start.

> Leaving Harmony/DRLVM aside, it's wonderful that your implementation
> is going neck-to-neck with Sun's on another VM. So, why don't to
> experiment and beat them? :)

Sun's impl was finely tuned by Doug Lea, BTW. I do beat the RI in 3 tests,
they beat me in 2. When I say one impl beats another, we're really talking
about a few ns in either direction; there's really no point in comparing. I
think I've prioritized the correct use cases. I'm a little slower in tests
which instantiate and throw away millions of thread local variables. I don't
think that's the norm.

Besides raw throughput, I also beat the RI in a few more areas: more
agressive clean up, smaller memory footprint, and fewer allocations and
garbage collections.

I do plan to contribute it to the OpenJDK.

> From this point-of-view, I raised the
> question on rationale of having open-addressed + linear-probed hashmap
> instead of, say, chaining. Taking away caching opportunities, Knuth
> had analyzed the average lookup count for different probing schemes,
> and linear probing performed the baddest.

That doesn't apply here. Knuth used English words for keys. We generate well
dispersed hash codes. In practice, this implementation almost never actually
probes. When it does probe, it probes the same area of memory.

> Actually, we had recently
> moved IdentityHashMap from the exactly the same addressing to chaining
> [1], and results are pretty good. There's additional +80% on
> ThreadLocalBench and I expect we can get them on your implementation
> too!

In general, you should favor open addressing over chaining: less
indirection, fewer allocations, smaller memory footprint, better memory
locality (if you use linear probing). My guess is that IdentityHashMap just
had a flawed impl of open addressing--it's very easy to get something wrong.

I could have used quadratic probing, double hashing, or cuckoo hashing, but
all of those are aimed at avoiding clustering. I generate well dispersed
hash codes in the first place, so they would have just added more complexity
and reduced memory locality with no benefit.

Obviously, if you're using externally provided hash codes, you have to use
chaining, but that's not the case here.


  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message