harmony-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alexey Varlamov" <alexey.v.varla...@gmail.com>
Subject Re: [classlib][luni][performance] IdentityHashMap implementation
Date Mon, 21 Apr 2008 04:50:19 GMT
2008/4/19, Aleksey Shipilev <aleksey.shipilev@gmail.com>:
> Good point, Nathan! It was DRLVM, but I'll try on IBM VME too.
>
> Anyway, I had incorporated this option in (4) question - can we just
> don't trust VM's System.identityHashCode and dope the hashcode with
> our own salt?

IMO this is mere impl detail and would not not violate specification,
so feel free to experiment. Below are most relevant parts of the spec:

"This class provides all of the optional map operations, and permits
null values and the null key. This class makes no guarantees as to the
order of the map; in particular, it does not guarantee that the order
will remain constant over time.

This class provides constant-time performance for the basic operations
(get and put), assuming the system identity hash function
(System.identityHashCode(Object)) disperses elements properly among
the buckets.

Implementation note: This is a simple linear-probe hash table, as
described for example in texts by Sedgewick and Knuth. The array
alternates holding keys and values. (This has better locality for
large tables than does using separate arrays.) For many JRE
implementations and operation mixes, this class will yield better
performance than HashMap (which uses chaining rather than
linear-probing)."

--
Alexey

>
>
> On Fri, Apr 18, 2008 at 11:10 PM, Nathan Beyer <nbeyer@gmail.com> wrote:
> > Did you run these tests with IBM's VME or DRLVM? IdendityHashMap's
> >  distribution of hash code is solely based on System.identityHashCode's
> >  results. We need to determine if the VM is doing the best thing first.
> >
> >  -Nathan
> >
> >  On Fri, Apr 18, 2008 at 1:53 PM, Aleksey Shipilev <
> >
> > aleksey.shipilev@gmail.com> wrote:
> >
> >
> >
> > > Hi there,
> >  >
> >  > I had instrumented the current implementation of j.u.IdentityHashMap
> >  > and ran it through ThreadLocalBench and SerialBench. Surprisingly, the
> >  > collision rate there (frequency of sutiations when objects have
> >  > identical identityHashCode) is rather high - 8 collisions per one
> >  > get() on average with load factor of 0.75.
> >  >
> >  > That mean identityHashCode does not disperse elements well and the
> >  > map's performance is subject to clustering. Moreover, the collision
> >  > chains are sometimes so big that they're covering a half of storage
> >  > array and IHM looses access performance degrading to linear search.
> >  > And also the performance of essentially the same IHMs (in terms of
> >  > contents) may vary because of different clustering layouts.
> >  >
> >  > That's essentially the disadvantage of open-addressed hash tables and
> >  > there are known ways to overwhelm this problem [1]. But before
> >  > actually researching on this topic I need to answer some
> >  > "java-spec-legal" questions:
> >  >  1. Is it required for IdentityHashMap to use open addressing?
> >  >  2. Is it required for IdentityHashMap to use linear probing as
> >  > collision resolution scheme?
> >  >  3. What's the reason of having the separate implementation of
> >  > IdentityHashMap while it can be implemented with overriding of
> >  > "equality" operation in ordinary HashMap?
> >  >  4. Can identityHashCode be salted with some additional hash to break
> >  > clustering?
> >  >
> >  > I'm actually interested in performance optimization of IHM, so I will
> >  > appreciate anyone comments on this topic.
> >  >
> >  > Thanks,
> >  > Aleksey.
> >  >
> >  > [1] http://en.wikipedia.org/wiki/Hash_table
> >  >
> >
>

Mime
View raw message