Hi there,
I had instrumented the current implementation of j.u.IdentityHashMap
and ran it through ThreadLocalBench and SerialBench. Surprisingly, the
collision rate there (frequency of sutiations when objects have
identical identityHashCode) is rather high - 8 collisions per one
get() on average with load factor of 0.75.
That mean identityHashCode does not disperse elements well and the
map's performance is subject to clustering. Moreover, the collision
chains are sometimes so big that they're covering a half of storage
array and IHM looses access performance degrading to linear search.
And also the performance of essentially the same IHMs (in terms of
contents) may vary because of different clustering layouts.
That's essentially the disadvantage of open-addressed hash tables and
there are known ways to overwhelm this problem [1]. But before
actually researching on this topic I need to answer some
"java-spec-legal" questions:
1. Is it required for IdentityHashMap to use open addressing?
2. Is it required for IdentityHashMap to use linear probing as
collision resolution scheme?
3. What's the reason of having the separate implementation of
IdentityHashMap while it can be implemented with overriding of
"equality" operation in ordinary HashMap?
4. Can identityHashCode be salted with some additional hash to break
clustering?
I'm actually interested in performance optimization of IHM, so I will
appreciate anyone comments on this topic.
Thanks,
Aleksey.
[1] http://en.wikipedia.org/wiki/Hash_table