harmony-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jimmy,Jing Lv" <firep...@gmail.com>
Subject Re: [classlib] Some further study on hashcode and HashMap(Was Re: Performance improvement of java.util.HashMap)
Date Wed, 04 Jul 2007 08:14:48 GMT
Hi XiaoFeng,

     Ah, I may be explain myself very well here :) The bit shifting
algorism is NOT used for randomization itself, but for trying to
gather all infomation of 32 bits into log(2,buckectsize) bits, if the
hashcode itself is not random enough, the indexing algorism will
surely make a bad index. So I've made two hypothesis above. The more
random the hashcode is, the better the shifting-algorism works. On the
other hand, as the lower bits are important, if the last bit is
wasted, it does harm to this shifting-algorism.
     At the first I only check IBMVME, finding it always ends with a
even number, which means the the last bit of default hashcode of an
object is always '0', as a result the last bit is useless for hashing.
And by checking DRLVM, I find it similar,  ends with 2 bits as '00',
so we waste 2 bits here.
     Hashcode is used for hash, as a result,  I believe obj_addr >> 2
will improve hashcode randomness at a very low cost. I was thinking if
we take it in the classlib, however we can not tell if the hashcode is
from obj_addr or user define. As a result, we'd better put it in VM.
This cost is worth, for the more random the hoshcode is, the better
hash-related classes performs (That's the reason why we discuss
HashMap indexing algorism, this shifting algorism will properly
consume more time than using hashcode alone in this single method,
however it do improve a lot in the whole)
     Will it do some harm to other part of vm/classlib if we modify
IdentityHashCode in VM (or kernel class java.lang.Object)?  If not, I
guess we'd better modify IdentityHashCode to improve performance.

2007/7/4, Xiao-Feng Li <xiaofeng.li@gmail.com>:
> On 7/4/07, Jimmy,Jing Lv <firepure@gmail.com> wrote:
> > Hi All,
> >
> >
> >     To sum up, 1) I believe Sergey's idea do helps a lot in
> > hash-related classes, which requires VM guys to do some modification
> > in IdentityHashCode (a simple guess, object address >> 2 ?)
> Jimmy, do you suggest to use obj_addr >> 2 or a more complicated
> hashing in VM? I am not sure if this >>2 really helps if the bucket
> indexing does bit-shift or more randomization itself.
> (Note if we use a formula for hashing, it will be computed every time
> it's accessed, we need consider the overhead of the computation. >>2
> is very cheap, but as said above, I doubt about its effectiveness.)
> Thanks,
> xiaofeng
> >
> --
> http://xiao-feng.blogspot.com


Best Regards!

Jimmy, Jing Lv
China Software Development Lab, IBM

View raw message