harmony-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jimmy,Jing Lv" <firep...@gmail.com>
Subject Re: [classlib] Some further study on hashcode and HashMap(Was Re: Performance improvement of java.util.HashMap)
Date Wed, 04 Jul 2007 08:59:31 GMT
2007/7/4, Xiao-Feng Li <xiaofeng.li@gmail.com>:
> On 7/4/07, Jimmy,Jing Lv <firepure@gmail.com> wrote:
> > Hi XiaoFeng,
> >
> >      Ah, I may be explain myself very well here :) The bit shifting

Oops! I mean I may 'NOT' (instead of 'be') explain myself in the first mail :)

> > algorism is NOT used for randomization itself, but for trying to
> > gather all infomation of 32 bits into log(2,buckectsize) bits, if the
> > hashcode itself is not random enough, the indexing algorism will
> > surely make a bad index. So I've made two hypothesis above. The more
> > random the hashcode is, the better the shifting-algorism works. On the
> > other hand, as the lower bits are important, if the last bit is
> > wasted, it does harm to this shifting-algorism.
> >      At the first I only check IBMVME, finding it always ends with a
> > even number, which means the the last bit of default hashcode of an
> > object is always '0', as a result the last bit is useless for hashing.
> > And by checking DRLVM, I find it similar,  ends with 2 bits as '00',
> > so we waste 2 bits here.
> >      Hashcode is used for hash, as a result,  I believe obj_addr >> 2
> > will improve hashcode randomness at a very low cost. I was thinking if
> > we take it in the classlib, however we can not tell if the hashcode is
> > from obj_addr or user define. As a result, we'd better put it in VM.
> > This cost is worth, for the more random the hoshcode is, the better
> > hash-related classes performs (That's the reason why we discuss
> > HashMap indexing algorism, this shifting algorism will properly
> > consume more time than using hashcode alone in this single method,
> > however it do improve a lot in the whole)
> >      Will it do some harm to other part of vm/classlib if we modify
> > IdentityHashCode in VM (or kernel class java.lang.Object)?  If not, I
> > guess we'd better modify IdentityHashCode to improve performance.
>
> Thanks for the explanation. No, I don't think this >>2 costs anything
> if implemented in VM. Probably you are right the >>2 helps to remove
> some useless info.
>
> Well, maybe I should read some random number theory, since I think
> although >>2  removes the meaningless bits in LSB, it introduces two
> meaningless bits in MSB. If you treat the 32 bit equally when encoding
> their info into log(num_bucket) bits, the bits in LSB and MSB give you
> the same effect. Or you assume the lower bits are more significant by
> default?
>

Yes,  the second hypothesis I made here is that the lower bits are
more important than higher bits, we focus on lower bits more than
higher bits (though not throw higher bits away) however it is
imposible for them to be equal in indexing, especially in small
buckets. We choose lower bits for most of time hashcodes alter in
lower bits. And move meaningless bits to MSB here fits this algorithm
:)

> Thanks,
> xiaofeng
>
>
> > 2007/7/4, Xiao-Feng Li <xiaofeng.li@gmail.com>:
> > > On 7/4/07, Jimmy,Jing Lv <firepure@gmail.com> wrote:
> > > > Hi All,
> > > >
> > <SNIP>
> > > >
> > > >     To sum up, 1) I believe Sergey's idea do helps a lot in
> > > > hash-related classes, which requires VM guys to do some modification
> > > > in IdentityHashCode (a simple guess, object address >> 2 ?)
> > >
> > > Jimmy, do you suggest to use obj_addr >> 2 or a more complicated
> > > hashing in VM? I am not sure if this >>2 really helps if the bucket
> > > indexing does bit-shift or more randomization itself.
> > >
> > > (Note if we use a formula for hashing, it will be computed every time
> > > it's accessed, we need consider the overhead of the computation. >>2
> > > is very cheap, but as said above, I doubt about its effectiveness.)
> > >
> > > Thanks,
> > > xiaofeng
> > >
> > <SNIP>
> > > >
> > >
> > >
> > > --
> > > http://xiao-feng.blogspot.com
> > >
> >
> >
> > --
> >
> > Best Regards!
> >
> > Jimmy, Jing Lv
> > China Software Development Lab, IBM
> >
>
>
> --
> http://xiao-feng.blogspot.com
>


-- 

Best Regards!

Jimmy, Jing Lv
China Software Development Lab, IBM

Mime
View raw message