Properly our function is not good enough, perhaps we may find some
operater other than '' and '^', e.g, '+', '' or something? Though I
guess '' and '^' are the fastest.
2007/6/29, Yuri Dolgov <dolgov.g.yuri@gmail.com>:
> Hi Jimmy,
>
> I've done couple of experiments with different hash functions (including all
> the functions proposed in this mail thread), but none of them worked
> satisfactorily for hashmap with small number of elemts (3264) when I've
> changed only upper bits. Do you have any progress so far?
>
>
> Thanks,
> Yuri
>
> On 6/28/07, Jimmy,Jing Lv <firepure@gmail.com> wrote:
> >
> > 2007/6/28, Tim Ellison <t.p.ellison@gmail.com>:
> > > Jimmy,Jing Lv wrote:
> > > > 2007/6/28, Tim Ellison <t.p.ellison@gmail.com>:
> > > >> Jimmy,Jing Lv wrote:
> > > >> > One thought is rightshift the hashcode and combine bits together
> > (by
> > > >> > AND or OR or Exclusive Or operation) into low order(e.g, (a &
(a >>
> > > >> > 8) ^ (a >> 16)  (a >> 24)) & (bucketSize1),
a small test on it
> > > >> > shows that it does distributed more randomly, but it may be better
> > > >> > algorithms.
> > > >>
> > > >> Why so complex bit shifting? How did you settle on this?
> > > >>
> > > >> If you are optimizing for the identity hashcode case then you can
> > > >> probably ignore the bottom 'y' bits due to object alignment (and top
> > 'z'
> > > >> bits for address range limits), so just do (a >> 8) & (bucketSize

> > 1)
> > > >> assuming the objects are 8bit aligned.
> > > >
> > > > It sounds interesting and reasonable :)
> > > > But I'm afraid not all hashcode() returns the object's memory
> > > > address,
> > >
> > > Understood. The hashCode() can return any int value, so in principle
> > > you have a random set of 32 bits an no bit manipulation is required.
> > >
> > > However, in practice I think we can optimize for typical hashcode
> > > values. The debate is what is 'typical'? What is the most common type
> > > of HashMap key? (then we can look at it's hashCode implementation).
> > >
> >
> > Ah yes I hope someone may know what is 'typical' or common hashcode? :)
> > But i'm afraid everyone who program in java may define his own
> > hashcode. After all, the basic principle of hashcode is to make it as
> > dispersive as possible, that's why I think we may use all 32 bits to
> > find the bucket.
> >
> > > > in some other cases, the difference of hashcodes may remain
> > > > in buttom 'y' bits or top 'z' bits, at that time our HashMap will
> > > > perform very slow.
> > >
> > > You are right, I should not have thrown away those top bits. So how
> > > about a *rotate* right by eight bits operation?
> > >
> > > > So the best way is to use all 32 bit of a hashcode, and take extra
> > > > operator '' and '^' does not cost much time.( though I'm not sure if
> > > > the the rightshift paramter '8' is proper)
> > >
> > > The operation you suggested was
> > > a & (a >> 8) ^ (a >> 16)  (a >> 24)
> > >
> > > can you please explain how you came up with this?
> > >
> >
> > If we make it small, like 4, we have to spend much more time like:
> > a & (a >> 4) ^ (a >> 8)  (a >> 12)^(a >> 16) ^ (a >>
20)  (a >> 24)...
> > and if make it bigger, we may still lose the information after the
> > operation &(2^n1) if the parameter is a bit bigger than n, so I'm try
> > to find a balance here, though the parameter '8' may be not best.
> > As you know, the basic bucket of HashMap is 16( that is 2^4), and I
> > have no idea what is the most commen size of the hashMap bucket? Or we
> > may still use the size of the bucket as the parameter? As a result:
> > a & (a >> n) ^ (a >> n*2)  (a >> n*3), here bucket size is
2^n
> >
> > I have no idea what is better, I'll try to benckmark and find some
> > hash theory if any. Thanks to Yuri, I take a glance of two papers he
> > offered, though one is copyrighted, the other offered many classic
> > hash algorithm, I may study and find out a best way by myself (of
> > cource, avoid copying the code :) )
> >
> > > Regards,
> > > Tim
> > >
> >
> >
> > 
> >
> > Best Regards!
> >
> > Jimmy, Jing Lv
> > China Software Development Lab, IBM
> >
>

Best Regards!
Jimmy, Jing Lv
China Software Development Lab, IBM
