harmony-dev mailing list archives

From "Xiao-Feng Li" <xiaofeng...@gmail.com>
Subject Re: [classlib] Some further study on hashcode and HashMap(Was Re: Performance improvement of java.util.HashMap)
Date Wed, 04 Jul 2007 06:48:05 GMT
On 7/4/07, Jimmy,Jing Lv <firepure@gmail.com> wrote:
> Hi All,
>
>      I've done some further study on hashcode and HashMap. I find it
> very interesting that the RI may have done something to
> IdentityHashCode, just as Sergey said. According to Tim's
> micro-benchmark on HARMONY-4064[1], Harmony classlib + IBM VME uses
> only 4 buckets, each containing 250 elements, while the RI uses all 16
> buckets, each containing 62-64 elements. What's more, if we print the
> RI's objects' hashcodes, they end in all kinds of digits (e.g.
> 5b8827, 147c1db ...), while Harmony + IBM VME returns what looks like
> the exact address (not sure), always ending in an even digit (e.g.
> 77387738, 77787778 ...). I haven't tried DRLVM yet (perhaps it
> behaves like IBM VME?).
>     So if we slightly modify the bucket index calculation in Tim's
> test, e.g.
>     int bucket = hashCode >> 1 & (numBuckets - 1);
>     all buckets are used, though some of them get far more elements
> than others. And if we modify it as:
>      int bucket = hashCode >> 2 & (numBuckets - 1);
>     the result looks just like the RI's: every bucket gets nearly the
> same number of elements.
>     If we apply this (hashCode >> 2 & (numBuckets - 1)) to the HashMap
> bucket index algorithm and run a micro-benchmark of put/get/remove
> 10000 times[2], it improves by about 7% (~28s to ~26s on my desktop).
>     But this is probably not a good idea for HashMap, because some
> user-defined hashcodes will work very badly with this algorithm. Some
> very famous benchmarks, like Specjbb, immediately lose 10~15% in
> score. So it may not be a suitable algorithm for the classlib bucket
> index calculation.
>     So we still need an algorithm like the one we discussed before,
> though finding one is not easy at all. If we agree on using
> &(bucketsize - 1) (because it is much faster than % prime), we have
> to find a way to fold all 32 bits of information into
> log2(bucketsize) bits. The operators '+', '|', '^', '>>' can serve
> this purpose (they are fast themselves). However, bucketsize is
> usually too small (initially 16, i.e. 4 bits) to preserve all 32 bits
> of difference, so we have to lose some information after all.
>     Let's first make two assumptions: 1) hashcodes (including
> user-defined ones) are themselves evenly distributed; 2) the lower
> bits matter more than the higher bits (they cannot all matter
> equally, especially with small bucket counts).
>     Under these two assumptions we can focus on the lower bits. As
> for Yuri's concern that changes in the upper bits may not affect the
> bucket index: that is acceptable (and probably what happens in most
> cases). Also, the index computation cannot be too complex; 4 or 5
> operators should be enough to spread hashcodes more evenly across
> the buckets while keeping the computation cheap.
>     After reading some papers and hash algorithms (though most of
> them are too complex and time-consuming for us), I currently have
> this expression:
>     index = hashcode ? hashcode >> a ? hashcode >> b ( ? hashcode >> c)
>     where "?" stands for one of the operators '^', '+', '|', '&',
> and a, b, c are numbers less than 32 with a < b < c (I don't know
> whether the last operator and c can be omitted).
>     Our task is to find the best a, b, c and the operators between
> them to improve hashing performance. Currently the best expressions
> I've found are:
>     (hash ^ hash >> 9) + hash >> 23
> or
>     (hash ^ hash >> 9) + (hash >> 15 | hash >> 23)
> Both work quite well on my micro-benchmark and improve Specjbb a
> little (at least no worse than using "hash" alone :) ).
>     But I still believe it can be improved a lot.
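
As a runnable sketch of the second candidate above (the parenthesization is my reading of the expression under Java precedence, and I use the unsigned shift '>>>' to avoid sign-extension on negative hashcodes; both are assumptions, not necessarily what was measured):

```java
// My reading of the candidate: (hash ^ hash >> 9) + (hash >> 15 | hash >> 23),
// with explicit parentheses and unsigned '>>>' (an assumption on my part,
// to avoid sign-extension for negative hashcodes).
public class SpreadSketch {
    static int spread(int h) {
        return (h ^ (h >>> 9)) + ((h >>> 15) | (h >>> 23));
    }

    // numBuckets must be a power of two for the '&' trick to work.
    static int bucket(int h, int numBuckets) {
        return spread(h) & (numBuckets - 1);
    }

    public static void main(String[] args) {
        int numBuckets = 16;
        int[] distrib = new int[numBuckets];
        // Simulate 8-byte-aligned "addresses", the worst case for a plain mask.
        for (int i = 0; i < 1000; i++) {
            distrib[bucket(0x77380000 + i * 8, numBuckets)]++;
        }
        for (int i = 0; i < numBuckets; i++) {
            System.out.println(i + "\t" + distrib[i]);
        }
    }
}
```

With a plain `h & (numBuckets - 1)` these aligned inputs would land in only two buckets; the spread function scatters them across many more.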
>
>     To sum up: 1) I believe Sergey's idea does help a lot in
> hash-related classes, but it requires the VM folks to modify
> IdentityHashCode (a simple guess: object address >> 2?)

Jimmy, do you suggest using obj_addr >> 2, or more complicated
hashing, in the VM? I am not sure this >> 2 really helps if the bucket
indexing does bit-shifting or further randomization itself.

(Note that if we use a formula for hashing, it will be computed every
time the hashcode is accessed, so we need to consider the overhead of
the computation. >> 2 is very cheap, but as said above, I have doubts
about its effectiveness.)
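
For illustration, here is a toy model of that doubt (hypothetical aligned addresses, not real VM data): with 8-byte alignment, a bare >> 2 still leaves one dead bit.

```java
// Toy model: identity hashcodes that are 8-byte-aligned "addresses".
// A plain mask wastes the dead low bits; a VM-side >> 2 recovers only
// part of them.
public class ShiftSketch {
    static int usedBuckets(int shift) {
        int numBuckets = 16;
        boolean[] used = new boolean[numBuckets];
        for (int i = 0; i < 1000; i++) {
            int addr = 0x10000 + i * 8;               // 8-byte aligned
            used[(addr >> shift) & (numBuckets - 1)] = true;
        }
        int n = 0;
        for (boolean b : used) if (b) n++;
        return n;
    }

    public static void main(String[] args) {
        System.out.println("no shift: " + usedBuckets(0));   // 2 of 16 buckets
        System.out.println(">> 2:     " + usedBuckets(2));   // 8 of 16 buckets
        System.out.println(">> 3:     " + usedBuckets(3));   // all 16 buckets
    }
}
```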

Thanks,
xiaofeng

>      2) for user-defined hashcodes and HashMap itself, based on
> &(bucketsize - 1), we can find good parameters for the expression I
> listed above, or a similar expression, so that elements are
> distributed more evenly across the buckets.
>
> [1].
>         public void testDistribute() {
>             int numBuckets = 16;
>
>             int[] distrib = new int[numBuckets];
>             for (int i = 0; i < distrib.length; i++) {
>                 distrib[i] = 0;
>             }
>
>             for (int i = 0; i < 1000; i++) {
>                 int hashCode = new Object().hashCode();
>                 int bucket = hashCode & (numBuckets - 1);
>                 distrib[bucket] = distrib[bucket] + 1;
>                 System.out.println(Integer.toBinaryString(hashCode));
>             }
>
>             System.out.println("Bucket\tList_length");
>             for (int i = 0; i < distrib.length; i++) {
>                 System.out.println(i + "\t" + distrib[i]);
>             }
>         }
> [2]
>          public void testHashMap() throws Exception {
>             HashMap map = new HashMap();
>             Object[] objs = new Object[10000];
>             for (int i = 0; i < objs.length; i++) {
>                 objs[i] = new Object();
>             }
>             long start = System.currentTimeMillis();
>             for (int j = 0; j < 1000; j++) {
>                 for (int i = 0; i < objs.length; i++) {
>                     map.put(objs[i], i);
>                 }
>                 for (int i = 0; i < objs.length; i++) {
>                     map.get(objs[i]);
>                 }
>                 for (int i = 0; i < objs.length; i++) {
>                     map.remove(objs[i]);
>                 }
>             }
>             System.out.println(System.currentTimeMillis() - start);
>         }
>
> 2007/6/29, Sergey Kuksenko <sergey.kuksenko@gmail.com>:
> > Hi All,
> >
> > I wish to join the discussion and express some of my thoughts in
> > this area.
> > I absolutely agree with the idea of using the bitwise & (power-of-two
> > remainder) instead of % (arbitrary remainder), because the
> > power-of-two remainder is really faster, as my tests and others have
> > shown.
> >
> > First of all, I want to note that when HashMap works with a
> > user-defined hashCode, it doesn't matter which remainder is used:
> > power of two (&) or prime/arbitrary (%). With a user-defined
> > hashCode we can't predict the hash, and the hash may suit either
> > & or %.
> > But the prime-number remainder has another disadvantage (besides
> > operation speed). When HashMap uses a prime (or otherwise irregular)
> > size, we get the negative effect of partially unused memory pages.
> > Memory page sizes are powers of two, and it is better to fill them
> > completely; this is especially visible on large hash maps.
> > Unfortunately I can't provide any numbers at the moment.
> > Thus, in my opinion, the power-of-two remainder gives a boost in
> > operation time plus benefits from more natural memory usage
> > (layouts etc.)
> >
> > As for the default hashCode, which is IdentityHashCode, there are
> > two ways here:
> > 1. Add a new hashing function (supplemental hashing) inside HashMap,
> > as was suggested here.
> > 2. Change the IdentityHashCode implementation. There is no rule that
> > IdentityHashCode must return exactly the address of the object, and
> > the trailing zero bits of IdentityHashCode (caused by object
> > alignment) can be removed inside IdentityHashCode. In other words:
> > add the supplemental hashing inside IdentityHashCode itself.
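
A sketch of what variant 2 could look like (purely hypothetical Java model; a real VM would do this natively on the object address, and the constants here are made up):

```java
// Hypothetical model of variant 2: the VM derives the identity hashcode
// from the object address, but strips the alignment zeros and folds the
// high bits down itself, so every consumer sees a well-formed hashcode.
public class IdentityHashSketch {
    static final int ALIGN_BITS = 3;          // assuming 8-byte alignment

    static int identityHash(long addr) {
        int h = (int) (addr >>> ALIGN_BITS);  // drop the always-zero bits
        return h ^ (h >>> 16);                // fold high bits into low bits
    }

    public static void main(String[] args) {
        // Aligned addresses no longer map only to hashcodes ending in zeros.
        for (long addr = 0x77380000L; addr < 0x77380040L; addr += 8) {
            System.out.println(Long.toHexString(addr) + " -> "
                    + Integer.toHexString(identityHash(addr)));
        }
    }
}
```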
> >
> > The first solution has one advantage: it can be done entirely in
> > classlib. The second variant has to be implemented in every VM
> > based on Harmony's classlib. But it has other advantages: a
> > well-formed IdentityHashCode helps not only HashMap but also
> > Hashtable, IdentityHashMap, and moreover all hashing implemented
> > by users.
> > In the second variant we also avoid adding a rehashing function to
> > HashMap. It is common practice for advanced Java developers to tune
> > their hashCode functions; if we add a rehashing function to HashMap
> > we may break that tuning, because without knowledge of the internal
> > rehashing function it is impossible to tune a hashCode function.
> > In my opinion this last argument is decisive: we should avoid
> > adding rehashing to HashMap and move the rehashing into the VM.
> >
> >
> >
> >
> > On 6/27/07, Jimmy,Jing Lv <firepure@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > >    As we currently focus on performance, I found this JIRA
> > > (HARMONY-4064) and did some investigation on it.
> > >
> > >    The performance improvement of java.util.HashMap has two
> > > aspects: one is a hash algorithm that makes hashcodes randomly
> > > distributed enough, as in Tim's fix/comments on HARMONY-4064; the
> > > other is making the hash algorithm itself fast enough. HashMap is
> > > so widely used that both aspects are very important.
> > >
> > >    I strongly agree with Tim that a hash algorithm producing a
> > > random distribution is very important, since HashMap.put(), get(),
> > > etc. depend on it and perform better when entries are spread
> > > randomly. On the other hand, the hash algorithm must itself be
> > > very fast, because every HashMap.put and get call invokes it; if
> > > it is slow, its benefit in distribution becomes meaningless.
> > >
> > >    I did a lot of benchmarks and micro-benchmarks; they show that
> > > (a & b) is much faster than (a % b), and for nonnegative a,
> > > a % Math.pow(2,n) gives the same result as a & (Math.pow(2,n)-1).
> > > That's why we choose "& (Math.pow(2,n)-1)" (here Math.pow(2,n) is
> > > the bucket count of a HashMap) to replace "%"; this alone improves
> > > performance by at least 2%~3% in some benchmarks. It shows that if
> > > we avoid slow operations like mod (%), multiply (*), divide (/)
> > > and so forth, we get a performance improvement.
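
A quick check of that equivalence (note it only holds for nonnegative values; Java's % keeps the sign of the dividend, which matters for hashcodes that can be negative):

```java
// For nonnegative a and a power-of-two n: a % n == a & (n - 1).
// The equivalence breaks for negative a, because Java's '%' keeps the
// sign of the dividend while '&' always maps into [0, n - 1].
public class ModMaskCheck {
    public static void main(String[] args) {
        int n = 16;
        for (int a = 0; a < 1000; a++) {
            if (a % n != (a & (n - 1))) {
                throw new AssertionError("mismatch at " + a);
            }
        }
        System.out.println(-5 % 16);   // -5: '%' keeps the sign
        System.out.println(-5 & 15);   // 11: '&' maps into [0, 15]
    }
}
```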
> > >
> > >    Yes, I do think distributing the hashcode randomly is very
> > > important, and I suggest we use mechanisms other than mod-prime
> > > to achieve it. One idea is to right-shift the hashcode and combine
> > > the bits into the low-order bits (by AND, OR or XOR operations),
> > > e.g. (a & (a >> 8) ^ (a >> 16) | (a >> 24)) & (bucketSize - 1).
> > > A small test shows that this does distribute more randomly, but
> > > there may be better algorithms. In this way the hashcode can be
> > > more randomly distributed while the computation stays fast,
> > > keeping a balance between a fast and a good hash algorithm.
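
Here is one reading of that combined expression under Java precedence (the explicit parenthesization is my assumption about the intent):

```java
import java.util.Random;

// The expression  a & (a >> 8) ^ (a >> 16) | (a >> 24)
// parses as ((a & (a >> 8)) ^ (a >> 16)) | (a >> 24), since '&' binds
// tighter than '^', which binds tighter than '|'. How well it mixes
// depends on the inputs; the trailing '|' can bias indices upward when
// the high bytes are similar across keys.
public class CombineSketch {
    static int index(int a, int bucketSize) {
        return (((a & (a >> 8)) ^ (a >> 16)) | (a >> 24)) & (bucketSize - 1);
    }

    public static void main(String[] args) {
        int bucketSize = 16;
        int[] distrib = new int[bucketSize];
        Random ran = new Random(42);              // fixed seed, repeatable
        for (int i = 0; i < 1000; i++) {
            distrib[index(ran.nextInt(), bucketSize)]++;
        }
        System.out.println("Bucket\tList_length");
        for (int i = 0; i < bucketSize; i++) {
            System.out.println(i + "\t" + distrib[i]);
        }
    }
}
```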
> > >
> > > Any comments/suggestions? Thanks!
> > >
> > >
> > > 2007/6/6, Tim Ellison (JIRA) <jira@apache.org>:
> > > >
> > > >     [ https://issues.apache.org/jira/browse/HARMONY-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501905
> > > ]
> > > >
> > > > Tim Ellison commented on HARMONY-4064:
> > > > --------------------------------------
> > > >
> > > > Robert.  Your patch undoes the change I made to make the length
> > > > of the elements array a prime number, so that it is now always a
> > > > power of two.  The problem is that identity hash codes are not
> > > > randomly distributed; they typically reflect object address
> > > > alignments, which are factors of two, so we don't get a good
> > > > distribution across all the possible buckets.
> > > >
> > > > You can see the effect in this simple 'simulation':
> > > >
> > > > public class HashingTest {
> > > >
> > > >     public static void main(String[] args) {
> > > >         int numBuckets = 16;
> > > >
> > > >         int[] distrib = new int[numBuckets];
> > > >         for (int i = 0; i < distrib.length; i++) {
> > > >             distrib[i] = 0;
> > > >         }
> > > >
> > > >         for (int i = 0; i < 1000; i++) {
> > > >             int hashCode = new Object().hashCode();
> > > >             int bucket = hashCode & (numBuckets - 1);
> > > >             distrib[bucket] = distrib[bucket] + 1;
> > > >         }
> > > >
> > > >         System.out.println("Bucket\tList_length");
> > > >         for (int i = 0; i < distrib.length; i++) {
> > > >             System.out.println(i + "\t" + distrib[i]);
> > > >         }
> > > >     }
> > > > }
> > > >
> > > > I suggest that we either ensure the elements array length is
> > > > chosen more carefully, or modify the hashCodes to avoid these
> > > > collisions.
> > > >
> > > > > [classlib][luni] Performance improvement of java.util.HashMap
> > > > > -------------------------------------------------------------
> > > > >
> > > > >                 Key: HARMONY-4064
> > > > >                 URL:
> > > https://issues.apache.org/jira/browse/HARMONY-4064
> > > > >             Project: Harmony
> > > > >          Issue Type: Improvement
> > > > >          Components: Classlib
> > > > >            Reporter: Robert Hu
> > > > >            Assignee: Tim Ellison
> > > > >         Attachments: HARMONY-4064.diff
> > > > >
> > > > >
> > > > > The performance of HashMap can be improved by improving its
> > > > > hash method.
> > > > > By running the following test code, our HashMap improves from
> > > > > 110% of the RI's time cost to 90% of the RI's time cost.
> > > > > public class TestHashMap {
> > > > >     public static void main(String[] args) {
> > > > >         Random ran = new Random();
> > > > >         HashMap map = new HashMap();
> > > > >         int elementNum = 500;
> > > > >         int times = 10000;
> > > > >         int[] rans = new int[elementNum];
> > > > >         for (int i = 0; i < elementNum; i++)
> > > > >             rans[i] = ran.nextInt(Integer.MAX_VALUE );
> > > > >         long start = System.currentTimeMillis();
> > > > >         for (int i = 0; i < elementNum; i++)
> > > > >             map.put(rans[i], "b");
> > > > >         System.out.println(System.currentTimeMillis() - start);
> > > > >         start = System.currentTimeMillis();
> > > > >         for (int i = 0; i < elementNum; i++)
> > > > >             for(int j = 0; j< times; j++){
> > > > >                 map.get(rans[i]);
> > > > >             }
> > > > >         System.out.println(System.currentTimeMillis() - start);
> > > > >     }
> > > > > }
> > > >
> > > > --
> > > > This message is automatically generated by JIRA.
> > > > -
> > > > You can reply to this email to add a comment to the issue online.
> > > >
> > > >
> > >
> > >
> > > --
> > >
> > > Best Regards!
> > >
> > > Jimmy, Jing Lv
> > > China Software Development Lab, IBM
> > >
> >
> >
> >
> > --
> > Best regards,
> > ---
> > Sergey Kuksenko.
> > Intel Enterprise Solutions Software Division.
> >
>
>
> --
>
> Best Regards!
>
> Jimmy, Jing Lv
> China Software Development Lab, IBM
>


-- 
http://xiao-feng.blogspot.com
