harmony-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Nathan Beyer" <ndbe...@apache.org>
Subject Re: [classlib][luni][performance] Improvements in Collections
Date Fri, 18 Apr 2008 19:06:26 GMT
On Fri, Apr 18, 2008 at 1:53 PM, Sergey Salishev <
sergey.i.salishev@gmail.com> wrote:

> Nathan,
>
> The "load factor" does't impact the capacity growth scale. It only governs
> the persentage of the free space reserved inside the current capacity. So
> if
> the initial capacity is X the capacity growth scale would be 2X 4X 8X...
> with all load factor values.


Correct, it determines when the capacity should increase.


>
>
> The changes proposed by Alexey for WeakHashMap are already applied to
> HashMap almost a year ago.


That is not what I understand. The majority of the patch has nothing to do
with the capacity calculation. Additionally, it's not the same; the method
names are different, the algorithm is slightly different, etc. If this
capacity calculation is good for all three HashMaps, then put the exact same
code in all three HashMaps. I'll vote for consistency and clarity
(java.util.HashMap's 'calculateCapacity' method is much clearer).

-Nathan


>
>
> Thanks.
> Sergey.
>
> On Fri, Apr 18, 2008 at 10:41 PM, Nathan Beyer <ndbeyer@apache.org<https://mail.google.com/mail?view=cm&tf=0&to=ndbeyer@apache.org>>
> wrote:
>
> > On Fri, Apr 18, 2008 at 12:57 PM, Sergey Salishev <
> > sergey.i.salishev@gmail.com<https://mail.google.com/mail?view=cm&tf=0&to=sergey.i.salishev@gmail.com>>
> wrote:
> >
> > > Nathan
> > >
> > > I agree with  you. This can be a problem. In my practice the biggest
> > > problems were with 1-2 sized hash maps taking 16 size arrays. Probably
> > the
> > > best place to implement % optimization would bi CPU:)
> > > We can try the different approach. The HashMap growth is governed only
> > by
> > > initial capacity so in the HashMap constructor one can say is 2^k or
> > not.
> >
> >
> > What about 'load factor'? That governs the growth as well.
> >
> > -Nathan
> >
> >
> > >
> > >
> > > We can write something like
> > >
> > > <code>
> > > final boolean powOf2 = isPowOf2(capacity);
> > >
> > > final int mod(int index, int length) {
> > >     return powOf2? index & (length - 1) : index % length;
> > > }
> > > <code>
> > >
> > > In theory the common path for the workload should be specialized by
> JIT.
> > >
> > > Ooops. When looking into HashMap code I've noticed someone already did
> > > this.
> > > https://issues.apache.org/jira/browse/HARMONY-4064
> > > HashMap effectively does the rounding to the neares 2^k.
> > >
> > > Thanks.
> > > Sergey.
> > > On Fri, Apr 18, 2008 at 8:55 PM, Nathan Beyer <ndbeyer@apache.org<https://mail.google.com/mail?view=cm&tf=0&to=ndbeyer@apache.org>
> <
> > https://mail.google.com/mail?view=cm&tf=0&to=ndbeyer@apache.org>>
> > > wrote:
> > >
> > > > I know that people will notice; I have personal experience with
> > systems
> > > > that
> > > > included custom Map implementations were written because HashMaps
> grew
> > > too
> > > > large for small data sets (less than 2000, actually) and wasted a
> lot
> > of
> > > > memory for the unnecessary capacity. Even the use of the capacity
> and
> > > load
> > > > factor didn't provide enough compensation in these cases.
> > > >
> > > > -Nathan
> > > >
> > > > On Fri, Apr 18, 2008 at 11:46 AM, Sergey Salishev <
> > > > sergey.i.salishev@gmail.com<https://mail.google.com/mail?view=cm&tf=0&to=sergey.i.salishev@gmail.com>
> <
> > https://mail.google.com/mail?view=cm&tf=0&to=sergey.i.salishev@gmail.com
> >>
> >  > wrote:
> > > >
> > > > > Nathan,
> > > > >
> > > > > I don't think anyone will notice the hash map size rounding. It
> can
> > > lead
> > > > > to
> > > > > some memory overhead in very rare cases the user creates hash map
> > with
> > > > the
> > > > > exact size. But in the most common case where the map is created
> > with
> > > > > default size the rounding will not change the behavior at all as
> > it's
> > > in
> > > > > agreement with the standard 2x growth policy. On the other hand
> size
> > > > > rounding gives substantial performance boost on all gets.
> > > > >
> > > > > Thanks.
> > > > > Sergey.
> > > > >
> > > > >
> > > > > On Fri, Apr 18, 2008 at 8:13 PM, Nathan Beyer <ndbeyer@apache.org<https://mail.google.com/mail?view=cm&tf=0&to=ndbeyer@apache.org>
> <
> > https://mail.google.com/mail?view=cm&tf=0&to=ndbeyer@apache.org>
> > > <
> > > > https://mail.google.com/mail?view=cm&tf=0&to=ndbeyer@apache.org>>
> > > > > wrote:
> > > > >
> > > > > > https://issues.apache.org/jira/browse/HARMONY-5718
> > > > > >
> > > > > > Again, I don't agree with the capacity rounding in the patch
> > > attached
> > > > to
> > > > > > this issue. I do like the change to the internal data structure;
> > use
> > > > two
> > > > > > arrays for key/value instead a single array. It makes the code
> > > easier
> > > > to
> > > > > > read.
> > > > > >
> > > > > > -Nathan
> > > > > >
> > > > > > On Fri, Apr 18, 2008 at 1:50 AM, Aleksey Shipilev <
> > > > > > aleksey.shipilev@gmail.com<https://mail.google.com/mail?view=cm&tf=0&to=aleksey.shipilev@gmail.com>
> <
> > https://mail.google.com/mail?view=cm&tf=0&to=aleksey.shipilev@gmail.com>
> > > <
> > > >
> > https://mail.google.com/mail?view=cm&tf=0&to=aleksey.shipilev@gmail.com
> > > >>
> > > >  > wrote:
> > > > > >
> > > > > >  > Colleagues,
> > > > > > >
> > > > > > > I had recently filed two JIRAs with improvements in
> Collections,
> > > > > > > giving up to +30-40% to serialization benchmarks. Presumably
> > they
> > > > will
> > > > > > > boost the performance across the all users since the
> > optimization
> > > is
> > > > > > > pretty general:
> > > > > > > https://issues.apache.org/jira/browse/HARMONY-5761
> > > > > > > https://issues.apache.org/jira/browse/HARMONY-5718
> > > > > > >
> > > > > > > Would some classlib guru (Tim, Nathan, Tony?) review and
> commit
> > > > them?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Aleksey.
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message