Sorry  I didn't explain fully. I intended that my code would cache the
Object.hashCode() rather than recomputing the hashCode each time.
On Mon, Jul 14, 2008 at 11:25 AM, Aleksey Shipilev <
aleksey.shipilev@gmail.com> wrote:
> I think that's bad for performance. Using Object.hashCode() leads to
> System.identityHashCode(), which is handled by VM. The exact mechanism
> is VMdependent, but at least on Harmony it's pretty slow. If you want
> to mark each instance as notidentical to another (warning here, you
> may break something), then I suggest to use static increment, e.g.:
>
> private static int hashcodeBase = 1;
> private int cachedHashCode = hashcodeBase++;
>
> This may end up with cache collisions if objects are created from
> different threads, but that's not the case for now.
>
> Anyway, Andrew, your code lacks caching again ;)
>
> Thanks,
> Aleksey.
>
> On Mon, Jul 14, 2008 at 9:51 PM, Andrew Cornwall
> <andrew.pack200@gmail.com> wrote:
> > And in fact, I just tried the following (which makes even more sense):
> >
> >  add objectHashCode() to ClassFileEntry:
> > protected int objectHashCode() {
> > return super.hashCode();
> > }
> >
> >  change generateHashCode in ByteCode to return objectHashCode()
> > private int generateHashCode() {
> > hashcodeComputed = true;
> > return objectHashCode();
> > }
> >
> > Since ByteCodes are equal if and only if they are identical, this seems
> to
> > be the right thing to do. What do you think?
> >
> >
> > On Mon, Jul 14, 2008 at 10:44 AM, Andrew Cornwall <
> andrew.pack200@gmail.com>
> > wrote:
> >
> >> I applied Aleksey's changes, and they look pretty good. I disagree with
> >> Sian to some degree about ByteCode. On my VM (which isn't Harmony), a
> new
> >> empty array's hashCode() is dependent on the array's location in memory,
> and
> >> not the array's contents. In other words:
> >>
> >> int[] x = new int[3];
> >> System.out.println(x.hashCode());
> >> x[1] = 5;
> >> System.out.println(x.hashCode());
> >>
> >> prints the same value for in both cases. rewrite.hashCode() is a handy
> (if
> >> lazy) way to distinguish among different instances.
> >>
> >> If we take rewrite out of hashCode(), HashMap and HashSet get really
> slow 
> >> essentially a linear search among all the ByteCodes of the same form.
> This
> >> brings my test case from 39 seconds up to 1:02.
> >>
> >> Perhaps the right thing to do is to give each unique instance of
> ByteCode
> >> an integer ID which is used in creating the hashCode rather than relying
> on
> >> rewrite to give us uniqueness?
> >>
> >>
> >> On Mon, Jul 14, 2008 at 7:19 AM, Sian January <
> sianjanuary@googlemail.com>
> >> wrote:
> >>
> >>> Ok  we'll wait and see what Andrew says. The only one that I'm not
> happy
> >>> with is Bytecode.hashCode, because rewrite always seems to be an empty
> >>> array
> >>> at the point when generateHashCode is called so it's a bit misleading
> >>> using
> >>> it. I think it should be ok to just remove that line, and still cache
> the
> >>> hashCode.
> >>>
> >>>
> >>> On 14/07/2008, Aleksey Shipilev <aleksey.shipilev@gmail.com> wrote:
> >>> >
> >>> > Sian,
> >>> >
> >>> > Actually I had tried to extend Andrew's approach to these classes
> >>> > first, but somehow I caught the degradation, that leaved me no choice
> >>> > except the lazy initialization. My concern is, the
> >>> > constructorinitialized hashcode is really depend on usage pattern
> for
> >>> > each specific class, while lazy initialization has more guarantees
to
> >>> > be performancestable. Moreover, I suspect the lazy initialization
> can
> >>> > degrade performance much less because the only overhead it causes is
> >>> > checking the value of boolean field. On the other hand, the
> >>> > constructor initialization may degrade performance a lot since the
> >>> > generation of hashCode is expensive.
> >>> >
> >>> > I can recheck which classes favor lazy initialization and which are
> >>> > not, but I think it's not valuable in terms of efficiency. I mean
> here
> >>> > that the boost connected with changing lazy initialization to
> >>> > constructor one is much lower than boost from caching hashcode
> anyway.
> >>> >
> >>> > Can we accept the patch in this form and revisit this difference
> later?
> >>> > It would be better to focus on more profitable areas for improvements
> >>> for
> >>> > now.
> >>> >
> >>> > P.S. I had asked Andrew to recheck whether my patch works as fast as
> >>> > his, also to check lazy initialization approach. On my tests the
> boost
> >>> > is stable and good.
> >>> >
> >>> > Thanks,
> >>> > Aleksey.
> >>> >
> >>> > On Mon, Jul 14, 2008 at 5:25 PM, Sian January
> >>> > <sianjanuary@googlemail.com> wrote:
> >>> > > Hi Aleksey,
> >>> > >
> >>> > > It's a really good idea to extend this patch to cover some more
> >>> classes,
> >>> > but
> >>> > > I think Andrew's method is faster (a final cachedHashCode field
> that
> >>> is
> >>> > > initialized in the constructor). The only reason I see to do
it
> later
> >>> > would
> >>> > > be if we thought some of these objects never had hashCode called
on
> >>> them,
> >>> > > but I don't think that's the case. Would you be able to try that
> >>> method
> >>> > > instead in your patch and see if I'm right about it being faster?
> >>> > >
> >>> > > Thanks,
> >>> > >
> >>> > > Sian
> >>> > >
> >>> > >
> >>> > >
> >>> > > On 12/07/2008, Aleksey Shipilev <aleksey.shipilev@gmail.com>
> wrote:
> >>> > >>
> >>> > >> Andrew,
> >>> > >>
> >>> > >> I had attached the patch to HARMONY5907, covering several
first
> >>> > >> methods. Can you confirm this patch helps for your scenario?
> >>> > >>
> >>> > >> Thanks,
> >>> > >> Aleksey.
> >>> > >>
> >>> > >> On Sat, Jul 12, 2008 at 1:58 PM, Aleksey Shipilev
> >>> > >> <aleksey.shipilev@gmail.com> wrote:
> >>> > >> > And the sorted list:
> >>> > >> >
> >>> > >> > 95462388 bc.cputf8
> >>> > >> > 18646908 bc.bytecode
> >>> > >> > 15118425 bc.cpclass
> >>> > >> > 14928914 bc.cpnametype
> >>> > >> > 12103799 bc.cpmethref
> >>> > >> > 5159994 bc.cpfieldref
> >>> > >> > 3420605 bc.methref
> >>> > >> > 1840965 bc.cpstring
> >>> > >> > 839916 bc.codeattr
> >>> > >> > 839916 bc.locvarattr
> >>> > >> > 839916 bc.linenumattr
> >>> > >> > 430234 bc.cpmethod
> >>> > >> > 277144 bc.cpfield
> >>> > >> > 263753 bc.attr
> >>> > >> > 153811 bc.cpinteger
> >>> > >> > 121856 bc.newattr
> >>> > >> > 93471 bc.cvalattr
> >>> > >> > 72492 bc.excpattr
> >>> > >> > 57428 bc.srcfileattr
> >>> > >> > 57428 bc.srcfileattr
> >>> > >> > 48104 bc.cplong
> >>> > >> > 40362 bc.innerclass
> >>> > >> > 5593 bc.depattr
> >>> > >> > 3255 bc.cpfloat
> >>> > >> > 1638 bc.cpdouble
> >>> > >> > 532 attrlayout
> >>> > >> > 0 archive
> >>> > >> > 0 attrdef
> >>> > >> > 0 newattrband
> >>> > >> > 0 bc.anndefarg
> >>> > >> > 0 bc.rtannattr
> >>> > >> > 0 classbands
> >>> > >> > 0 filebands
> >>> > >> > 0 metabandgr
> >>> > >> > 0 segheader
> >>> > >> > 0 bc.remattr
> >>> > >> > 0 bc.annattr
> >>> > >> > 0 bc.cpconst
> >>> > >> > 0 bc.cpmember
> >>> > >> > 0 bc.signattr
> >>> > >> > 0 bandset
> >>> > >> > 0 bcbands
> >>> > >> > 0 cpbands
> >>> > >> > 0 icbands
> >>> > >> > 0 ictuple
> >>> > >> > 0 segment
> >>> > >> > 0 segopts
> >>> > >> > 0 bc.classf
> >>> > >> > 0 bc.cpref
> >>> > >> > 0 bc.opmgr
> >>> > >> > 0 bc.rtattr
> >>> > >> > 0 segcp
> >>> > >> > 0 bc.ccp
> >>> > >> > 0 attrlayoutmap
> >>> > >> > 0 bc.encmethattr
> >>> > >> > 0 bc.exptableent
> >>> > >> > 0 bc.locvartable
> >>> > >> > 0 bc.signattr
> >>> > >> >
> >>> > >> > Thanks,
> >>> > >> > Aleksey.
> >>> > >> >
> >>> > >> > On Sat, Jul 12, 2008 at 1:50 PM, Aleksey Shipilev
> >>> > >> > <aleksey.shipilev@gmail.com> wrote:
> >>> > >> >> Hi, Andrew!
> >>> > >> >>
> >>> > >> >> I had updated the internal profiler to support hashCode()
> probes
> >>> [1],
> >>> > >> >> to extend your effort in hashcode optimization. There
are bunch
> of
> >>> > >> >> heavily used hashcodes, most of them are going to
> >>> Object.hashCode()
> >>> > >> >> and then to System.identityHashCode(). We can cache/implement
> >>> > hashcode
> >>> > >> >> for these classes. Here's the profile:
> >>> > >> >>
> >>> > >> >> Hashcodes:
> >>> > >> >> archive: 0
> >>> > >> >> attrdef: 0
> >>> > >> >> attrlayout: 532
> >>> > >> >> attrlayoutmap: 0
> >>> > >> >> bandset: 0
> >>> > >> >> bcbands: 0
> >>> > >> >> classbands: 0
> >>> > >> >> cpbands: 0
> >>> > >> >> filebands: 0
> >>> > >> >> icbands: 0
> >>> > >> >> ictuple: 0
> >>> > >> >> metabandgr: 0
> >>> > >> >> newattrband: 0
> >>> > >> >> segcp: 0
> >>> > >> >> segheader: 0
> >>> > >> >> segment: 0
> >>> > >> >> segopts: 0
> >>> > >> >> bc.attr: 263753
> >>> > >> >> bc.remattr: 0
> >>> > >> >> bc.anndefarg: 0
> >>> > >> >> bc.annattr: 0
> >>> > >> >> bc.bytecode: 18646908
> >>> > >> >> bc.ccp: 0
> >>> > >> >> bc.classf: 0
> >>> > >> >> bc.codeattr: 839916
> >>> > >> >> bc.cvalattr: 93471
> >>> > >> >> bc.cpclass: 15118425
> >>> > >> >> bc.cpconst: 0
> >>> > >> >> bc.cpdouble: 1638
> >>> > >> >> bc.cpfield: 277144
> >>> > >> >> bc.cpfieldref: 5159994
> >>> > >> >> bc.cpfloat: 3255
> >>> > >> >> bc.cpinteger: 153811
> >>> > >> >> bc.methref: 3420605
> >>> > >> >> bc.cplong: 48104
> >>> > >> >> bc.cpmember: 0
> >>> > >> >> bc.cpmethod: 430234
> >>> > >> >> bc.cpmethref: 12103799
> >>> > >> >> bc.cpnametype: 14928914
> >>> > >> >> bc.cpref: 0
> >>> > >> >> bc.cpstring: 1840965
> >>> > >> >> bc.cputf8: 95462388
> >>> > >> >> bc.depattr: 5593
> >>> > >> >> bc.encmethattr: 0
> >>> > >> >> bc.excpattr: 72492
> >>> > >> >> bc.exptableent: 0
> >>> > >> >> bc.innerclass: 40362
> >>> > >> >> bc.linenumattr: 839916
> >>> > >> >> bc.locvarattr: 839916
> >>> > >> >> bc.locvartable: 0
> >>> > >> >> bc.newattr: 121856
> >>> > >> >> bc.opmgr: 0
> >>> > >> >> bc.rtattr: 0
> >>> > >> >> bc.rtannattr: 0
> >>> > >> >> bc.signattr: 0
> >>> > >> >> bc.srcfileattr: 57428
> >>> > >> >>
> >>> > >> >> Would you like to produce the patch?
> >>> > >> >> I think it would be funny :)
> >>> > >> >>
> >>> > >> >> Thanks,
> >>> > >> >> Aleksey.
> >>> > >> >>
> >>> > >> >> [1] https://issues.apache.org/jira/browse/HARMONY5905
> >>> > >> >>
> >>> > >> >> On Sat, Jul 12, 2008 at 12:48 AM, Andrew Cornwall
(JIRA)
> >>> > >> >> <jira@apache.org> wrote:
> >>> > >> >>>
> >>> > >> >>> [
> >>> > >>
> >>> >
> >>>
> https://issues.apache.org/jira/browse/HARMONY5907?page=com.atlassian.jira.plugin.system.issuetabpanels:alltabpanel
> >>> > ]
> >>> > >> >>>
> >>> > >> >>> Andrew Cornwall updated HARMONY5907:
> >>> > >> >>> 
> >>> > >> >>>
> >>> > >> >>> Attachment: main.patch
> >>> > >> >>>
> >>> > >> >>> main.patch includes change to CPUTF8.java
> >>> > >> >>>
> >>> > >> >>>
> >>> > >> >>>> [classlib][pack200]CPUTF8.hashCode() is slow
> >>> > >> >>>> 
> >>> > >> >>>>
> >>> > >> >>>> Key: HARMONY5907
> >>> > >> >>>> URL:
> >>> > >> https://issues.apache.org/jira/browse/HARMONY5907
> >>> > >> >>>> Project: Harmony
> >>> > >> >>>> Issue Type: Improvement
> >>> > >> >>>> Affects Versions: 5.0M6
> >>> > >> >>>> Environment: Latest pack200
> >>> > >> >>>> Reporter: Andrew Cornwall
> >>> > >> >>>> Attachments: main.patch
> >>> > >> >>>>
> >>> > >> >>>>
> >>> > >> >>>> The unpack process spends a lot of time doing
> CPUTF8.hashCode()
> >>> 
> >>> > >> which does String.hashCode(). We can save about 1.5 seconds
of my
> 39
> >>> > second
> >>> > >> test case (about 4%) by caching the hashCode. (I thought we
did
> this
> >>> > before
> >>> > >>  or maybe I dreamt it?)
> >>> > >> >>>
> >>> > >> >>> 
> >>> > >> >>> This message is automatically generated by JIRA.
> >>> > >> >>> 
> >>> > >> >>> You can reply to this email to add a comment
to the issue
> online.
> >>> > >> >>>
> >>> > >> >>>
> >>> > >> >>
> >>> > >> >
> >>> > >>
> >>> > >
> >>> > >
> >>> > >
> >>> > > 
> >>> > > Unless stated otherwise above:
> >>> > > IBM United Kingdom Limited  Registered in England and Wales with
> >>> number
> >>> > > 741598.
> >>> > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire
> PO6
> >>> > 3AU
> >>> > >
> >>> >
> >>>
> >>>
> >>>
> >>> 
> >>> Unless stated otherwise above:
> >>> IBM United Kingdom Limited  Registered in England and Wales with
> number
> >>> 741598.
> >>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
> 3AU
> >>>
> >>
> >>
> >
>
