And in fact, I just tried the following (which makes even more sense):
 add objectHashCode() to ClassFileEntry:
protected int objectHashCode() {
return super.hashCode();
}
 change generateHashCode in ByteCode to return objectHashCode()
private int generateHashCode() {
hashcodeComputed = true;
return objectHashCode();
}
Since ByteCodes are equal if and only if they are identical, this seems to
be the right thing to do. What do you think?
On Mon, Jul 14, 2008 at 10:44 AM, Andrew Cornwall <andrew.pack200@gmail.com>
wrote:
> I applied Aleksey's changes, and they look pretty good. I disagree with
> Sian to some degree about ByteCode. On my VM (which isn't Harmony), a new
> empty array's hashCode() is dependent on the array's location in memory, and
> not the array's contents. In other words:
>
> int[] x = new int[3];
> System.out.println(x.hashCode());
> x[1] = 5;
> System.out.println(x.hashCode());
>
> prints the same value for in both cases. rewrite.hashCode() is a handy (if
> lazy) way to distinguish among different instances.
>
> If we take rewrite out of hashCode(), HashMap and HashSet get really slow 
> essentially a linear search among all the ByteCodes of the same form. This
> brings my test case from 39 seconds up to 1:02.
>
> Perhaps the right thing to do is to give each unique instance of ByteCode
> an integer ID which is used in creating the hashCode rather than relying on
> rewrite to give us uniqueness?
>
>
> On Mon, Jul 14, 2008 at 7:19 AM, Sian January <sianjanuary@googlemail.com>
> wrote:
>
>> Ok  we'll wait and see what Andrew says. The only one that I'm not happy
>> with is Bytecode.hashCode, because rewrite always seems to be an empty
>> array
>> at the point when generateHashCode is called so it's a bit misleading
>> using
>> it. I think it should be ok to just remove that line, and still cache the
>> hashCode.
>>
>>
>> On 14/07/2008, Aleksey Shipilev <aleksey.shipilev@gmail.com> wrote:
>> >
>> > Sian,
>> >
>> > Actually I had tried to extend Andrew's approach to these classes
>> > first, but somehow I caught the degradation, that leaved me no choice
>> > except the lazy initialization. My concern is, the
>> > constructorinitialized hashcode is really depend on usage pattern for
>> > each specific class, while lazy initialization has more guarantees to
>> > be performancestable. Moreover, I suspect the lazy initialization can
>> > degrade performance much less because the only overhead it causes is
>> > checking the value of boolean field. On the other hand, the
>> > constructor initialization may degrade performance a lot since the
>> > generation of hashCode is expensive.
>> >
>> > I can recheck which classes favor lazy initialization and which are
>> > not, but I think it's not valuable in terms of efficiency. I mean here
>> > that the boost connected with changing lazy initialization to
>> > constructor one is much lower than boost from caching hashcode anyway.
>> >
>> > Can we accept the patch in this form and revisit this difference later?
>> > It would be better to focus on more profitable areas for improvements
>> for
>> > now.
>> >
>> > P.S. I had asked Andrew to recheck whether my patch works as fast as
>> > his, also to check lazy initialization approach. On my tests the boost
>> > is stable and good.
>> >
>> > Thanks,
>> > Aleksey.
>> >
>> > On Mon, Jul 14, 2008 at 5:25 PM, Sian January
>> > <sianjanuary@googlemail.com> wrote:
>> > > Hi Aleksey,
>> > >
>> > > It's a really good idea to extend this patch to cover some more
>> classes,
>> > but
>> > > I think Andrew's method is faster (a final cachedHashCode field that
>> is
>> > > initialized in the constructor). The only reason I see to do it later
>> > would
>> > > be if we thought some of these objects never had hashCode called on
>> them,
>> > > but I don't think that's the case. Would you be able to try that
>> method
>> > > instead in your patch and see if I'm right about it being faster?
>> > >
>> > > Thanks,
>> > >
>> > > Sian
>> > >
>> > >
>> > >
>> > > On 12/07/2008, Aleksey Shipilev <aleksey.shipilev@gmail.com> wrote:
>> > >>
>> > >> Andrew,
>> > >>
>> > >> I had attached the patch to HARMONY5907, covering several first
>> > >> methods. Can you confirm this patch helps for your scenario?
>> > >>
>> > >> Thanks,
>> > >> Aleksey.
>> > >>
>> > >> On Sat, Jul 12, 2008 at 1:58 PM, Aleksey Shipilev
>> > >> <aleksey.shipilev@gmail.com> wrote:
>> > >> > And the sorted list:
>> > >> >
>> > >> > 95462388 bc.cputf8
>> > >> > 18646908 bc.bytecode
>> > >> > 15118425 bc.cpclass
>> > >> > 14928914 bc.cpnametype
>> > >> > 12103799 bc.cpmethref
>> > >> > 5159994 bc.cpfieldref
>> > >> > 3420605 bc.methref
>> > >> > 1840965 bc.cpstring
>> > >> > 839916 bc.codeattr
>> > >> > 839916 bc.locvarattr
>> > >> > 839916 bc.linenumattr
>> > >> > 430234 bc.cpmethod
>> > >> > 277144 bc.cpfield
>> > >> > 263753 bc.attr
>> > >> > 153811 bc.cpinteger
>> > >> > 121856 bc.newattr
>> > >> > 93471 bc.cvalattr
>> > >> > 72492 bc.excpattr
>> > >> > 57428 bc.srcfileattr
>> > >> > 57428 bc.srcfileattr
>> > >> > 48104 bc.cplong
>> > >> > 40362 bc.innerclass
>> > >> > 5593 bc.depattr
>> > >> > 3255 bc.cpfloat
>> > >> > 1638 bc.cpdouble
>> > >> > 532 attrlayout
>> > >> > 0 archive
>> > >> > 0 attrdef
>> > >> > 0 newattrband
>> > >> > 0 bc.anndefarg
>> > >> > 0 bc.rtannattr
>> > >> > 0 classbands
>> > >> > 0 filebands
>> > >> > 0 metabandgr
>> > >> > 0 segheader
>> > >> > 0 bc.remattr
>> > >> > 0 bc.annattr
>> > >> > 0 bc.cpconst
>> > >> > 0 bc.cpmember
>> > >> > 0 bc.signattr
>> > >> > 0 bandset
>> > >> > 0 bcbands
>> > >> > 0 cpbands
>> > >> > 0 icbands
>> > >> > 0 ictuple
>> > >> > 0 segment
>> > >> > 0 segopts
>> > >> > 0 bc.classf
>> > >> > 0 bc.cpref
>> > >> > 0 bc.opmgr
>> > >> > 0 bc.rtattr
>> > >> > 0 segcp
>> > >> > 0 bc.ccp
>> > >> > 0 attrlayoutmap
>> > >> > 0 bc.encmethattr
>> > >> > 0 bc.exptableent
>> > >> > 0 bc.locvartable
>> > >> > 0 bc.signattr
>> > >> >
>> > >> > Thanks,
>> > >> > Aleksey.
>> > >> >
>> > >> > On Sat, Jul 12, 2008 at 1:50 PM, Aleksey Shipilev
>> > >> > <aleksey.shipilev@gmail.com> wrote:
>> > >> >> Hi, Andrew!
>> > >> >>
>> > >> >> I had updated the internal profiler to support hashCode()
probes
>> [1],
>> > >> >> to extend your effort in hashcode optimization. There are
bunch of
>> > >> >> heavily used hashcodes, most of them are going to
>> Object.hashCode()
>> > >> >> and then to System.identityHashCode(). We can cache/implement
>> > hashcode
>> > >> >> for these classes. Here's the profile:
>> > >> >>
>> > >> >> Hashcodes:
>> > >> >> archive: 0
>> > >> >> attrdef: 0
>> > >> >> attrlayout: 532
>> > >> >> attrlayoutmap: 0
>> > >> >> bandset: 0
>> > >> >> bcbands: 0
>> > >> >> classbands: 0
>> > >> >> cpbands: 0
>> > >> >> filebands: 0
>> > >> >> icbands: 0
>> > >> >> ictuple: 0
>> > >> >> metabandgr: 0
>> > >> >> newattrband: 0
>> > >> >> segcp: 0
>> > >> >> segheader: 0
>> > >> >> segment: 0
>> > >> >> segopts: 0
>> > >> >> bc.attr: 263753
>> > >> >> bc.remattr: 0
>> > >> >> bc.anndefarg: 0
>> > >> >> bc.annattr: 0
>> > >> >> bc.bytecode: 18646908
>> > >> >> bc.ccp: 0
>> > >> >> bc.classf: 0
>> > >> >> bc.codeattr: 839916
>> > >> >> bc.cvalattr: 93471
>> > >> >> bc.cpclass: 15118425
>> > >> >> bc.cpconst: 0
>> > >> >> bc.cpdouble: 1638
>> > >> >> bc.cpfield: 277144
>> > >> >> bc.cpfieldref: 5159994
>> > >> >> bc.cpfloat: 3255
>> > >> >> bc.cpinteger: 153811
>> > >> >> bc.methref: 3420605
>> > >> >> bc.cplong: 48104
>> > >> >> bc.cpmember: 0
>> > >> >> bc.cpmethod: 430234
>> > >> >> bc.cpmethref: 12103799
>> > >> >> bc.cpnametype: 14928914
>> > >> >> bc.cpref: 0
>> > >> >> bc.cpstring: 1840965
>> > >> >> bc.cputf8: 95462388
>> > >> >> bc.depattr: 5593
>> > >> >> bc.encmethattr: 0
>> > >> >> bc.excpattr: 72492
>> > >> >> bc.exptableent: 0
>> > >> >> bc.innerclass: 40362
>> > >> >> bc.linenumattr: 839916
>> > >> >> bc.locvarattr: 839916
>> > >> >> bc.locvartable: 0
>> > >> >> bc.newattr: 121856
>> > >> >> bc.opmgr: 0
>> > >> >> bc.rtattr: 0
>> > >> >> bc.rtannattr: 0
>> > >> >> bc.signattr: 0
>> > >> >> bc.srcfileattr: 57428
>> > >> >>
>> > >> >> Would you like to produce the patch?
>> > >> >> I think it would be funny :)
>> > >> >>
>> > >> >> Thanks,
>> > >> >> Aleksey.
>> > >> >>
>> > >> >> [1] https://issues.apache.org/jira/browse/HARMONY5905
>> > >> >>
>> > >> >> On Sat, Jul 12, 2008 at 12:48 AM, Andrew Cornwall (JIRA)
>> > >> >> <jira@apache.org> wrote:
>> > >> >>>
>> > >> >>> [
>> > >>
>> >
>> https://issues.apache.org/jira/browse/HARMONY5907?page=com.atlassian.jira.plugin.system.issuetabpanels:alltabpanel
>> > ]
>> > >> >>>
>> > >> >>> Andrew Cornwall updated HARMONY5907:
>> > >> >>> 
>> > >> >>>
>> > >> >>> Attachment: main.patch
>> > >> >>>
>> > >> >>> main.patch includes change to CPUTF8.java
>> > >> >>>
>> > >> >>>
>> > >> >>>> [classlib][pack200]CPUTF8.hashCode() is slow
>> > >> >>>> 
>> > >> >>>>
>> > >> >>>> Key: HARMONY5907
>> > >> >>>> URL:
>> > >> https://issues.apache.org/jira/browse/HARMONY5907
>> > >> >>>> Project: Harmony
>> > >> >>>> Issue Type: Improvement
>> > >> >>>> Affects Versions: 5.0M6
>> > >> >>>> Environment: Latest pack200
>> > >> >>>> Reporter: Andrew Cornwall
>> > >> >>>> Attachments: main.patch
>> > >> >>>>
>> > >> >>>>
>> > >> >>>> The unpack process spends a lot of time doing CPUTF8.hashCode()
>> 
>> > >> which does String.hashCode(). We can save about 1.5 seconds of my 39
>> > second
>> > >> test case (about 4%) by caching the hashCode. (I thought we did this
>> > before
>> > >>  or maybe I dreamt it?)
>> > >> >>>
>> > >> >>> 
>> > >> >>> This message is automatically generated by JIRA.
>> > >> >>> 
>> > >> >>> You can reply to this email to add a comment to the issue
online.
>> > >> >>>
>> > >> >>>
>> > >> >>
>> > >> >
>> > >>
>> > >
>> > >
>> > >
>> > > 
>> > > Unless stated otherwise above:
>> > > IBM United Kingdom Limited  Registered in England and Wales with
>> number
>> > > 741598.
>> > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
>> > 3AU
>> > >
>> >
>>
>>
>>
>> 
>> Unless stated otherwise above:
>> IBM United Kingdom Limited  Registered in England and Wales with number
>> 741598.
>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>>
>
>
