harmony-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andrew Cornwall" <andrew.pack...@gmail.com>
Subject Re: [classlib][pack200][performance] Profiling unpacking scenario
Date Tue, 05 Aug 2008 23:22:51 GMT
I don't see any differences in results with your patch [5]=5931 when running
against my big batch of tests. That surprises me, honestly - I think when I
did my test I commented out the "nestedEntries.addAll(byteCodes)" in
CodeAttribute.getNestedClassFileEntries(), and doing that certainly causes
problems.

I guess once the nested entries of the ByteCodes have been figured out and
added to the class pool, the bytecodes themselves are no longer needed...
which is why your patch works and mine didn't.

Cool stuff!

On Tue, Aug 5, 2008 at 3:44 PM, Andrew Cornwall <andrew.pack200@gmail.com>wrote:

> I'm not sure about [5]=5931 - if I remember right, I'd taken out ByteCode
> processing in add(), and that led to problems resolving bytecodes which
> referred to things in the classpool. But that was a while ago; I'm
> interested in trying it and seeing if there are any differences in my wad of
> 40M or so testcases.
>
> The only other patch I've looked at so far has been [2]=5928. That's a
> nifty change.
>
> All in all, it looks like you've done a lot of good work!
>
>
>
> On Tue, Aug 5, 2008 at 3:16 PM, Aleksey Shipilev <
> aleksey.shipilev@gmail.com> wrote:
>
>> Sian, Andrew,
>>
>> An update here. I had updated the profiler [1]  and run it over on JDT
>> unpacking scenario several times. That's what we have today (times are
>> msecs, indentation resembles call hierarchy):
>>
>> Unpack: 220529
>>
>>  segment read:  28853
>>   cpBands:     13791
>>   adBands:     214
>>   icBands:     343
>>   cbBands:     9158
>>   bcBands:     4694
>>   fbBands:     118
>>   fBits:       407
>>
>>  segment parse: 176929
>>   header:      0
>>   cpBands:     0
>>   adBands:     0
>>   icBands:     1
>>   cbBands:     0
>>
>>   bcBands:     77245
>>      exceptn:  656
>>      newCA:    71145
>>       getBC:   16372
>>       extOpnd: 26776
>>       fixup:   1885
>>      methAttr: 591
>>      curAttr:  1808
>>
>>   fbBands:     0
>>
>>   buildCF:     82057
>>     ccp.addN:  36182
>>     ccp.addNW: 433
>>     ccp.resv:  27452
>>     ic.getIC:  11717
>>
>>   cfWrite:     17379
>>
>>  segment write: 14349
>>
>>
>> As you can see in commits, I had filed a couple of JIRAs with the
>> bunch of pack200 optimizations [2,3,4,5], here what I got with all
>> them applied:
>>
>> Unpack: 193165 (-14% in total)
>>
>>  segment read:  28334
>>   cpBands:     13034
>>   adBands:     249
>>   icBands:     341
>>   cbBands:     9299
>>   bcBands:     4645
>>   fbBands:     169
>>   fBits:       459
>>
>>  segment parse: 150032
>>   header:      1
>>   cpBands:     0
>>   adBands:     0
>>   icBands:     82
>>   cbBands:     0
>>
>>   bcBands:     73936
>>      exceptn:  633
>>      newCA:    67871
>>       getBC:   13874    <--- (-18% due to [2])
>>       extOpnd: 26583
>>       fixup:   1808
>>      methAttr: 615
>>      curAttr:  1663
>>
>>   fbBands:     0
>>
>>   buildCF:     58319
>>     ccp.addN:  26199  <---- (-38% due to [5])
>>     ccp.addNW: 424
>>     ccp.resv:  23642   <---- (-16% due to [5])
>>     ic.getIC:  2245    <--- (-80% due to [3,4])
>>
>>   cfWrite:     17463
>>
>>  segment write: 14413
>>
>>
>> Of course, the boosts are diminished with the performance overheads of
>> profiling. But still, this profile gives pretty good insight on what's
>> going on. CodeAttribute ["newCA" is the "new CodeAttribute(...)"] is
>> the next candidate for optimization, I guess.
>>
>> Sian, Andrew, can you please review the patches? I'm particularly
>> interested in [5], because it's proof-of-concept and kind of
>> controversial.
>>
>> Thanks,
>> Aleksey.
>>
>> [1] "classlib][pack200] Internal profiler for pack200"
>> https://issues.apache.org/jira/browse/HARMONY-5905
>>
>> [2] [classlib][pack200][performance] java.util.HashMap usage optimization
>> https://issues.apache.org/jira/browse/HARMONY-5928
>>
>> [3] [classlib][pack200][performance] Segment.computeIcStored rewrite
>> https://issues.apache.org/jira/browse/HARMONY-5929
>>
>> [4] [classlib][pack200][performance] IcBands.getRelevantIcTuples rewrite
>> https://issues.apache.org/jira/browse/HARMONY-5930
>>
>> [5] [classlib][pack200][performance] Some ClassConstantPool content
>> may not be needed
>> https://issues.apache.org/jira/browse/HARMONY-5931
>>
>>
>> On Thu, Jul 10, 2008 at 7:59 PM, Aleksey Shipilev
>> <aleksey.shipilev@gmail.com> wrote:
>> > I had quickly drafted the internal Java profiler for pack200 at [1].
>> > Here are the results of profiling for 50Mb Eclipse JDT jar, times are
>> > microsecs, identation emulates the call tree. Some of the label are
>> > not distinguishable, but you may look up probe positions in the patch.
>> >
>> > Unpack: 38311
>> >  segment unpack: 38217
>> >   parse segment: 11575
>> >     parse header:  0
>> >     parse ADB:     78
>> >     parse bcbands: 5342
>> >       parse1:      453
>> >       parse2:      93
>> >       select:      252
>> >       attrlayout:  0
>> >       methods:     3997
>> >     parse cbands:  3358
>> >       classattr:   1002
>> >       code:        1636
>> >       fields:      173
>> >       methods:     515
>> >     parse cpbands: 2483
>> >     parse fbands:  63
>> >     parse icbands: 16
>> >   write jar: 26642
>> >     build classf:  21111
>> >       sfattrs:     47
>> >       cfattrs:     0
>> >       fields:      218
>> >       interfaces:  0
>> >       methods:     362
>> >       addNested:   8146
>> >       inner:       3051
>> >       final:       8774
>> >     write classf:  1934
>> >       constpool:   1015
>> >       interfaces:  0
>> >       attributes:  31
>> >       methods:     827
>> >       fields:      31
>> >     write primit:  486
>> >
>> > That's the point where one can take the method and optimize it locally
>> :)
>> >
>> > Thanks,
>> > Aleksey.
>> >
>> > [1] https://issues.apache.org/jira/browse/HARMONY-5905
>> >
>> > On Wed, Jul 9, 2008 at 9:20 PM, Aleksey Shipilev
>> > <aleksey.shipilev@gmail.com> wrote:
>> >> I had disabled the compression in my test to throw away ZIP overhead
>> >> and focus on pack200 performance only. Thus the performance data is
>> >> not relevant to previous measurements. The data are assumed with
>> >> HARMONY-5900 incorporated.
>> >>
>> >> Harmony's pack200: 43 secs (3.5 Mb/secs)
>> >> Sun's pack200: 9 secs (16.6 Mb/secs)
>> >>
>> >> Profile:
>> >>
>> >> 22.0% java.util.HashMap.*
>> >> 11.4% java.io.FileInputStream.readBytes()
>> >>  7.5% o.a.h.unpack200.bytecode.ClassConstantPool.addNested()
>> >>  6.9% java.util.zip.*
>> >>  5.6% o.a.h.pack200.BHSDCodec.decode()
>> >>  4.8% java.lang.*
>> >>  4.4% o.a.h.unpack200.IcBands.getRelevantIcTuples()
>> >>  3.9%
>> o.a.h.unpack200.bytecode.forms.NoArgumentForm.setByteCodeOperands()
>> >>  3.2% o.a.h.unpack200.bytecode.ClassConstantPool.* (other)
>> >>  3.0% o.a.h.unpack200.bytecode.CodeAttribute.*
>> >>  2.8% java.io.FileOutputStream.writeBytes()
>> >>  2.8% o.a.h.unpack200.bytecode.ByteCode.*
>> >>  2.75% java.util.TreeMap.*
>> >>
>> >> Note ArrayList is gone!
>> >> It seems like BHSDCodec.decode(), IcBands.getRelevanticTuples() and
>> >> NoArgumentForm.setByteCodeOperands() are next candidates for tuning.
>> >> After that, the performance improvement is not possible without deep
>> >> changes, like overall algorithmic improvements. Anyway, that should be
>> >> first, but I'm not familiar with the code yet. This can't stop us
>> >> though ;)
>> >>
>> >> Thanks,
>> >> Aleksey.
>> >>
>> >> On Wed, Jul 9, 2008 at 7:27 PM, Sian January <
>> sianjanuary@googlemail.com> wrote:
>> >>> Thanks for doing that Aleksey.  In fact I think Sun's was 20 or 30
>> times
>> >>> faster before we started doing any performance optimizations, but it
>> looks
>> >>> like there's still some ground that we could make up!
>> >>>
>> >>>
>> >>>
>> >>> On 08/07/2008, Aleksey Shipilev <aleksey.shipilev@gmail.com> wrote:
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> I took the liberty of profiling of pack200 implementation on
>> unpacking
>> >>>> scenario. Source data was obtained from Eclipse JDT jars, repacked
in
>> >>>> single 60 Mb jar file, then packed with pack200 from Sun's JDK (-E9
>> >>>> used), resulting in 20 Mb pack200-compressed file. Then Sun JDK
>> >>>> 1.6.0_05 (Windows, -server) was used together with hprof (cpu=time)
>> to
>> >>>> obtain the profile. My patch from HARMONY-5900 is onboard. The head
>> of
>> >>>> the profile looks like this:
>> >>>>
>> >>>> 4.76%
>> org.apache.harmony.unpack200.bytecode.ClassConstantPool.addNested
>> >>>> 4.22% java.util.HashMap.getEntry
>> >>>> 2.99% java.util.AbstractList$Itr.next
>> >>>> 2.92% java.util.AbstractList$Itr.hasNext
>> >>>> 2.84% java.util.ArrayList.get
>> >>>> 2.43% java.util.AbstractList$Itr.next
>> >>>> 2.41% java.util.HashMap.containsKey
>> >>>> 2.15% org.apache.harmony.unpack200.IcBands.getRelevantIcTuples
>> >>>> 2.00% java.util.HashSet.contains
>> >>>> 1.57% java.io.DataOutputStream.writeUTF
>> >>>>
>> >>>> Composite occupancy:
>> >>>>
>> >>>> 18.4% java.util.AbstractList
>> >>>> 18.0% java.util.HashMap
>> >>>> 15.8% java.util.ArrayList
>> >>>> 10.5% o.a.h.unpack200.bytecode.ClassConstantPool.*
>> >>>> 5.3%  o.a.h.unpack200.bytecode.CPUTF8.* (hashcode mostly)
>> >>>> 4.5% java.io.*
>> >>>> 4.5% java.lang.String.*
>> >>>> 4.4% o.a.h.unpack200.bytecode.ByteCode.*
>> >>>> 3.9%  o.a.h.unpack200.bytecode.Ic{Tuple|Bands}.*
>> >>>> 14.7% other
>> >>>>
>> >>>> So the main concern is Collections usage. ClassConstantPool uses
>> Lists
>> >>>> excessively, so I suspect the significant amount of time is spent
>> >>>> there.
>> >>>>
>> >>>> NB:
>> >>>> Timings for the scenario (the less the better):
>> >>>> Harmony's pack200: 67 secs
>> >>>> Sun's pack200: 6 secs
>> >>>>
>> >>>> Yep, 10 times faster.
>> >>>>
>> >>>> Thanks,
>> >>>> Aleksey.
>> >>>>
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Unless stated otherwise above:
>> >>> IBM United Kingdom Limited - Registered in England and Wales with
>> number
>> >>> 741598.
>> >>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
>> 3AU
>> >>>
>> >>
>> >
>>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message