harmony-dev mailing list archives

From "Aleksey Shipilev" <aleksey.shipi...@gmail.com>
Subject Re: [classlib][pack200][performance] Profiling unpacking scenario
Date Tue, 05 Aug 2008 22:16:04 GMT
Sian, Andrew,

An update here. I have updated the profiler [1] and run it on the JDT
unpacking scenario several times. Here is what we have today (times are
in msecs, indentation reflects the call hierarchy):

Unpack: 220529

 segment read:  28853
   cpBands:     13791
   adBands:     214
   icBands:     343
   cbBands:     9158
   bcBands:     4694
   fbBands:     118
   fBits:       407

 segment parse: 176929
   header:      0
   cpBands:     0
   adBands:     0
   icBands:     1
   cbBands:     0

   bcBands:     77245
      exceptn:  656
      newCA:    71145
       getBC:   16372
       extOpnd: 26776
       fixup:   1885
      methAttr: 591
      curAttr:  1808

   fbBands:     0

   buildCF:     82057
     ccp.addN:  36182
     ccp.addNW: 433
     ccp.resv:  27452
     ic.getIC:  11717

   cfWrite:     17379

 segment write: 14349
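
For readers who have not looked at the patch, a tree like the one above can be produced by a small stack of nested timing probes. The following is only a minimal sketch of that idea, not the code from HARMONY-5905: the class name `ProbeProfiler`, its `enter`/`exit`/`report` methods and the injectable clock are all hypothetical, and it uses present-day Java syntax.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical nested-probe profiler; NOT the implementation from HARMONY-5905. */
public class ProbeProfiler {
    /** One node of the call tree: accumulated time plus child probes. */
    private static final class Node {
        final String label;
        long totalNanos;
        final Map<String, Node> children = new LinkedHashMap<>();
        Node(String label) { this.label = label; }
    }

    private final LongSupplier clock;                       // nanosecond clock, injectable for tests
    private final Node root = new Node("root");
    private final Deque<Node> stack = new ArrayDeque<>();   // currently open probes
    private final Deque<Long> starts = new ArrayDeque<>();  // start time of each open probe

    public ProbeProfiler(LongSupplier clock) {
        this.clock = clock;
        stack.push(root);
    }

    public ProbeProfiler() {
        this(System::nanoTime);
    }

    /** Opens a probe nested under the probe that is currently open. */
    public void enter(String label) {
        Node node = stack.peek().children.computeIfAbsent(label, Node::new);
        stack.push(node);
        starts.push(clock.getAsLong());
    }

    /** Closes the innermost open probe and adds the elapsed time to it. */
    public void exit() {
        long elapsed = clock.getAsLong() - starts.pop();
        stack.pop().totalNanos += elapsed;
    }

    /** Renders the tree with times in msecs; indentation follows probe nesting. */
    public String report() {
        StringBuilder out = new StringBuilder();
        for (Node child : root.children.values()) {
            append(out, child, 0);
        }
        return out.toString();
    }

    private static void append(StringBuilder out, Node node, int depth) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.label).append(": ").append(node.totalNanos / 1_000_000).append('\n');
        for (Node child : node.children.values()) {
            append(out, child, depth + 1);
        }
    }
}
```

Each probe is charged the full wall time between its `enter` and `exit`, so a parent's figure includes its children, which matches how the numbers above nest.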


As you can see in the commits, I have filed several JIRAs with a
bunch of pack200 optimizations [2,3,4,5]. Here is what I got with all
of them applied:

Unpack: 193165 (-12% in total)

 segment read:  28334
   cpBands:     13034
   adBands:     249
   icBands:     341
   cbBands:     9299
   bcBands:     4645
   fbBands:     169
   fBits:       459

 segment parse: 150032
   header:      1
   cpBands:     0
   adBands:     0
   icBands:     82
   cbBands:     0

   bcBands:     73936
      exceptn:  633
      newCA:    67871
       getBC:   13874    <--- (-15% due to [2])
       extOpnd: 26583
       fixup:   1808
      methAttr: 615
      curAttr:  1663

   fbBands:     0

   buildCF:     58319
     ccp.addN:  26199  <---- (-28% due to [5])
     ccp.addNW: 424
     ccp.resv:  23642   <---- (-14% due to [5])
     ic.getIC:  2245    <--- (-80% due to [3,4])

   cfWrite:     17463

 segment write: 14413
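
When comparing the two profiles, the percentage depends on the baseline: dividing the saved time by the old time gives the reduction, while dividing it by the new time gives the speedup. A small sketch of both calculations, using the totals from the two tables (the class and method names are hypothetical, introduced only for illustration):

```java
/** Hypothetical helper for comparing two profile readings of the same probe. */
public class ProfileDelta {
    /** Share of the original time that was saved, in percent. */
    public static double reductionPercent(long before, long after) {
        return 100.0 * (before - after) / before;
    }

    /** How much faster the new run is relative to its own time, in percent. */
    public static double speedupPercent(long before, long after) {
        return 100.0 * (before - after) / after;
    }

    public static void main(String[] args) {
        long before = 220529; // "Unpack" total before the patches, msecs
        long after = 193165;  // "Unpack" total with [2,3,4,5] applied, msecs
        System.out.printf("reduction: %.1f%%%n", reductionPercent(before, after));
        System.out.printf("speedup:   %.1f%%%n", speedupPercent(before, after));
    }
}
```

For the overall "Unpack" figure the first formula gives roughly 12% and the second roughly 14%, so it is worth stating which one is meant when quoting a gain.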


Of course, the gains are diluted by the overhead of profiling itself.
Still, this profile gives pretty good insight into what's going on.
CodeAttribute ("newCA" is "new CodeAttribute(...)") is the next
candidate for optimization, I guess.

Sian, Andrew, can you please review the patches? I'm particularly
interested in [5], because it's a proof of concept and somewhat
controversial.

Thanks,
Aleksey.

[1] [classlib][pack200] Internal profiler for pack200
https://issues.apache.org/jira/browse/HARMONY-5905

[2] [classlib][pack200][performance] java.util.HashMap usage optimization
https://issues.apache.org/jira/browse/HARMONY-5928

[3] [classlib][pack200][performance] Segment.computeIcStored rewrite
https://issues.apache.org/jira/browse/HARMONY-5929

[4] [classlib][pack200][performance] IcBands.getRelevantIcTuples rewrite
https://issues.apache.org/jira/browse/HARMONY-5930

[5] [classlib][pack200][performance] Some ClassConstantPool content
may not be needed
https://issues.apache.org/jira/browse/HARMONY-5931


On Thu, Jul 10, 2008 at 7:59 PM, Aleksey Shipilev
<aleksey.shipilev@gmail.com> wrote:
> I have quickly drafted an internal Java profiler for pack200 at [1].
> Here are the profiling results for the 50 Mb Eclipse JDT jar; times are
> in msecs, indentation mirrors the call tree. Some of the labels are
> not self-explanatory, but you can look up the probe positions in the patch.
>
> Unpack: 38311
>  segment unpack: 38217
>   parse segment: 11575
>     parse header:  0
>     parse ADB:     78
>     parse bcbands: 5342
>       parse1:      453
>       parse2:      93
>       select:      252
>       attrlayout:  0
>       methods:     3997
>     parse cbands:  3358
>       classattr:   1002
>       code:        1636
>       fields:      173
>       methods:     515
>     parse cpbands: 2483
>     parse fbands:  63
>     parse icbands: 16
>   write jar: 26642
>     build classf:  21111
>       sfattrs:     47
>       cfattrs:     0
>       fields:      218
>       interfaces:  0
>       methods:     362
>       addNested:   8146
>       inner:       3051
>       final:       8774
>     write classf:  1934
>       constpool:   1015
>       interfaces:  0
>       attributes:  31
>       methods:     827
>       fields:      31
>     write primit:  486
>
> That's the point where one can take the method and optimize it locally :)
>
> Thanks,
> Aleksey.
>
> [1] https://issues.apache.org/jira/browse/HARMONY-5905
>
> On Wed, Jul 9, 2008 at 9:20 PM, Aleksey Shipilev
> <aleksey.shipilev@gmail.com> wrote:
>> I have disabled compression in my test to eliminate the ZIP overhead
>> and focus on pack200 performance only, so these numbers are not
>> comparable with the previous measurements. The data were obtained with
>> HARMONY-5900 applied.
>>
>> Harmony's pack200: 43 secs (3.5 Mb/secs)
>> Sun's pack200: 9 secs (16.6 Mb/secs)
>>
>> Profile:
>>
>> 22.0% java.util.HashMap.*
>> 11.4% java.io.FileInputStream.readBytes()
>>  7.5% o.a.h.unpack200.bytecode.ClassConstantPool.addNested()
>>  6.9% java.util.zip.*
>>  5.6% o.a.h.pack200.BHSDCodec.decode()
>>  4.8% java.lang.*
>>  4.4% o.a.h.unpack200.IcBands.getRelevantIcTuples()
>>  3.9% o.a.h.unpack200.bytecode.forms.NoArgumentForm.setByteCodeOperands()
>>  3.2% o.a.h.unpack200.bytecode.ClassConstantPool.* (other)
>>  3.0% o.a.h.unpack200.bytecode.CodeAttribute.*
>>  2.8% java.io.FileOutputStream.writeBytes()
>>  2.8% o.a.h.unpack200.bytecode.ByteCode.*
>>  2.75% java.util.TreeMap.*
>>
>> Note ArrayList is gone!
>> It seems like BHSDCodec.decode(), IcBands.getRelevantIcTuples() and
>> NoArgumentForm.setByteCodeOperands() are the next candidates for tuning.
>> After that, further performance improvement is not possible without deep
>> changes, such as overall algorithmic improvements. Arguably those should
>> come first, but I'm not familiar with the code yet. That won't stop us
>> though ;)
>>
>> Thanks,
>> Aleksey.
>>
>> On Wed, Jul 9, 2008 at 7:27 PM, Sian January <sianjanuary@googlemail.com> wrote:
>>> Thanks for doing that Aleksey.  In fact I think Sun's was 20 or 30 times
>>> faster before we started doing any performance optimizations, but it looks
>>> like there's still some ground that we could make up!
>>>
>>>
>>>
>>> On 08/07/2008, Aleksey Shipilev <aleksey.shipilev@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I took the liberty of profiling the pack200 implementation on an unpacking
>>>> scenario. The source data was taken from the Eclipse JDT jars, repacked into
>>>> a single 60 Mb jar file, then packed with pack200 from Sun's JDK (-E9
>>>> used), resulting in a 20 Mb pack200-compressed file. Then Sun JDK
>>>> 1.6.0_05 (Windows, -server) was used together with hprof (cpu=time) to
>>>> obtain the profile. My patch from HARMONY-5900 is applied. The head of
>>>> the profile looks like this:
>>>>
>>>> 4.76% org.apache.harmony.unpack200.bytecode.ClassConstantPool.addNested
>>>> 4.22% java.util.HashMap.getEntry
>>>> 2.99% java.util.AbstractList$Itr.next
>>>> 2.92% java.util.AbstractList$Itr.hasNext
>>>> 2.84% java.util.ArrayList.get
>>>> 2.43% java.util.AbstractList$Itr.next
>>>> 2.41% java.util.HashMap.containsKey
>>>> 2.15% org.apache.harmony.unpack200.IcBands.getRelevantIcTuples
>>>> 2.00% java.util.HashSet.contains
>>>> 1.57% java.io.DataOutputStream.writeUTF
>>>>
>>>> Composite occupancy:
>>>>
>>>> 18.4% java.util.AbstractList
>>>> 18.0% java.util.HashMap
>>>> 15.8% java.util.ArrayList
>>>> 10.5% o.a.h.unpack200.bytecode.ClassConstantPool.*
>>>> 5.3%  o.a.h.unpack200.bytecode.CPUTF8.* (hashcode mostly)
>>>> 4.5% java.io.*
>>>> 4.5% java.lang.String.*
>>>> 4.4% o.a.h.unpack200.bytecode.ByteCode.*
>>>> 3.9%  o.a.h.unpack200.bytecode.Ic{Tuple|Bands}.*
>>>> 14.7% other
>>>>
>>>> So the main concern is Collections usage. ClassConstantPool uses Lists
>>>> heavily, so I suspect a significant amount of time is spent
>>>> there.
>>>>
>>>> NB:
>>>> Timings for the scenario (lower is better):
>>>> Harmony's pack200: 67 secs
>>>> Sun's pack200: 6 secs
>>>>
>>>> Yep, 10 times faster.
>>>>
>>>> Thanks,
>>>> Aleksey.
>>>>
>>>
>>>
>>>
>>> --
>>> Unless stated otherwise above:
>>> IBM United Kingdom Limited - Registered in England and Wales with number
>>> 741598.
>>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>>>
>>
>
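
The earliest message in the thread points at list-based lookups as the main cost. As a generic illustration of why that pattern shows up in a profile (this is not taken from ClassConstantPool or from any of the patches, and the class and method names are hypothetical), compare de-duplicating entries with `List.contains` against a hash-based set:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration; not code from Harmony's unpack200. */
public class DedupSketch {
    /** Quadratic: every candidate triggers a linear scan of the list built so far. */
    public static List<String> dedupWithList(List<String> entries) {
        List<String> unique = new ArrayList<>();
        for (String entry : entries) {
            if (!unique.contains(entry)) {
                unique.add(entry);
            }
        }
        return unique;
    }

    /** Expected linear time: hash lookups, with insertion order preserved. */
    public static List<String> dedupWithSet(List<String> entries) {
        Set<String> unique = new LinkedHashSet<>(entries);
        return new ArrayList<>(unique);
    }
}
```

Both methods return the same elements in the same order; only the cost of the membership check differs, which is what matters once the collection holds many thousands of entries.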
