harmony-dev mailing list archives

From "Xiao-Feng Li" <xiaofeng...@gmail.com>
Subject Re: [DRLVM][GC] (HARMONY-2398) patch for GCv5 alloc helper inlining
Date Sun, 10 Dec 2006 03:39:21 GMT
In my opinion, allocation rate tuning is actually the simplest part of
GC performance work. The reason is that the allocation code sequence is
almost the same for every bump-pointer allocator. Inlining the
allocation helper is expected to help directly.
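
For readers unfamiliar with the term, here is a minimal sketch of a bump-pointer allocation fast path. It is an illustration, not the GCv5 or gc_cc code: the class and method names, the simulated heap addresses, and the 8-byte alignment are all assumptions made for the example.

```java
// Illustrative sketch only: a bump-pointer allocator over simulated addresses.
// All names (BumpPointerSketch, ThreadLocalBlock, allocate) are hypothetical.
final class BumpPointerSketch {

    /** A per-thread allocation block: [free, ceiling) is the unused space. */
    static final class ThreadLocalBlock {
        long free;          // next free address (simulated as an offset)
        final long ceiling; // one past the last usable address

        ThreadLocalBlock(long start, long size) {
            this.free = start;
            this.ceiling = start + size;
        }
    }

    static final long NULL = -1L; // sentinel meaning "take the slow path"

    /**
     * Fast path: bump the free pointer if the object fits.
     * Returns the object's address, or NULL when the block is exhausted.
     */
    static long allocate(ThreadLocalBlock block, long size) {
        long aligned = (size + 7) & ~7L; // round up to 8 bytes (assumed alignment)
        long result = block.free;
        long newFree = result + aligned;
        if (newFree > block.ceiling) {
            return NULL;                 // slow path: fetch a new block or collect
        }
        block.free = newFree;
        return result;
    }

    public static void main(String[] args) {
        ThreadLocalBlock block = new ThreadLocalBlock(0, 32);
        System.out.println(allocate(block, 12)); // prints 0 (12 rounds up to 16)
        System.out.println(allocate(block, 16)); // prints 16 (fills the block)
        System.out.println(allocate(block, 8));  // prints -1 (block exhausted)
    }
}
```

The fast path is one load of the free pointer, an add, a compare, and a store, which is why the sequence looks nearly identical across collectors and why inlining it into compiled code removes most of the per-allocation call overhead.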

Thanks,
xiaofeng

On 12/10/06, Geir Magnusson Jr. <geir@pobox.com> wrote:
> Nice - any thoughts on where to focus for improved performance?
>
> geir
>
>
> Rana Dasgupta wrote:
> > Since the allocation helper is inlined now, I reran the old allocation rate
> > test (with the default heap size of 256 MB). While gc_gen and gc_cc are in
> > the same ballpark, there is still some way to go to catch up with RI. Log
> > attached.
> >
> >
> >
> >
> > On 12/5/06, Mikhail Fursov <mike.fursov@gmail.com> wrote:
> >>
> >> If you compare allocation performance, the allocation fast-path helper
> >> code is all you need.
> >> And we need to check performance not with microtests but with real
> >> benchmarks. Microtests can hide cache misses in our example.
> >>
> >> On 12/5/06, Ivan Volosyuk <ivan.volosyuk@gmail.com> wrote:
> >> >
> >> > Helper code is equal. GC code is not. Let's compare apples with oranges.
> >> > --
> >> > Ivan
> >> >
> >> > On 12/5/06, Mikhail Fursov <mike.fursov@gmail.com> wrote:
> >> > > The helper code is equal, except for this load. So if we see different
> >> > > performance, this extra memory access is the cause.
> >> > >
> >> > > On 12/5/06, Ivan Volosyuk <ivan.volosyuk@gmail.com> wrote:
> >> > > >
> >> > > > I think in order to do this comparison, other conditions should be
> >> > > > equal. Comparing helper with 1 dependent load in gc_cc and helper
> >> > > > with 2 dependent loads in gc_v5 makes no sense to me.
> >> >
> >>
> >>
> >>
> >> --
> >> Mikhail Fursov
> >>
> >>
> >
> >
> > ------------------------------------------------------------------------
> >
> > gcgen default heapsize 256M
> > =============================
> > Timing 50 million total object allocations
> > Varying number of threads and number of objects retained
> >
> > Timing  1 threads, retaining 64 Objects:
> > 3.625 seconds  210.20689655172413 MB/sec
> > Timing  1 threads, retaining 128 Objects:
> > 3.593 seconds  212.0790425827999 MB/sec
> > Timing  1 threads, retaining 256 Objects:
> > 3.579 seconds  212.90863369656327 MB/sec
> > Timing  1 threads, retaining 512 Objects:
> > 3.578 seconds  212.96813862493013 MB/sec
> > Timing  1 threads, retaining 1024 Objects:
> > 3.578 seconds  212.96813862493013 MB/sec
> > Timing  1 threads, retaining 2048 Objects:
> > 3.578 seconds  212.96813862493013 MB/sec
> > Timing  1 threads, retaining 4096 Objects:
> > 3.688 seconds  206.61605206073753 MB/sec
> > Timing  1 threads, retaining 8192 Objects:
> > 3.687 seconds  206.67209113100083 MB/sec
> > Timing  2 threads, retaining 64 Objects:
> > 5.344 seconds  142.58982035928142 MB/sec
> > Timing  2 threads, retaining 128 Objects:
> > 5.484 seconds  138.94967177242887 MB/sec
> > Timing  2 threads, retaining 256 Objects:
> > 5.485 seconds  138.92433910665451 MB/sec
> > Timing  2 threads, retaining 512 Objects:
> > 5.14 seconds  148.24902723735408 MB/sec
> > Timing  2 threads, retaining 1024 Objects:
> > 5.204 seconds  146.42582628747118 MB/sec
> > Timing  2 threads, retaining 2048 Objects:
> > 5.312 seconds  143.4487951807229 MB/sec
> > Timing  2 threads, retaining 4096 Objects:
> > 5.219 seconds  146.00498179727916 MB/sec
> > Timing  2 threads, retaining 8192 Objects:
> > 5.219 seconds  146.00498179727916 MB/sec
> > Timing  4 threads, retaining 64 Objects:
> > 6.265 seconds  121.62809257781325 MB/sec
> > Timing  4 threads, retaining 128 Objects:
> > 5.672 seconds  134.3441466854725 MB/sec
> > Timing  4 threads, retaining 256 Objects:
> > 5.531 seconds  137.76893870909421 MB/sec
> > Timing  4 threads, retaining 512 Objects:
> > 5.454 seconds  139.71397139713972 MB/sec
> > Timing  4 threads, retaining 1024 Objects:
> > 5.422 seconds  140.53854666174843 MB/sec
> > Timing  4 threads, retaining 2048 Objects:
> > 5.593 seconds  136.24173073484712 MB/sec
> > Timing  4 threads, retaining 4096 Objects:
> > 5.109 seconds  149.14856136230182 MB/sec
> > Timing  4 threads, retaining 8192 Objects:
> > 5.391 seconds  141.34668892598776 MB/sec
> > Timing  8 threads, retaining 64 Objects:
> > 5.594 seconds  136.21737575974257 MB/sec
> > Timing  8 threads, retaining 128 Objects:
> > 5.5 seconds  138.54545454545453 MB/sec
> > Timing  8 threads, retaining 256 Objects:
> > 5.516 seconds  138.14358230601886 MB/sec
> > Timing  8 threads, retaining 512 Objects:
> > 5.515 seconds  138.16863100634635 MB/sec
> > Timing  8 threads, retaining 1024 Objects:
> > 5.5 seconds  138.54545454545453 MB/sec
> > Timing  8 threads, retaining 2048 Objects:
> > 5.438 seconds  140.1250459727841 MB/sec
> > Timing  8 threads, retaining 4096 Objects:
> > 5.547 seconds  137.3715521903732 MB/sec
> > Timing  8 threads, retaining 8192 Objects:
> > 5.89 seconds  129.37181663837012 MB/sec
> > Timing  16 threads, retaining 64 Objects:
> > 5.828 seconds  130.7481125600549 MB/sec
> > Timing  16 threads, retaining 128 Objects:
> > 5.86 seconds  130.03412969283275 MB/sec
> > Timing  16 threads, retaining 256 Objects:
> > 5.859 seconds  130.0563236047107 MB/sec
> > Timing  16 threads, retaining 512 Objects:
> > 5.828 seconds  130.7481125600549 MB/sec
> > Timing  16 threads, retaining 1024 Objects:
> > 5.641 seconds  135.0824321928736 MB/sec
> > Timing  16 threads, retaining 2048 Objects:
> > 5.781 seconds  131.81110534509602 MB/sec
> > Timing  16 threads, retaining 4096 Objects:
> > 5.719 seconds  133.24007693652734 MB/sec
> > Timing  16 threads, retaining 8192 Objects:
> > 5.672 seconds  134.3441466854725 MB/sec
> > Timing  32 threads, retaining 64 Objects:
> > 5.688 seconds  133.9662447257384 MB/sec
> > Timing  32 threads, retaining 128 Objects:
> > 5.656 seconds  134.72418670438472 MB/sec
> > Timing  32 threads, retaining 256 Objects:
> > 5.656 seconds  134.72418670438472 MB/sec
> > Timing  32 threads, retaining 512 Objects:
> > 5.516 seconds  138.14358230601886 MB/sec
> > Timing  32 threads, retaining 1024 Objects:
> > 6.062 seconds  125.70108874958758 MB/sec
> > Timing  32 threads, retaining 2048 Objects:
> > 6.25 seconds  121.92 MB/sec
> > Timing  32 threads, retaining 4096 Objects:
> > 5.672 seconds  134.3441466854725 MB/sec
> > Timing  32 threads, retaining 8192 Objects:
> > 5.859 seconds  130.0563236047107 MB/sec
> > Total: 252.845 seconds
> >
> > gc4.1 default heapsize 256 M
> > ===============================
> > Timing 50 million total object allocations
> > Varying number of threads and number of objects retained
> >
> > Timing  1 threads, retaining 64 Objects:
> > 3.516 seconds  216.7235494880546 MB/sec
> > Timing  1 threads, retaining 128 Objects:
> > 3.484 seconds  218.71412169919634 MB/sec
> > Timing  1 threads, retaining 256 Objects:
> > 3.485 seconds  218.65136298421808 MB/sec
> > Timing  1 threads, retaining 512 Objects:
> > 3.484 seconds  218.71412169919634 MB/sec
> > Timing  1 threads, retaining 1024 Objects:
> > 3.5 seconds  217.71428571428572 MB/sec
> > Timing  1 threads, retaining 2048 Objects:
> > 3.531 seconds  215.80288870008496 MB/sec
> > Timing  1 threads, retaining 4096 Objects:
> > 3.516 seconds  216.7235494880546 MB/sec
> > Timing  1 threads, retaining 8192 Objects:
> > 3.594 seconds  212.02003338898163 MB/sec
> > Timing  2 threads, retaining 64 Objects:
> > 5.547 seconds  137.3715521903732 MB/sec
> > Timing  2 threads, retaining 128 Objects:
> > 5.406 seconds  140.9544950055494 MB/sec
> > Timing  2 threads, retaining 256 Objects:
> > 5.297 seconds  143.85501227109685 MB/sec
> > Timing  2 threads, retaining 512 Objects:
> > 5.687 seconds  133.98980130121328 MB/sec
> > Timing  2 threads, retaining 1024 Objects:
> > 5.282 seconds  144.2635365391897 MB/sec
> > Timing  2 threads, retaining 2048 Objects:
> > 5.593 seconds  136.24173073484712 MB/sec
> > Timing  2 threads, retaining 4096 Objects:
> > 5.032 seconds  151.4308426073132 MB/sec
> > Timing  2 threads, retaining 8192 Objects:
> > 5.765 seconds  132.17692974848222 MB/sec
> > Timing  4 threads, retaining 64 Objects:
> > 5.703 seconds  133.61388742766965 MB/sec
> > Timing  4 threads, retaining 128 Objects:
> > 5.375 seconds  141.7674418604651 MB/sec
> > Timing  4 threads, retaining 256 Objects:
> > 5.422 seconds  140.53854666174843 MB/sec
> > Timing  4 threads, retaining 512 Objects:
> > 5.532 seconds  137.74403470715836 MB/sec
> > Timing  4 threads, retaining 1024 Objects:
> > 5.375 seconds  141.7674418604651 MB/sec
> > Timing  4 threads, retaining 2048 Objects:
> > 5.359 seconds  142.19070722149655 MB/sec
> > Timing  4 threads, retaining 4096 Objects:
> > 5.531 seconds  137.76893870909421 MB/sec
> > Timing  4 threads, retaining 8192 Objects:
> > 5.422 seconds  140.53854666174843 MB/sec
> > Timing  8 threads, retaining 64 Objects:
> > 5.985 seconds  127.31829573934836 MB/sec
> > Timing  8 threads, retaining 128 Objects:
> > 6.406 seconds  118.95098345301281 MB/sec
> > Timing  8 threads, retaining 256 Objects:
> > 5.828 seconds  130.7481125600549 MB/sec
> > Timing  8 threads, retaining 512 Objects:
> > 5.61 seconds  135.82887700534758 MB/sec
> > Timing  8 threads, retaining 1024 Objects:
> > 5.593 seconds  136.24173073484712 MB/sec
> > Timing  8 threads, retaining 2048 Objects:
> > 5.625 seconds  135.46666666666667 MB/sec
> > Timing  8 threads, retaining 4096 Objects:
> > 5.625 seconds  135.46666666666667 MB/sec
> > Timing  8 threads, retaining 8192 Objects:
> > 5.625 seconds  135.46666666666667 MB/sec
> > Timing  16 threads, retaining 64 Objects:
> > 5.954 seconds  127.9811891165603 MB/sec
> > Timing  16 threads, retaining 128 Objects:
> > 5.625 seconds  135.46666666666667 MB/sec
> > Timing  16 threads, retaining 256 Objects:
> > 5.437 seconds  140.15081846606583 MB/sec
> > Timing  16 threads, retaining 512 Objects:
> > 5.438 seconds  140.1250459727841 MB/sec
> > Timing  16 threads, retaining 1024 Objects:
> > 5.719 seconds  133.24007693652734 MB/sec
> > Timing  16 threads, retaining 2048 Objects:
> > 5.953 seconds  128.00268772047707 MB/sec
> > Timing  16 threads, retaining 4096 Objects:
> > 5.422 seconds  140.53854666174843 MB/sec
> > Timing  16 threads, retaining 8192 Objects:
> > 5.484 seconds  138.94967177242887 MB/sec
> > Timing  32 threads, retaining 64 Objects:
> > 5.484 seconds  138.94967177242887 MB/sec
> > Timing  32 threads, retaining 128 Objects:
> > 5.563 seconds  136.97645155491642 MB/sec
> > Timing  32 threads, retaining 256 Objects:
> > 5.469 seconds  139.33077345035656 MB/sec
> > Timing  32 threads, retaining 512 Objects:
> > 5.422 seconds  140.53854666174843 MB/sec
> > Timing  32 threads, retaining 1024 Objects:
> > 5.422 seconds  140.53854666174843 MB/sec
> > Timing  32 threads, retaining 2048 Objects:
> > 5.406 seconds  140.9544950055494 MB/sec
> > Timing  32 threads, retaining 4096 Objects:
> > 5.391 seconds  141.34668892598776 MB/sec
> > Timing  32 threads, retaining 8192 Objects:
> > 5.563 seconds  136.97645155491642 MB/sec
> > Total: 250.502 seconds
> >
> >
> > RI heapsize 256M
> > =====================
> > Timing 50 million total object allocations
> > Varying number of threads and number of objects retained
> >
> > Timing  1 threads, retaining 64 Objects:
> > 0.922 seconds  826.4642082429501 MB/sec
> > Timing  1 threads, retaining 128 Objects:
> > 0.906 seconds  841.0596026490066 MB/sec
> > Timing  1 threads, retaining 256 Objects:
> > 0.938 seconds  812.3667377398721 MB/sec
> > Timing  1 threads, retaining 512 Objects:
> > 0.953 seconds  799.5802728226653 MB/sec
> > Timing  1 threads, retaining 1024 Objects:
> > 1.031 seconds  739.0882638215326 MB/sec
> > Timing  1 threads, retaining 2048 Objects:
> > 1.172 seconds  650.1706484641638 MB/sec
> > Timing  1 threads, retaining 4096 Objects:
> > 1.422 seconds  535.8649789029536 MB/sec
> > Timing  1 threads, retaining 8192 Objects:
> > 3.0 seconds  254.0 MB/sec
> > Timing  2 threads, retaining 64 Objects:
> > 1.047 seconds  727.7936962750716 MB/sec
> > Timing  2 threads, retaining 128 Objects:
> > 1.015 seconds  750.7389162561577 MB/sec
> > Timing  2 threads, retaining 256 Objects:
> > 1.031 seconds  739.0882638215326 MB/sec
> > Timing  2 threads, retaining 512 Objects:
> > 1.079 seconds  706.2094531974051 MB/sec
> > Timing  2 threads, retaining 1024 Objects:
> > 1.156 seconds  659.1695501730104 MB/sec
> > Timing  2 threads, retaining 2048 Objects:
> > 1.265 seconds  602.3715415019764 MB/sec
> > Timing  2 threads, retaining 4096 Objects:
> > 1.344 seconds  566.9642857142857 MB/sec
> > Timing  2 threads, retaining 8192 Objects:
> > 2.797 seconds  272.43475151948513 MB/sec
> > Timing  4 threads, retaining 64 Objects:
> > 1.047 seconds  727.7936962750716 MB/sec
> > Timing  4 threads, retaining 128 Objects:
> > 1.062 seconds  717.5141242937852 MB/sec
> > Timing  4 threads, retaining 256 Objects:
> > 1.156 seconds  659.1695501730104 MB/sec
> > Timing  4 threads, retaining 512 Objects:
> > 1.125 seconds  677.3333333333334 MB/sec
> > Timing  4 threads, retaining 1024 Objects:
> > 1.141 seconds  667.8352322524102 MB/sec
> > Timing  4 threads, retaining 2048 Objects:
> > 1.281 seconds  594.8477751756441 MB/sec
> > Timing  4 threads, retaining 4096 Objects:
> > 1.328 seconds  573.7951807228916 MB/sec
> > Timing  4 threads, retaining 8192 Objects:
> > 1.563 seconds  487.5239923224568 MB/sec
> > Timing  8 threads, retaining 64 Objects:
> > 1.187 seconds  641.9545071609098 MB/sec
> > Timing  8 threads, retaining 128 Objects:
> > 1.188 seconds  641.4141414141415 MB/sec
> > Timing  8 threads, retaining 256 Objects:
> > 1.156 seconds  659.1695501730104 MB/sec
> > Timing  8 threads, retaining 512 Objects:
> > 1.156 seconds  659.1695501730104 MB/sec
> > Timing  8 threads, retaining 1024 Objects:
> > 1.109 seconds  687.1055004508567 MB/sec
> > Timing  8 threads, retaining 2048 Objects:
> > 1.313 seconds  580.3503427265804 MB/sec
> > Timing  8 threads, retaining 4096 Objects:
> > 1.359 seconds  560.7064017660044 MB/sec
> > Timing  8 threads, retaining 8192 Objects:
> > 1.407 seconds  541.5778251599147 MB/sec
> > Timing  16 threads, retaining 64 Objects:
> > 1.343 seconds  567.3864482501862 MB/sec
> > Timing  16 threads, retaining 128 Objects:
> > 1.282 seconds  594.383775351014 MB/sec
> > Timing  16 threads, retaining 256 Objects:
> > 1.25 seconds  609.6 MB/sec
> > Timing  16 threads, retaining 512 Objects:
> > 1.203 seconds  633.4164588528678 MB/sec
> > Timing  16 threads, retaining 1024 Objects:
> > 1.219 seconds  625.1025430680886 MB/sec
> > Timing  16 threads, retaining 2048 Objects:
> > 1.171 seconds  650.7258753202391 MB/sec
> > Timing  16 threads, retaining 4096 Objects:
> > 1.297 seconds  587.5096376252892 MB/sec
> > Timing  16 threads, retaining 8192 Objects:
> > 1.344 seconds  566.9642857142857 MB/sec
> > Timing  32 threads, retaining 64 Objects:
> > 1.438 seconds  529.90264255911 MB/sec
> > Timing  32 threads, retaining 128 Objects:
> > 1.609 seconds  473.586078309509 MB/sec
> > Timing  32 threads, retaining 256 Objects:
> > 1.469 seconds  518.720217835262 MB/sec
> > Timing  32 threads, retaining 512 Objects:
> > 1.437 seconds  530.2713987473903 MB/sec
> > Timing  32 threads, retaining 1024 Objects:
> > 1.266 seconds  601.8957345971564 MB/sec
> > Timing  32 threads, retaining 2048 Objects:
> > 1.282 seconds  594.383775351014 MB/sec
> > Timing  32 threads, retaining 4096 Objects:
> > 1.344 seconds  566.9642857142857 MB/sec
> > Timing  32 threads, retaining 8192 Objects:
> > 1.312 seconds  580.7926829268292 MB/sec
> > Total: 61.969 seconds
> >
> >
> >
>
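
A note on reading the attached log: on every row I checked, seconds multiplied by MB/sec gives 762, so the MB/sec column appears to be a fixed 762 MB of allocation divided by elapsed time. That constant is inferred from the numbers, not taken from the test source, and the class and method names in the sketch below are hypothetical.

```java
// Illustrative sketch: reproduces the MB/sec column and compares run totals.
final class AllocRateSketch {

    // Inferred from the log (seconds * MB/sec == 762 on sampled rows);
    // an assumption, not a value read from the benchmark's source code.
    static final double TOTAL_MB = 762.0;

    /** Throughput for one timing row, in MB/sec. */
    static double mbPerSec(double seconds) {
        return TOTAL_MB / seconds;
    }

    /** How many times longer the slower run took than the faster one. */
    static double ratio(double slowerTotalSec, double fasterTotalSec) {
        return slowerTotalSec / fasterTotalSec;
    }

    public static void main(String[] args) {
        // Rows from the log: gcgen 1 thread/64 objects, RI 1 thread/8192 objects.
        System.out.println(mbPerSec(3.625));
        System.out.println(mbPerSec(3.0));

        // Totals from the log: gcgen 252.845 s, gc4.1 250.502 s, RI 61.969 s.
        System.out.println(ratio(252.845, 61.969));
        System.out.println(ratio(250.502, 61.969));
    }
}
```

By this measure both Harmony collectors take roughly four times as long as RI over the whole run, which matches the "still some way to go" remark above.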
