Message-ID: <457B1B8D.8080605@pobox.com>
Date: Sat, 09 Dec 2006 15:24:45 -0500
From: "Geir Magnusson Jr."
Reply-To: geir@pobox.com
To: dev@harmony.apache.org
Subject: Re: [DRLVM][GC] (HARMONY-2398) patch for GCv5 alloc helper inlining

Nice - any thoughts on where to focus for improved performance?

geir

Rana Dasgupta wrote:
> Since the allocation helper is inlined now, I reran the old allocation rate
> test( with the default heapsize 256 M ) ...while gc_gen and gc_cc are in the
> same ballpark, there is still some way to go to catch up with RI. Log
> attached.
>
> On 12/5/06, Mikhail Fursov wrote:
>>
>> If you compare performance of allocation - allocation fast path helper
>> code is all you need.
>> And we need to check performance not with microtests, but use real
>> benchmarks. Microtests can hide cache misses in our example.
>>
>> On 12/5/06, Ivan Volosyuk wrote:
>> >
>> > Helper code is equal. GC code is not. Lets compare apples with oranges.
>> > --
>> > Ivan
>> >
>> > On 12/5/06, Mikhail Fursov wrote:
>> > > The helpers code is equal, except this load. So if we have different
>> > > performance -> this extra memory access is the cause.
>> > >
>> > > On 12/5/06, Ivan Volosyuk wrote:
>> > > >
>> > > > I think in order to do this comparison, other conditions should be
>> > > > equal. Comparing helper with 1 dependent load in gc_cc and helper
>> > > > with 2 dependent loads in gc_v5 makes no sense to me.
>>
>> --
>> Mikhail Fursov
>
> ------------------------------------------------------------------------
>
> gcgen default heapsize 256M
> =============================
> Timing 50 million total object allocations
> Varying number of threads and number of objects retained
>
> Timing 1 threads, retaining 64 Objects:
> 3.625 seconds 210.20689655172413 MB/sec
> Timing 1 threads, retaining 128 Objects:
> 3.593 seconds 212.0790425827999 MB/sec
> Timing 1 threads, retaining 256 Objects:
> 3.579 seconds 212.90863369656327 MB/sec
> Timing 1 threads, retaining 512 Objects:
> 3.578 seconds 212.96813862493013 MB/sec
> Timing 1 threads, retaining 1024 Objects:
> 3.578 seconds 212.96813862493013 MB/sec
> Timing 1 threads, retaining 2048 Objects:
> 3.578 seconds 212.96813862493013 MB/sec
> Timing 1 threads, retaining 4096 Objects:
> 3.688 seconds 206.61605206073753 MB/sec
> Timing 1 threads, retaining 8192 Objects:
> 3.687 seconds 206.67209113100083 MB/sec
> Timing 2 threads, retaining 64 Objects:
> 5.344 seconds 142.58982035928142 MB/sec
> Timing 2 threads, retaining 128 Objects:
> 5.484 seconds 138.94967177242887 MB/sec
> Timing 2 threads, retaining 256 Objects:
> 5.485 seconds 138.92433910665451 MB/sec
> Timing 2 threads, retaining 512 Objects:
> 5.14 seconds 148.24902723735408 MB/sec
> Timing 2 threads, retaining 1024 Objects:
> 5.204 seconds 146.42582628747118 MB/sec
> Timing 2 threads, retaining 2048 Objects:
> 5.312 seconds 143.4487951807229 MB/sec
> Timing 2 threads, retaining 4096 Objects:
> 5.219 seconds 146.00498179727916 MB/sec
> Timing 2 threads, retaining 8192 Objects:
> 5.219 seconds 146.00498179727916 MB/sec
> Timing 4 threads, retaining 64 Objects:
> 6.265 seconds 121.62809257781325 MB/sec
> Timing 4 threads, retaining 128 Objects:
> 5.672 seconds 134.3441466854725 MB/sec
> Timing 4 threads, retaining 256 Objects:
> 5.531 seconds 137.76893870909421 MB/sec
> Timing 4 threads, retaining 512 Objects:
> 5.454 seconds 139.71397139713972 MB/sec
> Timing 4 threads, retaining 1024 Objects:
> 5.422 seconds 140.53854666174843 MB/sec
> Timing 4 threads, retaining 2048 Objects:
> 5.593 seconds 136.24173073484712 MB/sec
> Timing 4 threads, retaining 4096 Objects:
> 5.109 seconds 149.14856136230182 MB/sec
> Timing 4 threads, retaining 8192 Objects:
> 5.391 seconds 141.34668892598776 MB/sec
> Timing 8 threads, retaining 64 Objects:
> 5.594 seconds 136.21737575974257 MB/sec
> Timing 8 threads, retaining 128 Objects:
> 5.5 seconds 138.54545454545453 MB/sec
> Timing 8 threads, retaining 256 Objects:
> 5.516 seconds 138.14358230601886 MB/sec
> Timing 8 threads, retaining 512 Objects:
> 5.515 seconds 138.16863100634635 MB/sec
> Timing 8 threads, retaining 1024 Objects:
> 5.5 seconds 138.54545454545453 MB/sec
> Timing 8 threads, retaining 2048 Objects:
> 5.438 seconds 140.1250459727841 MB/sec
> Timing 8 threads, retaining 4096 Objects:
> 5.547 seconds 137.3715521903732 MB/sec
> Timing 8 threads, retaining 8192 Objects:
> 5.89 seconds 129.37181663837012 MB/sec
> Timing 16 threads, retaining 64 Objects:
> 5.828 seconds 130.7481125600549 MB/sec
> Timing 16 threads, retaining 128 Objects:
> 5.86 seconds 130.03412969283275 MB/sec
> Timing 16 threads, retaining 256 Objects:
> 5.859 seconds 130.0563236047107 MB/sec
> Timing 16 threads, retaining 512 Objects:
> 5.828 seconds 130.7481125600549 MB/sec
> Timing 16 threads, retaining 1024 Objects:
> 5.641 seconds 135.0824321928736 MB/sec
> Timing 16 threads, retaining 2048 Objects:
> 5.781 seconds 131.81110534509602 MB/sec
> Timing 16 threads, retaining 4096 Objects:
> 5.719 seconds 133.24007693652734 MB/sec
> Timing 16 threads, retaining 8192 Objects:
> 5.672 seconds 134.3441466854725 MB/sec
> Timing 32 threads, retaining 64 Objects:
> 5.688 seconds 133.9662447257384 MB/sec
> Timing 32 threads, retaining 128 Objects:
> 5.656 seconds 134.72418670438472 MB/sec
> Timing 32 threads, retaining 256 Objects:
> 5.656 seconds 134.72418670438472 MB/sec
> Timing 32 threads, retaining 512 Objects:
> 5.516 seconds 138.14358230601886 MB/sec
> Timing 32 threads, retaining 1024 Objects:
> 6.062 seconds 125.70108874958758 MB/sec
> Timing 32 threads, retaining 2048 Objects:
> 6.25 seconds 121.92 MB/sec
> Timing 32 threads, retaining 4096 Objects:
> 5.672 seconds 134.3441466854725 MB/sec
> Timing 32 threads, retaining 8192 Objects:
> 5.859 seconds 130.0563236047107 MB/sec
> Total: 252.845 seconds
>
> gc4.1 default heapsize 256 M
> ===============================
> Timing 50 million total object allocations
> Varying number of threads and number of objects retained
>
> Timing 1 threads, retaining 64 Objects:
> 3.516 seconds 216.7235494880546 MB/sec
> Timing 1 threads, retaining 128 Objects:
> 3.484 seconds 218.71412169919634 MB/sec
> Timing 1 threads, retaining 256 Objects:
> 3.485 seconds 218.65136298421808 MB/sec
> Timing 1 threads, retaining 512 Objects:
> 3.484 seconds 218.71412169919634 MB/sec
> Timing 1 threads, retaining 1024 Objects:
> 3.5 seconds 217.71428571428572 MB/sec
> Timing 1 threads, retaining 2048 Objects:
> 3.531 seconds 215.80288870008496 MB/sec
> Timing 1 threads, retaining 4096 Objects:
> 3.516 seconds 216.7235494880546 MB/sec
> Timing 1 threads, retaining 8192 Objects:
> 3.594 seconds 212.02003338898163 MB/sec
> Timing 2 threads, retaining 64 Objects:
> 5.547 seconds 137.3715521903732 MB/sec
> Timing 2 threads, retaining 128 Objects:
> 5.406 seconds 140.9544950055494 MB/sec
> Timing 2 threads, retaining 256 Objects:
> 5.297 seconds 143.85501227109685 MB/sec
> Timing 2 threads, retaining 512 Objects:
> 5.687 seconds 133.98980130121328 MB/sec
> Timing 2 threads, retaining 1024 Objects:
> 5.282 seconds 144.2635365391897 MB/sec
> Timing 2 threads, retaining 2048 Objects:
> 5.593 seconds 136.24173073484712 MB/sec
> Timing 2 threads, retaining 4096 Objects:
> 5.032 seconds 151.4308426073132 MB/sec
> Timing 2 threads, retaining 8192 Objects:
> 5.765 seconds 132.17692974848222 MB/sec
> Timing 4 threads, retaining 64 Objects:
> 5.703 seconds 133.61388742766965 MB/sec
> Timing 4 threads, retaining 128 Objects:
> 5.375 seconds 141.7674418604651 MB/sec
> Timing 4 threads, retaining 256 Objects:
> 5.422 seconds 140.53854666174843 MB/sec
> Timing 4 threads, retaining 512 Objects:
> 5.532 seconds 137.74403470715836 MB/sec
> Timing 4 threads, retaining 1024 Objects:
> 5.375 seconds 141.7674418604651 MB/sec
> Timing 4 threads, retaining 2048 Objects:
> 5.359 seconds 142.19070722149655 MB/sec
> Timing 4 threads, retaining 4096 Objects:
> 5.531 seconds 137.76893870909421 MB/sec
> Timing 4 threads, retaining 8192 Objects:
> 5.422 seconds 140.53854666174843 MB/sec
> Timing 8 threads, retaining 64 Objects:
> 5.985 seconds 127.31829573934836 MB/sec
> Timing 8 threads, retaining 128 Objects:
> 6.406 seconds 118.95098345301281 MB/sec
> Timing 8 threads, retaining 256 Objects:
> 5.828 seconds 130.7481125600549 MB/sec
> Timing 8 threads, retaining 512 Objects:
> 5.61 seconds 135.82887700534758 MB/sec
> Timing 8 threads, retaining 1024 Objects:
> 5.593 seconds 136.24173073484712 MB/sec
> Timing 8 threads, retaining 2048 Objects:
> 5.625 seconds 135.46666666666667 MB/sec
> Timing 8 threads, retaining 4096 Objects:
> 5.625 seconds 135.46666666666667 MB/sec
> Timing 8 threads, retaining 8192 Objects:
> 5.625 seconds 135.46666666666667 MB/sec
> Timing 16 threads, retaining 64 Objects:
> 5.954 seconds 127.9811891165603 MB/sec
> Timing 16 threads, retaining 128 Objects:
> 5.625 seconds 135.46666666666667 MB/sec
> Timing 16 threads, retaining 256 Objects:
> 5.437 seconds 140.15081846606583 MB/sec
> Timing 16 threads, retaining 512 Objects:
> 5.438 seconds 140.1250459727841 MB/sec
> Timing 16 threads, retaining 1024 Objects:
> 5.719 seconds 133.24007693652734 MB/sec
> Timing 16 threads, retaining 2048 Objects:
> 5.953 seconds 128.00268772047707 MB/sec
> Timing 16 threads, retaining 4096 Objects:
> 5.422 seconds 140.53854666174843 MB/sec
> Timing 16 threads, retaining 8192 Objects:
> 5.484 seconds 138.94967177242887 MB/sec
> Timing 32 threads, retaining 64 Objects:
> 5.484 seconds 138.94967177242887 MB/sec
> Timing 32 threads, retaining 128 Objects:
> 5.563 seconds 136.97645155491642 MB/sec
> Timing 32 threads, retaining 256 Objects:
> 5.469 seconds 139.33077345035656 MB/sec
> Timing 32 threads, retaining 512 Objects:
> 5.422 seconds 140.53854666174843 MB/sec
> Timing 32 threads, retaining 1024 Objects:
> 5.422 seconds 140.53854666174843 MB/sec
> Timing 32 threads, retaining 2048 Objects:
> 5.406 seconds 140.9544950055494 MB/sec
> Timing 32 threads, retaining 4096 Objects:
> 5.391 seconds 141.34668892598776 MB/sec
> Timing 32 threads, retaining 8192 Objects:
> 5.563 seconds 136.97645155491642 MB/sec
> Total: 250.502 seconds
>
> RI heapsize 256M
> =====================
> Timing 50 million total object allocations
> Varying number of threads and number of objects retained
>
> Timing 1 threads, retaining 64 Objects:
> 0.922 seconds 826.4642082429501 MB/sec
> Timing 1 threads, retaining 128 Objects:
> 0.906 seconds 841.0596026490066 MB/sec
> Timing 1 threads, retaining 256 Objects:
> 0.938 seconds 812.3667377398721 MB/sec
> Timing 1 threads, retaining 512 Objects:
> 0.953 seconds 799.5802728226653 MB/sec
> Timing 1 threads, retaining 1024 Objects:
> 1.031 seconds 739.0882638215326 MB/sec
> Timing 1 threads, retaining 2048 Objects:
> 1.172 seconds 650.1706484641638 MB/sec
> Timing 1 threads, retaining 4096 Objects:
> 1.422 seconds 535.8649789029536 MB/sec
> Timing 1 threads, retaining 8192 Objects:
> 3.0 seconds 254.0 MB/sec
> Timing 2 threads, retaining 64 Objects:
> 1.047 seconds 727.7936962750716 MB/sec
> Timing 2 threads, retaining 128 Objects:
> 1.015 seconds 750.7389162561577 MB/sec
> Timing 2 threads, retaining 256 Objects:
> 1.031 seconds 739.0882638215326 MB/sec
> Timing 2 threads, retaining 512 Objects:
> 1.079 seconds 706.2094531974051 MB/sec
> Timing 2 threads, retaining 1024 Objects:
> 1.156 seconds 659.1695501730104 MB/sec
> Timing 2 threads, retaining 2048 Objects:
> 1.265 seconds 602.3715415019764 MB/sec
> Timing 2 threads, retaining 4096 Objects:
> 1.344 seconds 566.9642857142857 MB/sec
> Timing 2 threads, retaining 8192 Objects:
> 2.797 seconds 272.43475151948513 MB/sec
> Timing 4 threads, retaining 64 Objects:
> 1.047 seconds 727.7936962750716 MB/sec
> Timing 4 threads, retaining 128 Objects:
> 1.062 seconds 717.5141242937852 MB/sec
> Timing 4 threads, retaining 256 Objects:
> 1.156 seconds 659.1695501730104 MB/sec
> Timing 4 threads, retaining 512 Objects:
> 1.125 seconds 677.3333333333334 MB/sec
> Timing 4 threads, retaining 1024 Objects:
> 1.141 seconds 667.8352322524102 MB/sec
> Timing 4 threads, retaining 2048 Objects:
> 1.281 seconds 594.8477751756441 MB/sec
> Timing 4 threads, retaining 4096 Objects:
> 1.328 seconds 573.7951807228916 MB/sec
> Timing 4 threads, retaining 8192 Objects:
> 1.563 seconds 487.5239923224568 MB/sec
> Timing 8 threads, retaining 64 Objects:
> 1.187 seconds 641.9545071609098 MB/sec
> Timing 8 threads, retaining 128 Objects:
> 1.188 seconds 641.4141414141415 MB/sec
> Timing 8 threads, retaining 256 Objects:
> 1.156 seconds 659.1695501730104 MB/sec
> Timing 8 threads, retaining 512 Objects:
> 1.156 seconds 659.1695501730104 MB/sec
> Timing 8 threads, retaining 1024 Objects:
> 1.109 seconds 687.1055004508567 MB/sec
> Timing 8 threads, retaining 2048 Objects:
> 1.313 seconds 580.3503427265804 MB/sec
> Timing 8 threads, retaining 4096 Objects:
> 1.359 seconds 560.7064017660044 MB/sec
> Timing 8 threads, retaining 8192 Objects:
> 1.407 seconds 541.5778251599147 MB/sec
> Timing 16 threads, retaining 64 Objects:
> 1.343 seconds 567.3864482501862 MB/sec
> Timing 16 threads, retaining 128 Objects:
> 1.282 seconds 594.383775351014 MB/sec
> Timing 16 threads, retaining 256 Objects:
> 1.25 seconds 609.6 MB/sec
> Timing 16 threads, retaining 512 Objects:
> 1.203 seconds 633.4164588528678 MB/sec
> Timing 16 threads, retaining 1024 Objects:
> 1.219 seconds 625.1025430680886 MB/sec
> Timing 16 threads, retaining 2048 Objects:
> 1.171 seconds 650.7258753202391 MB/sec
> Timing 16 threads, retaining 4096 Objects:
> 1.297 seconds 587.5096376252892 MB/sec
> Timing 16 threads, retaining 8192 Objects:
> 1.344 seconds 566.9642857142857 MB/sec
> Timing 32 threads, retaining 64 Objects:
> 1.438 seconds 529.90264255911 MB/sec
> Timing 32 threads, retaining 128 Objects:
> 1.609 seconds 473.586078309509 MB/sec
> Timing 32 threads, retaining 256 Objects:
> 1.469 seconds 518.720217835262 MB/sec
> Timing 32 threads, retaining 512 Objects:
> 1.437 seconds 530.2713987473903 MB/sec
> Timing 32 threads, retaining 1024 Objects:
> 1.266 seconds 601.8957345971564 MB/sec
> Timing 32 threads, retaining 2048 Objects:
> 1.282 seconds 594.383775351014 MB/sec
> Timing 32 threads, retaining 4096 Objects:
> 1.344 seconds 566.9642857142857 MB/sec
> Timing 32 threads, retaining 8192 Objects:
> 1.312 seconds 580.7926829268292 MB/sec
> Total: 61.969 seconds
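A quick way to read the log above, sketched in Python. In every row, seconds times MB/sec appears to come out at 762, which would be consistent with 50 million 16-byte objects (800,000,000 bytes, truncated to whole MiB). The 16-byte object size is an inference from the numbers, not something the log states, and the names `SAMPLES`, `TOTALS`, `implied_megabytes`, and `slowdown_vs_ri` are made up for this illustration. The sample rows and totals are copied from the log.

```python
# A few rows copied from the log: (run, threads, retained, seconds, MB/sec).
SAMPLES = [
    ("gcgen", 1, 64, 3.625, 210.20689655172413),
    ("gcgen", 32, 2048, 6.25, 121.92),
    ("gc4.1", 1, 64, 3.516, 216.7235494880546),
    ("RI", 1, 8192, 3.0, 254.0),
    ("RI", 16, 256, 1.25, 609.6),
]

# "Total:" lines reported at the end of each run, in seconds.
TOTALS = {"gcgen": 252.845, "gc4.1": 250.502, "RI": 61.969}

# Assumption (inferred, not stated in the log): each object is 16 bytes.
OBJECTS = 50_000_000
ASSUMED_OBJECT_BYTES = 16
TOTAL_MB = OBJECTS * ASSUMED_OBJECT_BYTES // (1024 * 1024)  # whole MiB


def implied_megabytes(seconds, mb_per_sec):
    """Volume of data a row implies: elapsed time times reported rate."""
    return seconds * mb_per_sec


def slowdown_vs_ri(totals):
    """Ratio of each run's total time to the RI total."""
    return {name: t / totals["RI"] for name, t in totals.items() if name != "RI"}


if __name__ == "__main__":
    for run, threads, retained, seconds, rate in SAMPLES:
        volume = implied_megabytes(seconds, rate)
        print(f"{run:6s} {threads:2d} threads, {retained:4d} retained: {volume:.3f} MB")
    for run, ratio in slowdown_vs_ri(TOTALS).items():
        print(f"{run}: {ratio:.2f}x the RI total time")
```

On the reported totals, both Harmony collectors take roughly four times as long as RI, and the gap is already present in the single-threaded rows, so it is not only a scaling effect.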