Message-ID: <45422699.3050709@pobox.com>
Date: Fri, 27 Oct 2006 11:32:41 -0400
From: "Geir Magnusson Jr."
Reply-To: geir@pobox.com
User-Agent: Thunderbird 1.5.0.7 (Macintosh/20060909)
MIME-Version: 1.0
To: harmony-dev@incubator.apache.org
Subject: Re: [classlib][performance] performance improvement for luni and nio_char modules - Harmony-1980
References: <4540F742.6010303@pobox.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Virus-Checked: Checked by ClamAV on apache.org

Vladimir Strigun wrote:
> Mikhail,
>
> It was a pretty old build. Now I'm gathering info for the current DRLVM
> (antlr, eclipse, xalan are still not included).
> I've executed every benchmark 10 times and the result is the geometric
> mean of the last 5 executions.
> Machine: P4, 3 GHz, 1 GB RAM
> Build1 = current Harmony build, svn = r468353 (Oct 27 2006),
> Windows/ia32/msvc 1310, release build
> Build2 = Build1+Harmony-1980
> RI: jdk1.5.0_06
>
> Arguments for DRLVM: -Xem:server -Xms700m -Xmx700m
> Arguments for Sun: -XX:+AggressiveHeap -XX:+UseBiasedLocking
> -XX:+UseParallelGC -XX:ParallelGCThreads=4 -Xss64k -Xms700m -Xmx700m
>
> Results for small input:
>            Build1       Build2       RI
> bloat      1014,371     1024,618     968,976
> chart      1427,912     1186,959     956,125
> fop        243,426      244,317      171,701
> hsqldb     330,856      324,493      549,55
> jython     1092,869     1102,331     568,088
> lusearch   1999,63      1971,813     1830,707
> luindex    421,703      225,073      594,78
> pmd        27,332       26,981       53,319
>
> Average    482,5168816  434,5997662  481,3767025
>
> Here we can see that DRLVM is a little bit faster, but the
> recommendations for Dacapo say that the small workload is for testing
> and recommend "either reporting default or large in any performance
> analysis".
>
> Default input:
>
>            Build1       Build2       RI
> bloat      17155,441    17131,63     13718,637
> chart      13342,101    10924,038    9755,926
> fop        2621,146     2584,326     2353,304
> hsqldb     3153,212     3101,691     5737,304
> jython     16240,515    15632,52     8299,957
> lusearch   16280,762    16255,764    13518,751
> luindex    12420,638    10730,491    15782,563
> pmd        11027,172    11136,656    9689,841
>
> Average    9538,259502  9063,946046  8638,4136
>
> So, for default input we are 5-10% slower.
> I'll provide the results for large input as soon as the performance run
> has completed.

I know that I'm going to be an annoying broken record here, filling up
people's mailboxes, but I'll say it again - that's mighty impressive.
I'll take within 20% of Sun at this point in our project's life any day
of the week.

(Of course, world-class performance - as measured by SPECjbb - is
currently held by IBM's J9 on Woodcrest, so that's probably the stretch
target ;)

geir

>
> Thanks,
> Vladimir.
>
> On 10/27/06, Mikhail Fursov wrote:
>> Vladimir,
>> +1 more question: between TM integration and HARMONY-1942 incorrect
>> behaviour of BBP could significantly slow down the execution.
>> Did you do your measurements with Harmony-1942 applied?
>>
>> On 10/27/06, Vladimir Strigun wrote:
>> >
>> > Mikhail,
>> >
>> > Not yet. As I mentioned in the thread I'm still working on Dacapo.
>> > I'll let you know if I find any improvements for JIT.
>> >
>> > Thanks,
>> > Vladimir.
>> >
>> > On 10/27/06, Mikhail Fursov wrote:
>> > > Vladimir,
>> > > I see you removed some arraycopy operations in your patch as not
>> > > effective.
>> > > I'm OK with your solution but want to know if JIT could solve the
>> > > problem by generating more effective code. Do you have any
>> > > suggestions for JIT here?
>> > >
>> > > On 10/27/06, Geir Magnusson Jr. wrote:
>> > > >
>> > > > 10%-15%? That's amazing. How fast are we (DRLVM) compared to
>> > > > Sun 1.5 using Dacapo?
>> > > >
>> > > > geir
>> > > >
>> > > >
>> > > > Vladimir Strigun wrote:
>> > > > > The optimization covers the following issues:
>> > > > > - java.nio.charset.CharsetDecoder and java.nio.charset.CharsetEncoder
>> > > > >   Streaming decoding/encoding was removed. Analysis of API hotspots
>> > > > >   for Dacapo shows that CharsetDecoder is used frequently in almost
>> > > > >   all benchmarks, especially in chart. We already discussed the
>> > > > >   advantages of streaming decoding, but the fix shows a significant
>> > > > >   performance improvement on average for all Dacapo benchmarks. For
>> > > > >   instance, the boost for the chart benchmark is about 16%. Paulex,
>> > > > >   you recently worked in the nio_char module and, if I remember
>> > > > >   correctly, you introduced the streaming operations, so could you
>> > > > >   please review the changes and let me know?
>> > > > >   Since the streaming operation was removed, the tests have been
>> > > > >   slightly modified as well (the previous version of the tests fails
>> > > > >   on RI).
>> > > > > - java.io.BufferedReader
>> > > > >   The readLine() method was slightly modified. An additional check
>> > > > >   for whether some characters are available in the cached buffer was
>> > > > >   added prior to the main loop.
>> > > > > - java.io.InputStreamReader
>> > > > >   The cached char buffer was removed, and the read() and
>> > > > >   read(char[], int, int) methods were rewritten. The current
>> > > > >   implementation of read(char[], int, int) uses several invocations
>> > > > >   of System.arraycopy. The proposed solution wraps the char[]
>> > > > >   argument in a char buffer and therefore doesn't use arraycopy.
>> > > > >   Decoding is also performed inside the method, so fillBuf() has
>> > > > >   been removed.
>> > > > >
>> > > > > Thoughts? Comments?
>> > > > >
>> > > > > Thanks,
>> > > > > Vladimir.
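
The InputStreamReader change described in the quoted message above
(wrapping the caller's char[] in a CharBuffer and decoding straight into
it, rather than decoding into an internal buffer and copying out with
System.arraycopy) can be sketched roughly as follows. This is only an
illustration of the idea, not the Harmony-1980 patch: the class name
WrapDecodeSketch and the method decodeInto are made up for the example,
and a real read(char[], int, int) would also have to refill its byte
buffer from the underlying stream, which is left out here.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not taken from Harmony-1980.diff.
final class WrapDecodeSketch {

    /**
     * Decodes bytes from 'in' directly into dst[off .. off+len) and
     * returns the number of chars written. No intermediate char buffer
     * and no System.arraycopy are involved.
     */
    static int decodeInto(CharsetDecoder decoder, ByteBuffer in,
                          char[] dst, int off, int len) {
        // wrap() creates a view over dst: position = off, limit = off + len.
        CharBuffer out = CharBuffer.wrap(dst, off, len);
        // endOfInput = false: more bytes may follow in a later call.
        CoderResult result = decoder.decode(in, out, false);
        if (result.isError()) {
            throw new IllegalArgumentException("cannot decode input: " + result);
        }
        // The buffer position advanced by exactly the number of chars written.
        return out.position() - off;
    }

    public static void main(String[] args) {
        byte[] bytes = "hello".getBytes(StandardCharsets.UTF_8);
        char[] target = new char[16];
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        int n = decodeInto(decoder, ByteBuffer.wrap(bytes), target, 4, 8);
        System.out.println(n + " chars: " + new String(target, 4, n));
    }
}
```

Because the CharBuffer is only a view over the caller's array, the
decoder writes into the destination directly; whether that beats the
arraycopy-based version on a given VM is exactly what the Dacapo numbers
in this thread are meant to show.
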
>> > > > >
>> > > > > On 10/26/06, Vladimir Strigun (JIRA) wrote:
>> > > > >> [classlib][performance] performance improvement for luni and
>> > > > >> nio_char modules
>> > > > >> -----------------------------------------------------------------------------
>> > > > >>
>> > > > >>                  Key: HARMONY-1980
>> > > > >>                  URL: http://issues.apache.org/jira/browse/HARMONY-1980
>> > > > >>              Project: Harmony
>> > > > >>           Issue Type: Improvement
>> > > > >>           Components: Classlib
>> > > > >>             Reporter: Vladimir Strigun
>> > > > >>          Attachments: Harmony-1980.diff
>> > > > >>
>> > > > >> I've analyzed the APIs frequently used in all Dacapo[1] benchmarks
>> > > > >> and found several places in the luni and nio_char modules that can
>> > > > >> be improved. The suggested fix gives about a 10-15% boost on average
>> > > > >> for Dacapo executed on DRLVM. I'll post more details to the dev list.
>> > > > >> The attached fix contains modifications for the following classes:
>> > > > >> java.io.BufferedReader, java.io.InputStreamReader,
>> > > > >> java.nio.charset.CharsetDecoder and java.nio.charset.CharsetEncoder.
>> > > > >>
>> > > > >> Please have a look at the results of Dacapo execution (values are
>> > > > >> in millisec, so the less the better):
>> > > > >>
>> > > > >> Small workload
>> > > > >>
>> > > > >>            OrigBuild    Patched
>> > > > >> bloat      996,078      1024,85
>> > > > >> chart      1240,777     1068,112
>> > > > >> fop        250,433      232,957
>> > > > >> hsqldb     348,942      361,139
>> > > > >> jython     831,143      824,775
>> > > > >> lusearch   1854,95      1870,969
>> > > > >> luindex    339,45       231,314
>> > > > >> pmd        29,704       23,638
>> > > > >>
>> > > > >> Default workload
>> > > > >>
>> > > > >>            OrigBuild    Patched
>> > > > >> bloat      168733,562   175493,467
>> > > > >> chart      31651,792    25681,751
>> > > > >> fop        2546,289     2512,045
>> > > > >> hsqldb     22873,608    13555,515
>> > > > >> jython     128207,303   92863,28
>> > > > >> lusearch   29425,991    30064,153
>> > > > >> luindex    17825,795    18083,898
>> > > > >> pmd        44548,724    40225,694
>> > > > >>
>> > > > >> [1] http://dacapobench.sourceforge.net
>> > > > >>
>> > > > >> --
>> > > > >> This message is automatically generated by JIRA.
>> > > > >> -
>> > > > >> If you think it was sent incorrectly contact one of the
>> > > > >> administrators:
>> > > > >> http://issues.apache.org/jira/secure/Administrators.jspa
>> > > > >> -
>> > > > >> For more information on JIRA, see:
>> > > > >> http://www.atlassian.com/software/jira
>> > >
>> > > --
>> > > Mikhail Fursov
>>
>> --
>> Mikhail Fursov
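
For anyone wanting to redo the arithmetic in this thread, here is a
small sketch of the two calculations involved: the geometric mean of
the last 5 of 10 runs, and the "N% slower" comparison against the RI.
The class and method names (DacapoStats, geometricMeanOfLast,
percentSlower) are invented for the example, the comma decimal
separators in the tables are read as decimal points, and treating the
first runs as warm-up is an assumption about why only the last 5 are
kept.

```java
import java.util.Arrays;

// Hypothetical helper for the benchmark arithmetic discussed in this thread.
final class DacapoStats {

    /** Geometric mean, computed through logarithms to avoid overflow. */
    static double geometricMean(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        double logSum = 0.0;
        for (double v : values) {
            if (v <= 0.0) {
                throw new IllegalArgumentException("values must be positive");
            }
            logSum += Math.log(v);
        }
        return Math.exp(logSum / values.length);
    }

    /** Geometric mean of the last n runs (assumption: earlier runs are warm-up). */
    static double geometricMeanOfLast(double[] runs, int n) {
        if (n <= 0 || n > runs.length) {
            throw new IllegalArgumentException("n out of range");
        }
        return geometricMean(Arrays.copyOfRange(runs, runs.length - n, runs.length));
    }

    /** How much slower 'ours' is than 'reference', in percent (times in ms). */
    static double percentSlower(double ours, double reference) {
        return (ours / reference - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // Build1 column of the small-input table quoted above.
        double[] build1Small = {1014.371, 1427.912, 243.426, 330.856,
                                1092.869, 1999.63, 421.703, 27.332};
        System.out.printf("small, Build1: %.3f%n", geometricMean(build1Small));
        // "Average" row of the default-input table quoted above.
        System.out.printf("default, Build1 vs RI: %.1f%% slower%n",
                percentSlower(9538.259502, 8638.4136));
        System.out.printf("default, Build2 vs RI: %.1f%% slower%n",
                percentSlower(9063.946046, 8638.4136));
    }
}
```

The Build1 small-input column should come out very close to the
482,5168816 in the Average row, which suggests those rows are geometric
means across the eight benchmarks. On the default-input averages the
comparison gives roughly 10.4% (Build1) and 4.9% (Build2), in line with
the "5-10% slower" quoted above.
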