harmony-dev mailing list archives

From Robin Garner <robin.gar...@anu.edu.au>
Subject Re: [performance] DaCapo benchmarks
Date Wed, 22 Nov 2006 03:19:37 GMT
I've also added performance results to the DaCapo regression tests.  If 
you page down on the front page

   http://cs.anu.edu.au/people/Robin.Garner/dacapo/regression/

you'll see a bunch of graphs comparing the performance of JikesRVM with
DRLVM.  The numbers are all relative to the best figure from any of the
commercial VMs I had available.
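
As a rough illustration of that kind of normalization (a sketch only, not
the script behind the graphs), the snippet below divides each score by the
best score among a chosen set of baseline VMs. The input numbers are the
SciMark composite scores from Stefano's Linux run quoted further down, used
here as stand-in data; the function name and the VM labels are made up for
the example.

```python
# SciMark composite scores copied from Stefano's Linux/x86_64 run quoted below.
composite = {
    "sun6": 364.5832265230057,
    "bea": 359.13480378697835,
    "sun5": 332.84987587548574,
    "ibm": 259.8249218693683,
    "harmony": 113.65082278962575,
}


def relative_to_best(scores, baseline_vms):
    """Divide every score by the best score among the baseline VMs."""
    best = max(scores[vm] for vm in baseline_vms)
    return {vm: score / best for vm, score in scores.items()}


# Hypothetical choice of baseline: the four commercial VMs in that run.
commercial = ["sun6", "bea", "sun5", "ibm"]
relative = relative_to_best(composite, commercial)

# Print the VMs from fastest to slowest, as a fraction of the best baseline.
for vm, ratio in sorted(relative.items(), key=lambda kv: -kv[1]):
    print(f"{vm:8s} {ratio:.2f}")
```

On those inputs Harmony comes out at roughly 0.31 of the best commercial
figure, which is consistent with the "about 30%" estimate in the quoted
message.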

cheers

Stefano Mazzocchi wrote:
> Sergey Kuksenko wrote:
>> Stefano,
>> To gauge the potential of Harmony I quickly ran SciMark on a tuned
>> Harmony release build and compared it with BEA & Sun.
> 
> Sergey,
> 
> many thanks for doing this.
> 
>> Hardware: P4 Xeon 3GHz
>> Windows XP SP2 (It's another platform, but I hope the key things are still
>> the same).
>>
>> BEA -
>> java version "1.5.0_06"
>> Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_06-b05)
>> BEA JRockit(R) (build R26.3.0-32-58710-1.5.0_06-20060308-2022-win-ia32, )
>>
>> SUN -
>> java version "1.5.0_06"
>> Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_06-b05)
>> Java HotSpot(TM) Client VM (build 1.5.0_06-b05, mixed mode)
>>
>>
>> Harmony -
>> Apache Harmony Launcher : (c) Copyright 1991, 2006 The Apache Software
>> Foundation or its licensors, as applicable.
>> java version "1.5.0"
>> pre-alpha : not complete or compatible
>> svn = 475925, (Nov 17 2006), Windows/ia32/msvc 1310, release build
>>
>> I've got the following results
>>
>> BEA (out of the box):
>>
>> Composite Score: 435.9674695335291
>> FFT (1024): 295.33366058958575
>> SOR (100x100):   474.15229982839213
>> Monte Carlo : 111.56918839504195
>> Sparse matmult (N=1000, nz=5000): 551.8821052631578
>> LU (100x100): 746.9000935914679
>> ----
>>
>> Sun (out of the box):
>>
>> Composite Score: 229.70779446543412
>> FFT (1024): 104.92303791891565
>> SOR (100x100):   400.44785722405015
>> Monte Carlo : 13.257380552894444
>> Sparse matmult (N=1000, nz=5000): 160.07814989061512
>> LU (100x100): 469.8325467406951
>>
>> ---
>>
>> Harmony (out of the box):
>>
>> Composite Score: 109.43208528481887
>> FFT (1024): 51.30119529411764
>> SOR (100x100):   257.9591618631154
>> Monte Carlo : 17.04568642272773
>> Sparse matmult (N=1000, nz=5000): 129.4666069618598
>> LU (100x100): 91.38777588227376
>> ----
>>
>> Harmony (tuned options, server path):
>>
>> Composite Score: 181.54555681031619
>> FFT (1024): 91.22597999162443
>> SOR (100x100):   329.8450882375011
>> Monte Carlo : 42.51432538579417
>> Sparse matmult (N=1000, nz=5000): 260.58050602943024
>> LU (100x100): 183.56188440723088
> 
> that's pretty good.
> 
>> ------
>>
>> When I looked into Harmony OOB I found that all hot methods of SciMark
>> are compiled by JET (not recompiled by the optimizing JIT compiler). The
>> way our DRLVM currently recognises hot paths is not suitable for SciMark
>> because of the short run.
> 
> Hmmm, what do you mean by "short run"? the entire app runs for a short
> amount of time total or each hot method runs for a short amount of time
> not enough to have it recognized as "hot"?
> 
>> We need to tune DRLVM options to get better results.
>> Tuned options give good SciMark score improvement (109->181).
> 
> Well, to be fair, all the other JVM could probably do the same.
> 
>> Which moves Harmony performance close to what Sun OOB shows.
> 
> excuse my ignorance, but what's OOB? (google define says "out of
> business" or "order of battle"... not sure they apply here ;-)
> 
>> Our client (default) compilation path was tuned a long time ago and it
>> probably makes sense to have another round. What we initially did was
>> running some script executing the given set of workloads trying to find the
>> best configuration for our VM. Having said that I suggest we choose the
>> right set of applications/benchmarks, so we can start our tuning once
>> again.
> 
> Maybe it's the analog microelectronics guy in me talking, but every time I
> hear something like "let's get reasonable defaults", I think of
> introducing a variation and a feedback to reach a local minimum and
> stabilize the system.
> 
> I know very little about how DRLVM works, but would it be feasible to
> start with such "reasonable defaults" and introduce a random variability
> to the way the JIT works alongside a very simple method profiler and see
> if the performance increases? Think of yourself trying out different
> things to see if they work better... but done by the JVM as it runs.
> 
> Keep in mind I'm a total newbie in virtual machine design (or CPU
> architectures for that matter, despite my degree in microelectronics..
> well, to be fair, I was doing analog not digital circuits) so bear with
> me if I'm saying stupid things :-)
> 
>> Currently we have in mind the following list:
>> - HWA (Hello World Application)
>> - SciMark
>> - Dacapo (reasonable set of benchmarks, like fop, hsqldb, chart and xalan)
>> - Anything else?
>>
>> What do you think about this? Any additions to the list? Comments?
>> Questions?
> 
> The problem I have with this is that I feel that each of these
> scenarios might require different tuning parameters... and if that is the
> case, you end up with the 'short blanket' problem: you improve here and
> you get worse there.
> 
> An 'adaptive' scenario, on the other hand, would allow us to:
> 
>  1) avoid trying to find the optimal defaults (since we can't possibly
> test every scenario that will be useful in a way that is consistent with
> real world usage)
> 
>  2) avoid the blanket problem, each VM can adapt to the scenario of use
> 
>  3) avoid the 'stiffness' problem, each VM can adapt to machine resource
> changes and 'retune' itself if the environment changes.
> 
> Of course, there is a price to pay in such 'feedback variability' systems
> since they have to find the minima over and over again.
> 
> So, another solution is to have a JVM "tuning parameters discovery mode"
> that you can run and you turn such "parameter finding" autoprofiling
> on... and the JVM dumps the tuning results for you on disk which you can
> later use to initialize the JVM on your own.
> 
> Not sure how feasible or complicated to write this is, but how does this
> sound on paper?
> 
> 
>> Thanks,
>> ---
>> Sergey Kuksenko
>> Intel Enterprise Solutions Software Division
>>
>>
>> On 11/17/06, Stefano Mazzocchi <stefano@apache.org> wrote:
>>> Alexey Varlamov wrote:
>>>> Stefano,
>>>>
>>>> It is a bit unfair to compare *debug* build of Harmony with other
>>>> release versions :)
>>> I'm simulating what a journalist with a developer could do.
>>>
>>> If there is a way to make it compile in 'release mode' (if such a thing
>>> exists), I'll be very glad to redo the benchmarks.
>>>
>>>> I suppose all VMs were run in default mode (i.e. no special cmd-line
>>>> switches)?
>>> Right. No switches. I'm simulating what users do when they get the JVM:
>>> they run "java"... and if it's not fast enough they buy a new box.
>>>
>>> Having command line tuning parameters is mostly useless since most
>>> people don't know the internals of a JVM well enough to guess what
>>> parameters to tune anyway.
>>>
>>> So, what people will do once they get a harmony snapshot is "java
>>> my.class.Name" and see the results.
>>>
>>> I want to simulate that and compare it to the same exact experience they
>>> will get with other virtual machines for a variety of common scenarios
>>> (number crunching, xml processing, http serving, database load, etc...)
>>>
>>> I will focus on the server because that's where the apache action (and
>>> my personal interest) is.
>>>
>>> So, like I said, if there are 'compile time' switches that I can use to
>>> turn 'release mode' on, please tell me and I'll re-do the tests.
>>>
>>>> 2006/11/17, Stefano Mazzocchi <stefano@apache.org>:
>>>>> There are lies, damn lies and benchmarks.... which don't really tell you
>>>>> if an implementation of a program is *faster* but at least it tells you
>>>>> where you're at.
>>>>>
>>>>> So, as Geir managed to get the DSO linking problem go away in DRLVM, I
>>>>> was able to start running some benchmarks.
>>>>>
>>>>> The machine is the following:
>>>>>
>>>>> Linux harmony-em64t 2.6.15-27-amd64-generic #1 SMP PREEMPT Sat Sep 16
>>>>> 01:50:50 UTC 2006 x86_64 GNU/Linux
>>>>>
>>>>> dual Intel(R) Pentium(R) D CPU 3.20GHz
>>>>> bogomips 6410.31 (per CPU)
>>>>>
>>>>> There is nothing else running on the machine (load is 0.04 at the time
>>>>> of testing).
>>>>>
>>>>> The various virtual machines tested are:
>>>>>
>>>>> harmony
>>>>> -------
>>>>> Apache Harmony Launcher : (c) Copyright 1991, 2006 The Apache Software
>>>>> Foundation or its licensors, as applicable.
>>>>> java version " 1.5.0"
>>>>> pre-alpha : not complete or compatible
>>>>> svn = r476006, (Nov 16 2006), Linux/em64t/gcc 4.0.3, debug build
>>>>>
>>>>> sun5
>>>>> ---
>>>>> java version "1.5.0_09 "
>>>>> Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_09-b03)
>>>>> Java HotSpot(TM) 64-Bit Server VM (build 1.5.0_09-b03, mixed mode)
>>>>>
>>>>> sun6
>>>>> ----
>>>>> java version " 1.6.0-rc"
>>>>> Java(TM) SE Runtime Environment (build 1.6.0-rc-b104)
>>>>> Java HotSpot(TM) 64-Bit Server VM (build 1.6.0-rc-b104, mixed mode)
>>>>>
>>>>> ibm
>>>>> ---
>>>>> java version " 1.5.0"
>>>>> Java(TM) 2 Runtime Environment, Standard Edition (build
>>>>> pxa64dev-20061002a (SR3) )
>>>>> IBM J9 VM (build 2.3, J2RE 1.5.0 IBM J9 2.3 Linux amd64-64
>>>>> j9vmxa6423-20061001 (JIT enabled)
>>>>> J9VM - 20060915_08260_LHdSMr
>>>>> JIT  - 20060908_1811_r8
>>>>> GC   - 20060906_AA)
>>>>> JCL  - 20061002
>>>>>
>>>>> bea
>>>>> ---
>>>>> java version "1.5.0_06 "
>>>>> Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_06-b05)
>>>>> BEA JRockit(R) (build
>>>>> R26.4.0-63-63688-1.5.0_06-20060626-2259-linux-x86_64, )
>>>>>
>>>>>
>>>>>
>>>>> ----------------------------------------------------------------------
>>>>>
>>>>> Test #1: java scimark2 (http://math.nist.gov/scimark2/)
>>>>>
>>>>> command: java jnt.scimark2.commandline
>>>>>
>>>>> NOTE: bigger number is better
>>>>>
>>>>> Sun6
>>>>> Composite Score: 364.5832265230057
>>>>> FFT (1024): 220.8458713892794
>>>>> SOR (100x100):   696.1542342357722
>>>>> Monte Carlo : 149.37978088875656
>>>>> Sparse matmult (N=1000, nz=5000): 326.37451873283845
>>>>> LU (100x100): 430.1617273683819
>>>>>
>>>>> BEA
>>>>> Composite Score: 359.13480378697835
>>>>> FFT (1024): 303.8746880751562
>>>>> SOR (100x100):   454.25628897202307
>>>>> Monte Carlo : 93.23913192138497
>>>>> Sparse matmult (N=1000, nz=5000): 530.44112637391
>>>>> LU (100x100): 413.8627835924175
>>>>>
>>>>> Sun5
>>>>> Composite Score: 332.84987587548574
>>>>> FFT (1024): 216.5144595799027
>>>>> SOR (100x100):   689.429322146947
>>>>> Monte Carlo : 25.791262124978065
>>>>> Sparse matmult (N=1000, nz=5000): 317.5193965699373
>>>>> LU (100x100): 414.99493895566377
>>>>>
>>>>> IBM
>>>>> Composite Score: 259.8249218693683
>>>>> FFT (1024): 296.8415012789055
>>>>> SOR (100x100):   428.974881649179
>>>>> Monte Carlo : 89.15159857584082
>>>>> Sparse matmult (N=1000, nz=5000): 144.3524241203982
>>>>> LU (100x100): 339.8042037225181
>>>>>
>>>>> Harmony
>>>>> Composite Score: 113.65082278962575
>>>>> FFT (1024): 203.76641991778123
>>>>> SOR (100x100):   224.37761309236748
>>>>> Monte Carlo : 9.063866256533116
>>>>> Sparse matmult (N=1000, nz=5000): 65.4051866327227
>>>>> LU (100x100): 65.6410280487242
>>>>>
>>>>> In this test harmony is clearly lagging behind... at about 30%
>>>>> performance of the best JVM, it's a little crappy. Please note how FFT's
>>>>> performance is not so bad while monte carlo is pretty bad compared to
>>>>> BEA or IBM.
>>>>>
>>>>> Overall, it seems like there is some serious work to do here to catch up.
>>>>>
>>>>> ----------------------------------------------------------------------
>>>>>
>>>>> Test 2: Dhrystones
>>>>> (http://www.c-creators.co.jp/okayan/DhrystoneApplet/)
>>>>>
>>>>> command: java dhry 100000000
>>>>>
>>>>> NOTE: bigger is better
>>>>>
>>>>> NB: I modified the code to accept the count at input from the command
>>>>> line!
>>>>>
>>>>> sun6:     8552856 dhrystones/sec
>>>>> sun5:     6605892
>>>>> bea:      5678914
>>>>> harmony:   669734
>>>>> ibm:       501562
>>>>>
>>>>> The performance here is horrific but what's surprising is that J9 is
>>>>> even worse. No idea what's going on but it seems like something is not
>>>>> working as it should (in both harmony and J9)
>>>>>
>>>>>
>>>>> ----------------------------------------------------------------------
>>>>>
>>>>> Test 3: Sieve (part of http://www.sax.de/~adlibit/tya18.tgz)
>>>>>
>>>>> command: java Sieve 30
>>>>>
>>>>> NB: I modified the test to run for a configurable amount of seconds.
>>>>>
>>>>> sun6     8545 sieves/sec
>>>>> sun5     8364
>>>>> bea      6174
>>>>> harmony  1836
>>>>> ibm       225
>>>>>
>>>>> IBM J9 clearly has something wrong on x86_64 but harmony is clearly
>>>>> lagging behind.
>>>>>
>>>>> Stay tuned for more tests.
>>>>>
>>>>> --
>>>>> Stefano.
>>>>>
>>>>>
>>>
>>> -- 
>>> Stefano.
>>>
>>>
> 
> 
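
A side note on reading the SciMark figures quoted above: in every run the
composite score matches the plain arithmetic mean of the five kernel scores.
The check below is only a sketch of that observation. The numbers are copied
from Sergey's Windows results; the dictionary layout and variable names are
made up for the example.

```python
from statistics import mean

# (composite, [FFT, SOR, Monte Carlo, Sparse matmult, LU]) per quoted run.
runs = {
    "BEA OOB": (435.9674695335291, [
        295.33366058958575, 474.15229982839213, 111.56918839504195,
        551.8821052631578, 746.9000935914679]),
    "Sun OOB": (229.70779446543412, [
        104.92303791891565, 400.44785722405015, 13.257380552894444,
        160.07814989061512, 469.8325467406951]),
    "Harmony OOB": (109.43208528481887, [
        51.30119529411764, 257.9591618631154, 17.04568642272773,
        129.4666069618598, 91.38777588227376]),
    "Harmony tuned": (181.54555681031619, [
        91.22597999162443, 329.8450882375011, 42.51432538579417,
        260.58050602943024, 183.56188440723088]),
}

# Difference between the reported composite and the mean of the kernels.
gaps = {name: abs(mean(kernels) - reported)
        for name, (reported, kernels) in runs.items()}

# Ratio behind the "109->181" improvement from tuning.
speedup = runs["Harmony tuned"][0] / runs["Harmony OOB"][0]

for name, gap in gaps.items():
    print(f"{name:14s} mean differs from composite by {gap:.2e}")
print(f"tuned / OOB composite: {speedup:.2f}")
```

Because the composite is an unweighted mean, a kernel with a large absolute
score (LU or Sparse matmult here) moves it more than Monte Carlo does, which
is worth keeping in mind when comparing composites across VMs.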

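Stefano's "introduce a variation and a feedback" idea, together with the
"tuning parameters discovery mode" that dumps its results, can be sketched
as a random perturbation search. This is a toy under loud assumptions, not
how DRLVM or any other VM works: the parameter names (hot_threshold,
inline_limit), the default values and the synthetic cost function are all
invented, and a real system would measure actual run time instead.

```python
import json
import random


def measure(params):
    """Stand-in for a profiler reading: lower is better.

    Synthetic cost with its minimum at hot_threshold=3000, inline_limit=40.
    Both parameters and the formula are hypothetical.
    """
    return (1.0
            + ((params["hot_threshold"] - 3000) / 1000.0) ** 2
            + ((params["inline_limit"] - 40) / 10.0) ** 2)


def tune(defaults, steps=500, seed=0):
    """Randomly perturb one parameter at a time; keep only changes that help."""
    rng = random.Random(seed)
    best = dict(defaults)
    best_cost = measure(best)
    for _ in range(steps):
        trial = dict(best)
        name = rng.choice(sorted(trial))
        # Vary the chosen parameter by up to +/-10%, never going below 1.
        trial[name] = max(1, round(trial[name] * rng.uniform(0.9, 1.1)))
        cost = measure(trial)
        if cost < best_cost:
            best, best_cost = trial, cost
    return best, best_cost


# Hypothetical "reasonable defaults" to start from.
defaults = {"hot_threshold": 10000, "inline_limit": 10}
tuned, tuned_cost = tune(defaults)

# "Discovery mode": serialize the result so a later run could start from it.
tuned_json = json.dumps(tuned, sort_keys=True)
print(tuned_json, round(tuned_cost, 3))
```

Accepting only improvements means the search settles into a local minimum,
which is the behaviour Stefano describes; it also shows the price he
mentions, since every trial costs a measurement.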

-- 
Robin Garner
Dept. of Computer Science
Australian National University
http://cs.anu.edu.au/people/Robin.Garner/
