harmony-dev mailing list archives

From "Wenlong Li" <wenl...@gmail.com>
Subject Re: [VM] On-demand class library parsing is ready to commit
Date Thu, 25 Dec 2008 07:16:48 GMT
Hey,

I ran an experiment using Aleksey's script on a Linux machine (Pentium
D - Intel(R) XEON(TM) CPU 2.00GHz, 1GB main memory, 36GB SCSI hard
disk at 10K RPM, 8MB disk cache, Linux OS 2.6.21-1.3194.fc7,
Harmony-721077, HWA [2] micro-benchmark).

The performance results are:
cold run:
clean: 3375 ms, VM_Creation: 2055 ms
on-demand: 2849 ms, VM_Creation: 1435 ms
There is a 30% gain for VM creation.

warm-up run:
clean: 1825 ms, VM_Creation: 454 ms
on-demand: 1818 ms, VM_Creation: 420 ms
There is a 7% gain for VM creation.
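
For anyone who wants to re-check the percentages, here is a small sketch of the arithmetic. It assumes "gain" means the reduction in time relative to the clean build; the class and method names are made up for illustration and are not part of the patch.

```java
// Illustrative only: recomputes the gains from the timings quoted above.
final class StartupGain {
    /** Percentage reduction of onDemandMs relative to cleanMs. */
    static double gainPercent(double cleanMs, double onDemandMs) {
        return (cleanMs - onDemandMs) / cleanMs * 100.0;
    }

    public static void main(String[] args) {
        // VM creation times (ms) from the cold and warm-up runs
        System.out.printf("cold VM creation: %.1f%%%n", gainPercent(2055, 1435));
        System.out.printf("warm VM creation: %.1f%%%n", gainPercent(454, 420));
        // Total run times (ms), for comparison
        System.out.printf("cold total:       %.1f%%%n", gainPercent(3375, 2849));
        System.out.printf("warm total:       %.1f%%%n", gainPercent(1825, 1818));
    }
}
```

With this definition the VM creation figures come out at roughly 30% (cold) and 7% (warm-up), matching the numbers above. The gain on total run time is smaller, since VM creation is only part of the run.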

The VM destroy module takes around 1 second in both cold and warm-up
runs, but the time is not stable (I filed another JIRA for this problem).

The overall observation is that on-demand parsing helps cold runs (30%
gain in VM creation) and does no harm in warm-up runs. This is expected,
because the optimization only targets the VM creation module, where it
reduces the amount of work in JAR parsing.

Here is the summary:
(a) On-demand class library parsing optimizes the startup work
in the VM creation module, and it contributes a ~30% performance gain on
my test bed.
(b) The optimization doesn't hurt performance for warm-up runs.
(c) As this optimization targets the VM creation module, it doesn't help
the SPECjvm2008 startup benchmarks (i.e., no performance gain).
(d) The performance benefit of this optimization varies across
systems: Aleksey sees a 17% gain, and I report 30% here. On my Windows
system (2.83 GHz Core 2 Quad-core, 3GB main memory, 250GB SATA HD at
7200 RPM, 8MB disk cache, WinXP, Harmony-713673), this
optimization reduces VM creation time from 20+ seconds to 3+ seconds
in a cold run.
(e) For the sake of modularity, if I move the update from
class_lib to VM, does Harmony want to include this optimization? As
this takes extra effort, I want to hear your comments in advance.
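
For readers joining the thread late, the on-demand idea can be sketched roughly as below. This is only a Java illustration, not the patch itself (the patch is native code in the class library and VM). The class `OnDemandJarIndex`, its methods, the one-package-per-line assumption, and the use of a set insertion as a stand-in for real JAR parsing are all hypothetical. It takes mappings in the `math.jar=java.math` form quoted further down and touches a JAR only when a class from its exported package is first requested.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of on-demand JAR parsing; not Harmony code.
final class OnDemandJarIndex {
    // package name -> JAR that exports it (the "pending" JARs)
    private final Map<String, String> packageToJar = new HashMap<>();
    // JARs that have actually been parsed so far, in parse order
    private final Set<String> parsedJars = new LinkedHashSet<>();

    /** Registers one mapping line, e.g. "math.jar=java.math". */
    void addMapping(String line) {
        int eq = line.indexOf('=');
        if (eq < 0) {
            throw new IllegalArgumentException("expected jar=package: " + line);
        }
        String jar = line.substring(0, eq).trim();
        String pkg = line.substring(eq + 1).trim();
        packageToJar.put(pkg, jar);
    }

    /**
     * Returns the JAR to search for className, marking it parsed on first
     * use, or null if no pending JAR exports the class's package.
     */
    String jarFor(String className) {
        int dot = className.lastIndexOf('.');
        String pkg = dot < 0 ? "" : className.substring(0, dot);
        String jar = packageToJar.get(pkg);
        if (jar != null) {
            parsedJars.add(jar); // stand-in for the real JAR parsing work
        }
        return jar;
    }

    Set<String> parsedJars() {
        return parsedJars;
    }

    public static void main(String[] args) {
        OnDemandJarIndex index = new OnDemandJarIndex();
        index.addMapping("math.jar=java.math");
        index.addMapping("sql.jar=java.sql"); // hypothetical second entry
        System.out.println(index.parsedJars()); // nothing parsed at startup
        System.out.println(index.jarFor("java.math.BigInteger"));
        System.out.println(index.parsedJars()); // only math.jar was touched
    }
}
```

The point of the sketch is that startup cost then scales with the JARs actually used rather than with every JAR on the boot class path, which is why the benefit shows up mainly in cold runs.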

By the way, my timing instrumentation code is attached to
https://issues.apache.org/jira/browse/HARMONY-6039

Thanks for your suggestions, all.

Wenlong

On Mon, Dec 22, 2008 at 4:04 AM, Aleksey Shipilev
<aleksey.shipilev@gmail.com> wrote:
> Hi Wenlong,
>
> I had some performance experiments with your patch. The test system is:
>  - Pentium D 820 2.8 Ghz / 2 Gb DDR2-667
>  - WD 3200KS, 320 Gb, 16 Mb cache
>  - Gentoo Linux x86, 2.6.23
>  - Harmony r728459
>  - SPECjvm2008
>
> To recreate the stressful conditions over and over, a simple script
> was written [1]. The script invalidates the caches before actually
> starting the workload: it first invalidates the VFS block caches to
> make sure the data is really requested from the disk, then re-reads
> the same 64 Mb file a couple of times to fill the on-HDD cache.
>
> On HWA [2] these performance results were produced:
>
> "cold-start" (invalidate caches):
> clean: (5.24 +- 0.28) secs
> ondemand: (4.49 +- 0.17) secs
>
> "warm-start" (don't invalidate caches):
> clean: (2.82 +- 0.01) secs
> ondemand: (2.80 +- 0.02) secs
>
> That is, the on-demand patch does bring a +17% (+-9%) improvement on HWA
> when running with flushed caches, and does not bring any performance
> improvement in warm mode.
>
> As I mentioned several times, this test does not reflect the real
> performance an end user would perceive, so I took two SPECjvm2008:startup
> benchmarks and ran each of them 10x10 times.
>
> SPECjvm2008:startup.helloworld, "cold start":
> clean: (8.93 +- 0.21) ops/min
> ondemand: (9.04 +- 0.03) ops/min
>
> SPECjvm2008:startup.compiler.compiler, "cold start":
> clean: (1.44 +- 0.05) ops/min
> ondemand: (1.42 +- 0.04) ops/min
>
> As you can see, even in a very stressful situation there's no boost. I
> find these performance results unconvincing as a reason to change the
> infrastructure of bootclasspath resolution. Am I missing something
> important?
>
> Thanks,
> Aleksey.
>
> [1] run.sh
> #!/bin/bash
>
> R=`pwd`
>
> JAVA=$R/platforms/builds/harmony-release-clean/jdk/jre/bin/java
> #JAVA=$R/platforms/builds/harmony-release-ondemand/jdk/jre/bin/java
> JAVA_OPTS="-Xmx1024M -Xms1024M"
>
> for T in `seq 1 10`; do
>
>        echo "*************** EXECUTING ITERATION $T ****************"
>
>        # invalidate HDD caches
>        #   - need to replace all entries in LRU HDD cache
>        #   - flush the kernel VFS cache first to ensure the data
>        #     would be read from disk
>
>        echo "Flushing caches"
>        for I in `seq 1 5`; do
>                sync
>                echo 3 > /proc/sys/vm/drop_caches
>
>                dd if=cachekiller.file of=/dev/null > /dev/null 2>&1
>        done
>
>        echo "Executing."
>
>        # HelloWorld
>        /usr/bin/time $JAVA $JAVA_OPTS -cp benchmarks/ HelloWorld 2>&1
>
>        # SPECjvm2008
>        #cd $R/benchmarks/storage/SPECjvm2008
>        #/usr/bin/time $JAVA $JAVA_OPTS -Djava.awt.headless=true -jar SPECjvm2008.jar -ikv -i 10 startup.compiler.compiler 2>&1
>
>        echo ""
> done
>
> [2] HelloWorld.java
> public class HelloWorld {
>     public static void main(String[] args) {
>         System.out.println("Hello, world!");
>     }
> }
>
>
> On Sun, Dec 21, 2008 at 6:02 AM, Wenlong Li <wenlong@gmail.com> wrote:
>> On Sat, Dec 20, 2008 at 7:10 PM, Alexei Fedotov
>> <alexei.fedotov@gmail.com> wrote:
>>> Wenlong,
>>> Thanks for removing the commented code.
>>>
>>> There are several VMs which make use of the Harmony class library,
>>> e.g. Harmony VM, J9, Android Dalvik, etc. Your change is Harmony VM
>>> specific, isn't it? If it is, then it's better to keep related changes
>>> in the VM module. If it is not, then it might be a good idea to keep
>>> the changes in the class library module unless other VMs already have
>>> such an optimization in their code.
>> [Wenlong] Though at this moment you may see on-demand class parsing
>> as a Harmony VM specific optimization, I believe it could
>> be a general technique, i.e., it can be easily deployed in other
>> runtime systems. The current VM also depends on luniglob.c in
>> working_classlib to get all class libraries/modules, i.e., there is
>> already a cross-module dependence between classlib and VM. When users
>> want to add a new module, they currently have to change
>> bootclasspath.properties manually, while with this patch applied they
>> should revise my added property file instead of bootclasspath.properties.
>> I understand that modifying the bootclasspath file may be a specification.
>>>
>>> In any case crossing module boundary would make class library users
>>> think more than once or even write some code. Is it technically
>>> possible to prepare a patch which does not change module boundaries?
>>> What do you think?
>> [Wenlong] Yes, it is possible from technical perspective, but a little
>> complicated. I can think about it. :)
>>
>>>
>>> As for your performance experiments, which particular test are you
>>> measuring? It is bootclasspath-unpretentious "Hello, world", isn't it?
>> [Wenlong] By startup I mean the work executed before running the user's
>> computation, that is, the VM creation time. I manually added
>> instrumentation code for execution time in JNI_CreateJavaVM of
>> JNI.cpp. This startup work is common to all benchmarks. My experiment
>> was conducted on both Windows and Linux systems. Please see my previous
>> message about the performance gain from this optimization.
>>
>> Thx,
>> Wenlong
>>>
>>> Thanks!
>>>
>>> On Sat, Dec 20, 2008 at 2:19 AM, Wenlong Li <wenlong@gmail.com> wrote:
>>>> On Sat, Dec 20, 2008 at 12:42 AM, Alexei Fedotov
>>>> <alexei.fedotov@gmail.com> wrote:
>>>>> Wenlong,
>>>>> Have I missed a discussion of the proposed design? I see that you
>>>>> expose a new public interface:
>>>>>  /**
>>>>>  * @map the jar with exported package in the pending jar list for
>>>>>  *   on-demand jar parsing
>>>>>  *   Key is the jar, and value is the package exported by this jar
>>>>>  */
>>>>> DECLARE_OPEN(void, vm_properties_set_pending_jar, (const char* key,
>>>>> const char* value));
>>>>>
>>>>> Did you mean "Maps" instead of "@map"? Strangely the word "pending"
>>>>> disappeared from the name of the wrapping VMI interface
>>>>> SetJarPackageMapping . Why should we extend both OPEN and VMI
>>>>> interfaces with the same function? Why did you put your code into
>>>>> working_classlib/modules/luni/src/main/native/luni/shared/luniglob.c,
>>>>> thus introducing another dependency between VM and class library?
>>>> [Wenlong] The boot class path is defined in luniglob.c in Harmony,
>>>> which also has a dependence on the VM. In my understanding, my patch is
>>>> related to boot class path determination, so I also put my code in
>>>> luniglob.c and use the VMI interface to communicate with the VM.
>>>>
>>>>>
>>>>> +            //rcSetProperty = (*vmInterface)->SetJarPackageMapping(vmInterface, jarName, jarValue);
>>>>> +            /*
>>>>> +            hymem_free_memory(jarName);
>>>>> +            hymem_free_memory(jarValue);
>>>>> +            */
>>>>> Should we really commit the commented code?
>>>>> Thanks.
>>>>
>>>> [Wenlong] Please see the latest version of my patch on the list. The
>>>> commented-out code has been removed.
>>>>>
>>>>>
>>>>> On Fri, Dec 19, 2008 at 6:59 PM, Tim Ellison <t.p.ellison@gmail.com> wrote:
>>>>>> I was hoping that somebody else would comment first, so I don't have to
>>>>>> be the grumpy one all the time :-)
>>>>>>
>>>>>> As I said before, this is good prototyping work...
>>>>>>
>>>>>> Wenlong Li wrote:
>>>>>>> I did the pre-commit test on the patch of on-demand class library
>>>>>>> parsing (https://issues.apache.org/jira/browse/HARMONY-6039), and it
>>>>>>> works well now.
>>>>>>> Can Harmony incorporate this feature?
>>>>>>
>>>>>> I'm not sure it is ready for committing to the head stream yet.
>>>>>>
>>>>>>> Via on-demand class parsing, we can reduce startup time from 20+
>>>>>>> seconds to 3 seconds for cold running, and 170 ms to 140 ms for warm-up
>>>>>>> running on Core 2 Duo with Windows.
>>>>>>
>>>>>> Can you tell me how to reproduce 20+sec cold start-up?  I haven't seen
>>>>>> anything like that in my simple tests.
>>>>>>
>>>>>>> After applying the patch, please note there is some change to add new modules.
>>>>>>> (1) If you want to add new modules/libraries, please don't put them in
>>>>>>> the bootclasspath.properties file. This file now only saves modules
>>>>>>> needed during startup (the VM startup only accesses class libraries in
>>>>>>> eight modules)
>>>>>>
>>>>>> That would break too much.  How about creating a new file rather than
>>>>>> re-purposing an existing file with different semantics?  This file is
>>>>>> used by Jikes, IBM VME, the Eclipse plug-in, at least.
>>>>>>
>>>>>>> (2) For new modules/libraries, please put them in the
>>>>>>> modulelibrarymapping.properties file. You should specify the module
>>>>>>> name and its exported class library. Here is one example:
>>>>>>> math.jar=java.math, where "math.jar" means the module name, and
>>>>>>> "java.math" means the class libraries this module exports.
>>>>>>
>>>>>> As we discussed on another thread, it's unclear if the time is spent in
>>>>>> following the slow indexing through the classpath/JAR directories, or
>>>>>> whether it is speed of loading bytes once we know what we need. I think
>>>>>> that it is premature to abandon the JAR manifest data as the principal
>>>>>> source of metadata until we understand the problem this solves.
>>>>>>
>>>>>> Can we measure where the time is spent in the current implementation?
>>>>>> I think it will help guide this approach to a better solution.
>>>>>> What tools do you recommend for profiling start-up?
>>>>>>
>>>>>> Regards
>>>>>> Tim
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> With best regards,
>>>>> Alexei Fedotov,
>>>>> ZAO Telecom Express
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> With best regards,
>>> Alexei Fedotov,
>>> ZAO Telecom Express
>>>
>>
>