harmony-dev mailing list archives

From "Aleksey Shipilev" <aleksey.shipi...@gmail.com>
Subject Re: [VM] On-demand class library parsing is ready to commit
Date Mon, 22 Dec 2008 07:48:52 GMT
One important thing to realize is that invoking main() triggers a
cascade of Java class initializations (mostly kernel classes).
Should these count toward VM startup time? If yes, then measuring an
empty main() method gives exactly the startup time as defined by
Wenlong, assuming there is virtually no work on shutdown and the VM
just exits.

Thanks,
Aleksey.
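
The two definitions of "startup" discussed in this thread can be compared
side by side with a small program. This is only a minimal sketch, written
for a current JDK (it uses a lambda) and assuming the runtime supports
java.lang.management, where RuntimeMXBean.getStartTime() reports an
approximate VM start time. The class name StartupTimer and its two helper
methods are hypothetical and are not part of Harmony or of the patch.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Harmony: splits startup into the time
// before main() is entered and the time taken by the first run of its body.
class StartupTimer {

    /** Milliseconds between the VM's reported (approximate) start time and the given instant. */
    static long millisBeforeMain(long mainEnteredMillis) {
        long vmStarted = ManagementFactory.getRuntimeMXBean().getStartTime();
        return mainEnteredMillis - vmStarted;
    }

    /** Wall-clock milliseconds spent running the given body once. */
    static long millisInBody(Runnable body) {
        long begin = System.nanoTime();
        body.run();
        return (System.nanoTime() - begin) / 1_000_000;
    }

    public static void main(String[] args) {
        // Take the timestamp first, so that it marks the entry into main().
        long mainEntered = System.currentTimeMillis();

        // Roughly the first definition: VM work done before main() is entered.
        long beforeMain = millisBeforeMain(mainEntered);

        // The first execution of the body also pays for class initialization
        // and initial compilation triggered by the code it runs.
        long inMain = millisInBody(() -> System.out.println("Hello, world!"));

        System.out.println("before main(): " + beforeMain + " ms");
        System.out.println("first run of main() body: " + inMain + " ms");
        // Roughly the second definition: everything up to the end of main().
        System.out.println("total: " + (beforeMain + inMain) + " ms");
    }
}
```

Note that the measurement code itself loads and initializes extra classes,
so the numbers are only indicative; instrumenting the VM or timing the
whole process externally avoids that distortion.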

On Mon, Dec 22, 2008 at 10:21 AM, Aleksey Shipilev
<aleksey.shipilev@gmail.com> wrote:
> Nathan:
>
> Wenlong's definition is "startup time is the time required for VM to
> initialize before entering Java main() method". This one is measured
> by tapping into VM internals.
>
> Aleksey's definition is "startup time is the time needed for VM to
> fully execute main() method for the first time". This one is measured by
> standard Java benchmarks.
>
> My impression (backed by Cachegrind/Callgrind profiles) is that much
> more time is spent in the initial compilation of Java bytecode than in
> the VM initialization sequence. This is also demonstrated by running a
> real workload through the startup scenario: even though there is a boost
> on HWA in "freezing-cold cache conditions", it disappears from the
> startup of a real workload.
>
> The VM initialization sequence can be a bottleneck, but it would only
> come into effect once all other issues are resolved. IMO, a reduction of
> 0.1 msec in startup time is negligible and not worth messing with the
> bootclasspath.
>
> Are there other advantages (non-performance ones) to doing that? If
> not, I'd rather postpone it.
>
> I'm not saying Wenlong did a bad job, he did great, but sometimes the
> consistency gains outweigh the performance, especially if the
> performance boost is not general.
>
> Thanks,
> Aleksey.
>
> On Mon, Dec 22, 2008 at 7:58 AM, Wenlong Li <wenlong@gmail.com> wrote:
>> By startup I mean the computation needed before executing the user's code
>> (in the main method) (see [1][2]), while Aleksey's definition is the
>> startup benchmark in SPECjvm2008.
>> [1] http://www.oracle.com/technology/pub/articles/dev2arch/2004/01/jrockit.html
>> [2] http://www.ibm.com/developerworks/java/library/os-ecspy1/
>>
>> On Mon, Dec 22, 2008 at 8:38 AM, Nathan Beyer <ndbeyer@apache.org> wrote:
>>> Can someone give a quick summary of the two different definitions of
>>> "startup" being discussed?
>>>
>>> -Nathan
>>>
>>> On Sun, Dec 21, 2008 at 6:22 PM, Wenlong Li <wenlong@gmail.com> wrote:
>>>> Aleksey,
>>>>
>>>> Thx for testing this patch, and sharing your experimental result.
>>>> Yes, I think your result would be reasonable. The performance gain of
>>>> this patch varies with different systems.
>>>>
>>>> Again, I would like to say we have different definitions for "startup".
>>>> Maybe I should move the change in classlib module to vm module, so
>>>> that the dependency can be minimized.
>>>>
>>>> thx again for discussion. :)
>>>> wenlong
>>>>
>>>> On Mon, Dec 22, 2008 at 4:04 AM, Aleksey Shipilev
>>>> <aleksey.shipilev@gmail.com> wrote:
>>>>> Hi Wenlong,
>>>>>
>>>>> I ran some performance experiments with your patch. The test system is:
>>>>>  - Pentium D 820 2.8 GHz / 2 GB DDR2-667
>>>>>  - WD 3200KS, 320 GB, 16 MB cache
>>>>>  - Gentoo Linux x86, 2.6.23
>>>>>  - Harmony r728459
>>>>>  - SPECjvm2008
>>>>>
>>>>> To recreate the stressful conditions over and over, I wrote a simple
>>>>> script [1]. The script invalidates the caches before actually
>>>>> starting the workload: it re-reads the same 64 MB file a couple of times
>>>>> to fill up the on-HDD cache, invalidating the VFS block caches first to
>>>>> make sure the data is really requested from the disk.
>>>>>
>>>>> On HWA [2] these performance results were produced:
>>>>>
>>>>> "cold-start" (invalidate caches):
>>>>> clean: (5.24 +- 0.28) secs
>>>>> ondemand: (4.49 +- 0.17) secs
>>>>>
>>>>> "warm-start" (don't invalidate caches):
>>>>> clean: (2.82 +- 0.01) secs
>>>>> ondemand: (2.80 +- 0.02) secs
>>>>>
>>>>> That is, the on-demand patch does bring a +17% (+-9%) improvement on HWA
>>>>> when running with flushed caches, and does not bring any performance
>>>>> improvement in warm mode.
>>>>>
>>>>> As I mentioned several times, this test does not reflect the real
>>>>> performance an end user would perceive, so I took two SPECjvm2008:startup
>>>>> benchmarks and ran each of them 10x10 times.
>>>>>
>>>>> SPECjvm2008:startup.helloworld, "cold start":
>>>>> clean: (8.93 +- 0.21) ops/min
>>>>> ondemand: (9.04 +- 0.03) ops/min
>>>>>
>>>>> SPECjvm2008:startup.compiler.compiler, "cold start":
>>>>> clean: (1.44 +- 0.05) ops/min
>>>>> ondemand: (1.42 +- 0.04) ops/min
>>>>>
>>>>> As you can see, even in a very stressful situation there is no boost. I
>>>>> find these performance results too unconvincing to justify changing the
>>>>> infrastructure of bootclasspath resolution. Am I missing something
>>>>> important?
>>>>>
>>>>> Thanks,
>>>>> Aleksey.
>>>>>
>>>>> [1] run.sh
>>>>> #!/bin/bash
>>>>>
>>>>> R=`pwd`
>>>>>
>>>>> JAVA=$R/platforms/builds/harmony-release-clean/jdk/jre/bin/java
>>>>> #JAVA=$R/platforms/builds/harmony-release-ondemand/jdk/jre/bin/java
>>>>> JAVA_OPTS="-Xmx1024M -Xms1024M"
>>>>>
>>>>> for T in `seq 1 10`; do
>>>>>
>>>>>        echo "*************** EXECUTING ITERATION $T ****************"
>>>>>
>>>>>        # invalidate HDD caches
>>>>>        #   - need to replace all entries in LRU HDD cache
>>>>>        #   - flush the kernel VFS cache first to ensure the data would be read from disk
>>>>>
>>>>>        echo "Flushing caches"
>>>>>        for I in `seq 1 5`; do
>>>>>                sync
>>>>>                echo 3 > /proc/sys/vm/drop_caches
>>>>>
>>>>>                dd if=cachekiller.file of=/dev/null > /dev/null 2>&1
>>>>>        done
>>>>>
>>>>>        echo "Executing."
>>>>>
>>>>>        # HelloWorld
>>>>>        /usr/bin/time $JAVA $JAVA_OPTS -cp benchmarks/ HelloWorld 2>&1
>>>>>
>>>>>        # SPECjvm2008
>>>>>        #cd $R/benchmarks/storage/SPECjvm2008
>>>>>        #/usr/bin/time $JAVA $JAVA_OPTS -Djava.awt.headless=true -jar SPECjvm2008.jar -ikv -i 10 startup.compiler.compiler 2>&1
>>>>>
>>>>>        echo ""
>>>>> done
>>>>>
>>>>> [2] HelloWorld.java
>>>>> public class HelloWorld {
>>>>> public static void main(String[] args) {
>>>>>        System.out.println("Hello, world!");
>>>>> }
>>>>> }
>>>>>
>>>>>
>>>>> On Sun, Dec 21, 2008 at 6:02 AM, Wenlong Li <wenlong@gmail.com> wrote:
>>>>>> On Sat, Dec 20, 2008 at 7:10 PM, Alexei Fedotov
>>>>>> <alexei.fedotov@gmail.com> wrote:
>>>>>>> Wenlong,
>>>>>>> Thanks for removing the commented code.
>>>>>>>
>>>>>>> There are several VMs which make use of the Harmony class library,
>>>>>>> e.g. Harmony VM, J9, Android Dalvik, etc. Your change is Harmony VM
>>>>>>> specific, isn't it? If it is, then it's better to keep related changes
>>>>>>> in the VM module. If it is not, then it might be a good idea to keep
>>>>>>> the changes in the class library module unless other VMs already have
>>>>>>> such an optimization in their code.
>>>>>> [Wenlong] At this moment you may regard on-demand class parsing as a
>>>>>> specific optimization. I believe it could be a general technique, e.g.,
>>>>>> it can easily be deployed in other runtime systems. The current VM also
>>>>>> depends on luniglob.c in working_classlib to get all class
>>>>>> libraries/modules, i.e., there is already a cross-module dependence
>>>>>> between classlib and VM. When users want to add a new module, they
>>>>>> currently have to change bootclasspath.properties manually; with this
>>>>>> patch applied, they should revise my added property file instead of
>>>>>> bootclasspath.properties. I understand that modifying the bootclasspath
>>>>>> file may be a specification.
>>>>>>>
>>>>>>> In any case crossing a module boundary would make class library users
>>>>>>> think more than once or even write some code. Is it technically
>>>>>>> possible to prepare a patch which does not change module boundaries?
>>>>>>> What do you think?
>>>>>> [Wenlong] Yes, it is possible from technical perspective, but a little
>>>>>> complicated. I can think about it. :)
>>>>>>
>>>>>>>
>>>>>>> As for your performance experiments, which particular test are you
>>>>>>> measuring? It is bootclasspath-unpretentious "Hello, world", isn't it?
>>>>>> [Wenlong] By startup I mean the work executed before running the user's
>>>>>> computation, that is, the VM creation time. I manually added
>>>>>> instrumentation code for execution time in JNI_CreateJavaVM in
>>>>>> JNI.cpp. This startup work is common to all benchmarks. My experiment
>>>>>> was conducted on both Windows and Linux systems. Please see my previous
>>>>>> message about the performance gain from this optimization.
>>>>>>
>>>>>> Thx,
>>>>>> Wenlong
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> On Sat, Dec 20, 2008 at 2:19 AM, Wenlong Li <wenlong@gmail.com> wrote:
>>>>>>>> On Sat, Dec 20, 2008 at 12:42 AM, Alexei Fedotov
>>>>>>>> <alexei.fedotov@gmail.com> wrote:
>>>>>>>>> Wenlong,
>>>>>>>>> Have I missed a discussion of the proposed design? I see that you
>>>>>>>>> expose a new public interface:
>>>>>>>>>  /**
>>>>>>>>>  * @map the jar with exported package in the pending jar list for on-demand jar parsing
>>>>>>>>>  *   Key is the jar, and value is the package exported by this jar
>>>>>>>>>  */
>>>>>>>>> DECLARE_OPEN(void, vm_properties_set_pending_jar, (const char* key, const char* value));
>>>>>>>>>
>>>>>>>>> Did you mean "Maps" instead of "@map"? Strangely the word "pending"
>>>>>>>>> disappeared from the name of the wrapping VMI interface
>>>>>>>>> SetJarPackageMapping. Why should we extend both OPEN and VMI
>>>>>>>>> interfaces with the same function? Why did you put your code into
>>>>>>>>> working_classlib/modules/luni/src/main/native/luni/shared/luniglob.c,
>>>>>>>>> thus introducing another dependency between VM and class library?
>>>>>>>> [Wenlong] The boot class path is defined in luniglob.c in Harmony,
>>>>>>>> and it also has a dependence on the VM. In my understanding, my patch is
>>>>>>>> related to boot class path determination, so I also put my code in
>>>>>>>> luniglob.c, and use the VMI interface to communicate with the VM.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> +            //rcSetProperty = (*vmInterface)->SetJarPackageMapping (vmInterface, jarName, jarValue);
>>>>>>>>> +            /*
>>>>>>>>> +            hymem_free_memory(jarName);
>>>>>>>>> +            hymem_free_memory(jarValue);
>>>>>>>>> +            */
>>>>>>>>> Should we really commit the commented code?
>>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> [Wenlong] Please see my latest version of the patch on the list. Such
>>>>>>>> commented code has been removed.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Dec 19, 2008 at 6:59 PM, Tim Ellison <t.p.ellison@gmail.com> wrote:
>>>>>>>>>> I was hoping that somebody else would comment first, so I don't have to
>>>>>>>>>> be the grumpy one all the time :-)
>>>>>>>>>>
>>>>>>>>>> As I said before, this is good prototyping work...
>>>>>>>>>>
>>>>>>>>>> Wenlong Li wrote:
>>>>>>>>>>> I did the pre-commit test on the patch of on-demand class library
>>>>>>>>>>> parsing (https://issues.apache.org/jira/browse/HARMONY-6039), and it
>>>>>>>>>>> works well now.
>>>>>>>>>>> Can Harmony incorporate this feature?
>>>>>>>>>>
>>>>>>>>>> I'm not sure it is ready for committing to the head stream yet.
>>>>>>>>>>
>>>>>>>>>>> Via on-demand class parsing, we can reduce startup time from 20+
>>>>>>>>>>> seconds to 3 seconds for cold running, and from 170 ms to 140 ms for
>>>>>>>>>>> warm-up running on Core 2 Duo with Windows.
>>>>>>>>>>
>>>>>>>>>> Can you tell me how to reproduce 20+sec cold start-up?  I haven't seen
>>>>>>>>>> anything like that in my simple tests.
>>>>>>>>>>
>>>>>>>>>>> After applying the patch, please note that the way to add new
>>>>>>>>>>> modules changes.
>>>>>>>>>>> (1) If you want to add new modules/libraries, please don't put them in
>>>>>>>>>>> the bootclasspath.properties file. This file now only saves modules
>>>>>>>>>>> needed during startup (the VM startup only accesses class libraries in
>>>>>>>>>>> eight modules)
>>>>>>>>>>
>>>>>>>>>> That would break too much.  How about creating a new file rather than
>>>>>>>>>> re-purposing an existing file with different semantics?  This file is
>>>>>>>>>> used by Jikes, IBM VME, the Eclipse plug-in, at least.
>>>>>>>>>>
>>>>>>>>>>> (2) For new modules/libraries, please put them in the
>>>>>>>>>>> modulelibrarymapping.properties file. You should specify the module
>>>>>>>>>>> name and its exported class library. Here is one example:
>>>>>>>>>>> math.jar=java.math, where "math.jar" means the module name, and
>>>>>>>>>>> "java.math" means the class libraries this module exports.
>>>>>>>>>>
>>>>>>>>>> As we discussed on another thread, it's unclear if the time is spent in
>>>>>>>>>> following the slow indexing through the classpath/JAR directories, or
>>>>>>>>>> whether it is the speed of loading bytes once we know what we need.  I
>>>>>>>>>> think that it is premature to abandon the JAR manifest data as the
>>>>>>>>>> principal source of metadata until we understand the problem this solves.
>>>>>>>>>>
>>>>>>>>>> Can we measure where the time is spent in the current implementation?
>>>>>>>>>> I think it will help guide this approach to a better solution.
>>>>>>>>>> What tools do you recommend for profiling start-up?
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>> Tim
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> With best regards,
>>>>>>>>> Alexei Fedotov,
>>>>>>>>> ZAO Telecom Express
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> With best regards,
>>>>>>> Alexei Fedotov,
>>>>>>> ZAO Telecom Express
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>