Aleksey Shipilev wrote:
> Hi, guys!
>
> I have run the NBody test on Harmony and made some hacks to the sqrt
> implementation. The clocktick distribution for each run on
> Windows XP SP2 is also given below.
Thanks Aleksey!
> As a baseline I have used:
> 0. Sun 1.6.0_02 (server): 1200 msecs
> 100% Other32
>
> 1. Clean Harmony (Xem:server): 23500 msecs
> 80% hyluni.dll:__ieee754_sqrt().
> 11% Other32
> 6% harmonyvm.dll
>
> 2. After stubbing sqrt() call with intrinsic [1]: 5300 msecs
> 40% Other32
> 29% hyluni.dll:internal_sqrt() and Java_java_lang_Math_sqrt()
> 20% harmonyvm.dll: serving native calls
>
> 3. After inlining internal_sqrt() [2]: 5000 msecs
> 45% Other32
> 23% hyluni.dll:internal_sqrt() and Java_java_lang_Math_sqrt()
> 20% harmonyvm.dll: serving native calls
>
> 4. After applying JNI transition improvements [3]: 4700 msecs
> 50% Other32
> 25% hyluni.dll:internal_sqrt() and Java_java_lang_Math_sqrt()
> 10% harmonyvm.dll: serving native calls
>
> It seems to me that it could be improved further if some magic
> implementation of sqrt() were used instead of a native call.
Looks like we have to go down this path, since the hacked intrinsics are
still about 4x slower than Sun (4700 vs. 1200 msecs), if I'm reading
this properly.
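The roughly 4x figure is just each timing divided by the Sun baseline.
A small sketch of that arithmetic, using only the numbers quoted above
(the class name, method name, and labels are illustrative, not part of
anyone's test harness):

```java
// Slowdown of each Harmony configuration relative to the Sun baseline,
// computed from the timings quoted in this thread.
final class Slowdown {
    // Ratio of a measured time to the baseline time (both in msecs).
    static double ratio(double msecs, double baselineMsecs) {
        return msecs / baselineMsecs;
    }

    public static void main(String[] args) {
        double sunBaseline = 1200; // 0. Sun 1.6.0_02 (server)
        String[] labels = {
            "1. clean Harmony",
            "2. sqrt() intrinsic stub",
            "3. inlined internal_sqrt()",
            "4. JNI transition improvements",
        };
        double[] msecs = {23500, 5300, 5000, 4700};

        for (int i = 0; i < labels.length; i++) {
            System.out.println(labels[i] + ": " + msecs[i] + " msecs, "
                    + ratio(msecs[i], sunBaseline) + "x baseline");
        }
    }
}
```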
> Moreover, AFAIU approach (3) is safe, since IEEE754 compatibility
> must be preserved only for strict mode, whereas approach (3)
> implements a fast path for nonstrict mode.
Yes, I modified my microbench to use both strict and nonstrict in the
same run and there is a noticeable difference on Sun 6.0:
Math Result = 6.666661664588418E8 in 30ms
StrictMath Result = 6.666661664588418E8 in 1012ms
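For anyone who wants to try this, here is a minimal sketch of a
microbench of that shape. It is not the exact code behind the numbers
above: the class name, loop bound, and repetition count are assumptions,
and it does only crude warm-up, so treat the timings as indicative.

```java
// Sketch: time Math.sqrt() against StrictMath.sqrt() in the same run.
// N and REPS are assumed values, not taken from the original microbench.
final class SqrtMicrobench {
    static final int N = 1_000_000; // assumed loop bound
    static final int REPS = 20;     // assumed repetition count

    // Sum of sqrt(i) for 0 <= i < n using the nonstrict entry point.
    static double sumMath(int n) {
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += Math.sqrt(i);
        }
        return sum;
    }

    // Same sum using the strict entry point.
    static double sumStrict(int n) {
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += StrictMath.sqrt(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Crude warm-up so both loops get compiled before timing.
        for (int r = 0; r < REPS; r++) {
            sumMath(N);
            sumStrict(N);
        }

        long t0 = System.nanoTime();
        double mathResult = 0.0;
        for (int r = 0; r < REPS; r++) {
            mathResult = sumMath(N);
        }
        long t1 = System.nanoTime();
        double strictResult = 0.0;
        for (int r = 0; r < REPS; r++) {
            strictResult = sumStrict(N);
        }
        long t2 = System.nanoTime();

        System.out.println("Math Result = " + mathResult
                + " in " + (t1 - t0) / 1_000_000 + "ms");
        System.out.println("StrictMath Result = " + strictResult
                + " in " + (t2 - t1) / 1_000_000 + "ms");
    }
}
```

Both variants should print the same result, since sqrt is specified to
be correctly rounded in Math as well as StrictMath; only the timing is
expected to differ. On newer JVMs the gap may be much smaller or absent.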
Regards,
Tim
