harmony-dev mailing list archives

From Egor Pasko <egor.pa...@gmail.com>
Subject Re: [drlvm][jit][ia-32]register-based fast calling convention
Date Thu, 16 Nov 2006 13:54:07 GMT
On the 0x222 day of Apache Harmony Alex Astapchuk wrote:
> Hi all,
> 
> Among other things listed on the JIT Dev tasks, there is a need for
> calling convention (CC) fix-up for IA-32 [1].
> 
> Current problems are:
> 
> 1. The calling convention(s) used are stack-based - this adds a memory
> access overhead on calls.
> 2. The convention currently used for managed code neither allows passing
> floating-point values in XMM registers nor provides callee-saved XMM
> registers.
> 3. The FPU stack is used to return float/double values.
> 
> 
> Both 2) and 3) hurt register allocation for floating-point values.
> Fixing even 1) alone looks promising for hot VM helpers like monitor
> enter/exit and resolve_interface_vtable.
> 
> So, I'm going to implement register-based calling convention for IA-32.
> 
> The current proposal is:
>      - make it possible to switch between existing and new conventions
> 	for investigation and tuning purposes
>      - implement 2 calling conventions:
> 	1. the well-known standard fastcall (first 2 params in ECX+EDX, the
> 	rest on the stack)
> 	2. a DRLVM-specific convention, which uses ECX, EDX (and maybe
> 	EAX) for integer parameter passing, uses XMMs for
> 	floating-point parameters, and provides callee-saved XMMs.
> 	
> #1 may be used to call internal C-based helpers. It may also be used
> to call VM helpers where callee-saved XMM regs would add unnecessary
> overhead to the helper itself. The example I can think of is the
> resolve_interface helper: preserving XMMs there looks like overkill.
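
To make the difference between the two proposed conventions concrete, here is a toy model of where each parameter would land. It is only a sketch under assumptions: it ignores EAX, it assumes four XMM registers (XMM0..XMM3) are available for floating-point parameters in convention #2 (the proposal does not give a number), and the class and method names are made up for illustration. It generates no code and is not how the JIT represents conventions.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: it maps parameter kinds to locations and generates no code.
class CallingConventionSketch {
    enum Kind { INT, FP }

    // Walk the parameters left to right, handing out registers of the
    // matching class until they run out; everything else goes on the stack.
    static List<String> assign(Kind[] params, String[] intRegs, String[] fpRegs) {
        List<String> locations = new ArrayList<>();
        int nextInt = 0;
        int nextFp = 0;
        for (Kind kind : params) {
            if (kind == Kind.INT && nextInt < intRegs.length) {
                locations.add(intRegs[nextInt++]);
            } else if (kind == Kind.FP && nextFp < fpRegs.length) {
                locations.add(fpRegs[nextFp++]);
            } else {
                locations.add("stack");
            }
        }
        return locations;
    }

    // Convention #1: first 2 integer params in ECX and EDX, the rest on the stack.
    static List<String> fastcall(Kind... params) {
        return assign(params, new String[] {"ECX", "EDX"}, new String[0]);
    }

    // Convention #2 (hypothetical details): ECX and EDX for integers,
    // XMM0..XMM3 for floating-point values. The XMM count is an assumption.
    static List<String> drlvmSketch(Kind... params) {
        return assign(params, new String[] {"ECX", "EDX"},
                new String[] {"XMM0", "XMM1", "XMM2", "XMM3"});
    }

    public static void main(String[] args) {
        Kind[] signature = {Kind.INT, Kind.FP, Kind.INT, Kind.INT, Kind.FP};
        System.out.println("fastcall:  " + fastcall(signature));
        System.out.println("sketch #2: " + drlvmSketch(signature));
    }
}
```

For a signature like (int, double, int, int, double), the model puts both floating-point parameters on the stack under #1 and in XMM0 and XMM1 under #2, which is the case that matters for FP-heavy managed code.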

Alex, is there some mechanism to annotate helpers with calling
conventions that you would prefer? Or are you going to hardcode

> #2 will help to speed up managed code that is call-intensive and (I hope)
> FP-intensive, together with register allocator tuning.

I would REALLY love to see it implemented!! It is a long-awaited
performance feature. FP performance of DRLVM is poor compared to
HotSpot, and the most probable reason for that is problem 2 above.
A microbenchmark would be great to have. I would also be happy to see
"the whole design proposal" here on the mailing list. Is it possible?

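As a starting point for the microbenchmark mentioned above, something along these lines might do. This is a hypothetical sketch, not an existing Harmony test: the class name, the axpy method and the iteration count are all made up, and it assumes the small method is actually called rather than inlined, which a JIT is free to do.

```java
// Hypothetical microbenchmark sketch; names and iteration counts are illustrative.
class FpCallBench {
    // Small FP method: three double parameters in, one double result out.
    static double axpy(double a, double x, double y) {
        return a * x + y;
    }

    // Call axpy once per iteration and accumulate the results so the
    // work cannot be discarded as dead code.
    static double run(int iterations) {
        double sum = 0.0;
        for (int i = 0; i < iterations; i++) {
            sum += axpy(0.5, i, 1.0);
        }
        return sum;
    }

    public static void main(String[] args) {
        int iterations = 50000000;
        run(iterations); // warm-up pass so the JIT compiles the hot loop
        long start = System.nanoTime();
        double result = run(iterations);
        long elapsedMs = (System.nanoTime() - start) / 1000000;
        System.out.println("result=" + result + " time=" + elapsedMs + " ms");
    }
}
```

If the JIT inlines axpy, the calling convention is not exercised at all, so the numbers are only meaningful when inlining of that method is disabled or the method is made too large to inline.
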
> 
> Any comments are welcome.
> 
> 
> [1]
> http://wiki.apache.org/harmony/JIT_Development_Tasks#head-bffdfbc80108641ca9a8bc29ea871c67fb3b82b9
> 
> 
> -- 
> Thanks,
>    Alex
> 
> 

-- 
Egor Pasko

