harmony-dev mailing list archives

From Alex Astapchuk <alex.astapc...@gmail.com>
Subject Re: [drlvm][jit][ia-32]register-based fast calling convention
Date Tue, 21 Nov 2006 09:33:00 GMT
Egor Pasko wrote:
> On the 0x224 day of Apache Harmony Alex Astapchuk wrote:
>> Hi Egor,
>>
>> Thanks for your reply. Please, find my answers inlined.
>>
>> Egor Pasko wrote:
>>> On the 0x222 day of Apache Harmony Alex Astapchuk wrote:
>>>> Hi all,
>>>>
>>>> Among other things listed on the JIT Dev tasks, there is a need for
>>>> calling convention (CC) fix-up for IA-32 [1].
>>>>
>>>> Current problems are:
>>>>
>>>> 1. The calling convention(s) used are stack-based - this adds a memory
>>>> access overhead on calls.
>>>> 2. The convention currently used for managed code neither allows passing
>>>> floating-point values in XMM registers, nor provides callee-saved XMM
>>>> registers.
>>>> 3. The FPU stack is used to return float/double values.
>>>>
>>>>
>>>> Both 2) and 3) hurt register allocation for floating-point values.
>>>> Fixing even 1) alone looks promising for hot VM helpers like monitor
>>>> enter/exit and resolve_interface_vtable.
>>>>
>>>> So, I'm going to implement a register-based calling convention for IA-32.
>>>>
>>>> The current proposal is:
>>>>      - make it possible to switch between existing and new conventions
>>>> 	for investigation and tuning purposes
>>>>      - implement 2 calling conventions:
>>>> 	1. the well-known standard fastcall (first 2 params in ECX+EDX, the
>>>> 	rest on the stack)
>>>> 	2. a DRLVM-specific convention, which involves ECX, EDX (and maybe
>>>> 	EAX) for integer parameter passing, and also uses XMMs for
>>>> 	floating-point parameters and provides callee-saved XMMs.
>>>> 	
>>>> #1 may be used to call internal C-based helpers. It may also be used
>>>> to call VM helpers where callee-saved XMM regs may add unnecessary
>>>> overhead to the helper itself. The example I can think of is the
>>>> resolve_interface helper - preserving XMMs there looks like overkill.
>>> Alex, is there some mechanism to annotate helpers with calling
>>> conventions that you would prefer? Or are you going to hardcode
>> Agh... Good question. And I don't have the right answer now.
>>
>> I'm going to make the switch between old and new conventions
>> controllable from the command line, but that's almost all I can do in
>> the current environment.
> 
> That would be good. Need to keep versions of compiled native calls for
> various calling conventions, heh?
> 
>> The helpers' info, like signatures and calling conventions used,
>> has been a long-running headache.
> 
> :)
> 
>> What I'm going to implement is quite orthogonal to how the info may be
>> passed between VM and JIT. I'm only going to support the possibility
>> of calling convention usage.
> 
> ....yes, this is orthogonal, but needs TBD for configurability and
> completeness of the design solution. We can return to this as soon as
> your performance experiments show up.
> 
>> The helpers' info may be related to Mikhail's work on helper inlining.
>> I recall some discussions about Java-based annotations that may be
>> used to describe helpers (not only for inlining, but in general),
>> including convention used, the library/module location, etc.
> 
> again, we should collect the approaches and decide
> 
>>>> #2 will help to speed up managed code, both call-intensive and (I hope)
>>>> FP-intensive, together with register allocator tuning.
>>> I would REALLY love to see it implemented!! It is a long-awaited
>>> performance feature. FP performance of DRLVM is poor compared to
>>> HotSpot, and the most probable reason for that is problem-2 above.
>>> A microbenchmark would be great to have. I would be also happy to see
>>> "the whole design proposal" here in the mailing list. Is it possible?
>> Sure, I'll do the micro benchmark.
>>
>> I don't have a design proposal, since I'm not going to change the design.
>> I'm only extending existing functionality a bit.
> 
> By "design proposal" I mean something you should not be afraid of. The
> list of tuning parameters and that kind of stuff.

Well, the tuning parameters I was thinking about could be Java
properties. Something like

; whether managed code uses 'fast' calling convention or not
vm.managed_cc=drlfast|default
; whether to use the FPU stack to return floating-point values from methods
vm.fastcc.use_fpu_ret=true|false
; the remaining names are fairly self-descriptive
vm.fastcc.num_xmm_args_regs=<a number of SSE registers for args>
vm.fastcc.num_xmm_calleesave=<a number of SSE registers used as callee-save>
vm.fastcc.num_gp_args_regs=<0-3 or symbolic names like eax,ecx,edx>

with
	use_fpu_ret=true
	num_xmm_args_regs=0
	num_xmm_calleesave=0
	num_gp_args_regs=0

'drlfast' turns into the currently used default convention for IA-32.
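
To illustrate how these knobs could interact, here is a minimal sketch in Python (not DRLVM code). The class and function names are hypothetical. So are the register order (ECX, EDX, then EAX) and the rule that integer and floating-point arguments draw from separate register pools. Only the numeric form of num_gp_args_regs is handled, and the callee-save count is parsed but otherwise unused.

```python
from dataclasses import dataclass

# Hypothetical register pools; the ordering is an assumption for illustration.
GP_ARG_REGS = ("ecx", "edx", "eax")
XMM_REGS = tuple(f"xmm{i}" for i in range(8))  # IA-32 SSE exposes xmm0..xmm7


@dataclass(frozen=True)
class FastCC:
    """Tunable parameters of the proposed 'drlfast' convention."""
    use_fpu_ret: bool = True
    num_xmm_args_regs: int = 0
    num_xmm_calleesave: int = 0
    num_gp_args_regs: int = 0


def parse_fastcc(props):
    """Build a FastCC from a dict of 'vm.fastcc.*' property strings."""
    def as_int(key, limit):
        value = int(props.get("vm.fastcc." + key, "0"))
        if not 0 <= value <= limit:
            raise ValueError(f"{key} must be in 0..{limit}, got {value}")
        return value

    return FastCC(
        use_fpu_ret=props.get("vm.fastcc.use_fpu_ret", "true") == "true",
        num_xmm_args_regs=as_int("num_xmm_args_regs", len(XMM_REGS)),
        num_xmm_calleesave=as_int("num_xmm_calleesave", len(XMM_REGS)),
        num_gp_args_regs=as_int("num_gp_args_regs", len(GP_ARG_REGS)),
    )


def assign_args(cc, arg_kinds):
    """Map each argument kind ('int' or 'fp') to a register name or 'stack'."""
    gp = list(GP_ARG_REGS[:cc.num_gp_args_regs])
    xmm = list(XMM_REGS[:cc.num_xmm_args_regs])
    locations = []
    for kind in arg_kinds:
        pool = xmm if kind == "fp" else gp
        # Take the next free register of the right class, else spill to stack.
        locations.append(pool.pop(0) if pool else "stack")
    return locations
```

With an empty property set, every argument lands on the stack and use_fpu_ret stays true, which is the degenerate case described above. Setting num_gp_args_regs=2 and num_xmm_args_regs=1 would place the first two integer arguments in ECX and EDX and the first floating-point argument in XMM0.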

-- 
Thanks,
   Alex

	
>> Mostly, the requirements I'm going to meet are described in my answer
>> to Rana - there are things that will be tunable there.
> 



