[ https://issues.apache.org/jira/browse/HARMONY5826?page=com.atlassian.jira.plugin.system.issuetabpanels:commenttabpanel&focusedCommentId=12638719#action_12638719 ]
Xiaoming Gu commented on HARMONY5826:

I find that the benefit of this patch comes from replacing a 64-bit MUL (three 32-bit MULs and two 32-bit ADDs) with a single 32-bit MUL. At present, the optimization is implemented as a magic replacement, which is not a common way to generate code.
Assume A and B are both 64-bit operands and we are computing A*B. On a 32-bit machine, the MUL operation is usually translated to (High 32 bits of A)*(Low 32 bits of B) + (Low 32 bits of A)*(High 32 bits of B) + (Low 32 bits of A)*(Low 32 bits of B), where the first two products contribute only to the high 32 bits of the result. But when we know the high 32 bits of A and B are both 0, only (Low 32 bits of A)*(Low 32 bits of B) is needed.
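To make the decomposition above concrete, here is a minimal Java sketch. The class and method names are made up for illustration and are not taken from the patch or from DRLVM; it only models the arithmetic, not the code the JIT emits.

```java
// Illustrative sketch only: names are hypothetical, not from the patch.
class Mul64Sketch {
    static final long MASK = 0xffffffffL;

    // Low 64 bits of a*b built from 32-bit halves:
    // three multiplies and two adds, as on a 32-bit machine.
    static long viaHalves(long a, long b) {
        long aLo = a & MASK, aHi = a >>> 32;
        long bLo = b & MASK, bHi = b >>> 32;
        // The cross products only affect the high 32 bits of the result.
        long cross = (aHi * bLo + aLo * bHi) << 32;
        return cross + aLo * bLo;
    }

    // When both high halves are known to be 0, the cross products vanish
    // and a single 32x32 -> 64 multiply remains.
    static long zeroHigh(long a, long b) {
        return (a & MASK) * (b & MASK);
    }

    public static void main(String[] args) {
        long a = 0x89abcdefL, b = 0xfedcba98L; // high halves are 0
        System.out.println(viaHalves(a, b) == zeroHigh(a, b));
        System.out.println(viaHalves(a, b) == a * b);
    }
}
```

Both methods agree with Java's wrapping 64-bit multiplication whenever the operands have been masked with 0xffffffffL, which is the situation in unsignedMultAdd2.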
The HIR and LIR for result = (a & 0xffffffffL) * (b & 0xffffffffL) + (c & 0xffffffffL) + (d & 0xffffffffL) without this patch are listed below.
We could do the optimization in the HIR simplifier or in the LIR peephole. I'm not sure whether changing an int64 operation to an int32 operation would bring overhead on a 64-bit machine, so I think the peephole may be the better place. I find there is no peephole optimization for XOR. If the JIT could determine that the result of an XOR is 0, it could propagate the 0 to the MUL and eliminate the related MUL. The remaining question is whether there is sufficient dataflow analysis in LIR to do the propagation and elimination.
=====HIR=====
I42:ldci8 #4294967295 ) t38:int64
I43:and t37, t38 ) t39:int64
I44:convi8 g23 ) t40:int64
I45:and t40, t38 ) t41:int64
I46:mul t39, t41 ) t42:int64
=====LIR=====
238B02A6 I329: MOV s286[v208(ESP)+t285(100)]:I_32,v19(EAX):I_32
238B02AA I328: MOV t291[v208(ESP)+t290(104)]:I_32,v19(EAX):I_32
238B02AE I327: (ID:s8(EFLGS):U_32) =XOR t206(EDX):I_32,t206(EDX):I_32
238B02B0 I326: MOV s292(EAX):I_32,s286[v208(ESP)+t285(100)]:I_32
238B02B4 I70: (ID:s8(EFLGS):U_32) =MUL s139(EDX):I_32,s292(EAX):I_32,t206(EDX):I_32
238B02B6 I325: MOV s140(EBX):I_32,s292(EAX):I_32
238B02B8 I324: (ID:s8(EFLGS):U_32) =XOR s292(EAX):I_32,s292(EAX):I_32
238B02BA I323: MOV t289(EDI):I_32,v217[v208(ESP)+t216(24)]:I_32
238B02BE I73: (ID:s8(EFLGS):U_32) =MUL s139(EDX):I_32,s292(EAX):I_32,t289(EDI):I_32
238B02C0 I74: (ID:s8(EFLGS):U_32) =ADD s140(EBX):I_32,s292(EAX):I_32
238B02C2 I322: MOV v19(EAX):I_32,t291[v208(ESP)+t290(104)]:I_32
DEADBEEF I75: (AD:s293(EAX):I_32) =CopyPseudoInst/MOV (AU:v19(EAX):I_32)
238B02C6 I76: (ID:s8(EFLGS):U_32) =MUL s139(EDX):I_32,s293(EAX):I_32,t289(EDI):I_32
238B02C8 I321: MOV s286[v208(ESP)+t285(100)]:I_32,s293(EAX):I_32
238B02CC I77: (ID:s8(EFLGS):U_32) =ADD s139(EDX):I_32,s140(EBX):I_32
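The XOR-zero propagation idea can be sketched on a toy instruction list. This is only an illustration under strong assumptions: the Inst class, the two-address form (dst = dst op src), and the simplify method are hypothetical and do not correspond to the Jitrino LIR or its peephole pass, and the sketch ignores EFLAGS and the implicit EDX:EAX result of the real x86 MUL.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: not the Jitrino LIR.
class ZeroPeepholeSketch {
    // Hypothetical two-address instruction: dst = dst op src.
    static final class Inst {
        final String op, dst, src;
        Inst(String op, String dst, String src) {
            this.op = op; this.dst = dst; this.src = src;
        }
    }

    // Single forward pass over one basic block, tracking registers known to be 0.
    static List<Inst> simplify(List<Inst> block) {
        Set<String> zero = new HashSet<>();
        List<Inst> out = new ArrayList<>();
        for (Inst i : block) {
            switch (i.op) {
                case "XOR":
                    // XOR r, r always yields 0; any other XOR is treated as unknown.
                    if (i.dst.equals(i.src)) zero.add(i.dst); else zero.remove(i.dst);
                    out.add(i);
                    break;
                case "MOV":
                    // A copy carries the known-zero fact along with the value.
                    if (zero.contains(i.src)) zero.add(i.dst); else zero.remove(i.dst);
                    out.add(i);
                    break;
                case "MUL":
                    if (zero.contains(i.dst) || zero.contains(i.src)) {
                        // The product is 0, so the multiply is replaced by a cheap zeroing.
                        out.add(new Inst("XOR", i.dst, i.dst));
                        zero.add(i.dst);
                    } else {
                        zero.remove(i.dst);
                        out.add(i);
                    }
                    break;
                default:
                    // Any other write makes the destination unknown again.
                    zero.remove(i.dst);
                    out.add(i);
            }
        }
        return out;
    }
}
```

A pass like this only works within a single basic block; whether the existing LIR analysis can support the same propagation across the real instruction forms is the open question raised above.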
Any comments? Thanks.
> [drlvm][jit][opt][performance] Magic for java.math.Multiplication.unsignedMultAdd2
> 
>
> Key: HARMONY5826
> URL: https://issues.apache.org/jira/browse/HARMONY5826
> Project: Harmony
> Issue Type: Improvement
> Components: DRLVM
> Reporter: Aleksey Shipilev
> Attachments: H5826V2.patch, vmjitmathunsignedMultAdd2magicrc1.patch
>
>
> Implementation of magic for java.math.Multiplication.unsignedMultAdd2, extracted in HARMONY5825.
