harmony-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mikhail Fursov (JIRA)" <j...@apache.org>
Subject [jira] Closed: (HARMONY-2171) [drlvm][jit] Loop Unrolling for private method java.util.TreeMap.successor().
Date Tue, 17 Jul 2007 11:08:05 GMT

     [ https://issues.apache.org/jira/browse/HARMONY-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Mikhail Fursov closed HARMONY-2171.
-----------------------------------

    Resolution: Invalid

Sergey, I'm closing this issue as Invalid and propose you creating new "feature request" for
memory prefetching with details.

> [drlvm][jit] Loop Unrolling for private method java.util.TreeMap.successor().
> -----------------------------------------------------------------------------
>
>                 Key: HARMONY-2171
>                 URL: https://issues.apache.org/jira/browse/HARMONY-2171
>             Project: Harmony
>          Issue Type: Improvement
>          Components: DRLVM
>         Environment: IA-32
>            Reporter: Sergey Kuksenko
>            Assignee: Mikhail Fursov
>            Priority: Minor
>
> Method  java.util.TreeMap::successor is hot method for iterations over TreeMap (and some
others operations).
> Generated code for the method may be improved.
> Java code of the method:
>     static TreeMapEntry successor(TreeMapEntry x) {
>         if (x.right != null)
>             return minimum(x.right);
>         TreeMapEntry y = x.parent;
>         while (y != null && x == y.right) {
>             x = y;
>             y = y.parent;
>         }
>         return y;
>     }
> Also method "minimum" is usualy inlined. minimum's code:
>     static TreeMapEntry minimum(TreeMapEntry x) {
>         while (x.left != null)
>             x = x.left;
>         return x;
>     }
> The generated asm code the follwoing:
> line address    code                    
> 1       B7EE60: push        ebx
> 2       B7EE61: push        ebp
> 3       B7EE62: mov         eax,dword ptr [esp+0Ch]
> 4       B7EE66: mov         ebp,eax
> 5       B7EE68: mov         ebx,dword ptr [eax+14h]
> 6       B7EE6B: test        ebx,ebx
> 7       B7EE6D: je          00B7EEA8
> 8       B7EE73: mov         ebp,dword ptr fs:[00000014h]
> 9       B7EE7A: mov         eax,ebx
> 10      B7EE7C: mov         ebx,dword ptr [eax+18h]
> 11      B7EE7F: test        ebx,ebx
> 12      B7EE81: je          00B7EEF2
> 13      B7EE87: mov         ebp,dword ptr fs:[00000014h]
> 14      B7EE8E: cmp         dword ptr [ebp+00000270h],0
> 15      B7EE98: je          00B7EE7A
> 16      B7EE9E: call        00A96850
> 17      B7EEA3: jmp         00B7EE7A
> 18      B7EEA8: mov         ebx,dword ptr [eax+1Ch]
> 19      B7EEAB: mov         eax,dword ptr fs:[00000014h]
> 20      B7EEB2: mov         eax,ebx
> 21      B7EEB4: test        eax,eax
> 22      B7EEB6: je          00B7EEED
> 23      B7EEBC: cmp         ebp,dword ptr [eax+14h]
> 24      B7EEBF: jne         00B7EEED
> 25      B7EEC5: mov         ebp,eax
> 26      B7EEC7: mov         eax,dword ptr [eax+1Ch]
> 27      B7EECA: mov         ebx,eax
> 28      B7EECC: mov         eax,dword ptr fs:[00000014h]
> 29      B7EED3: cmp         dword ptr [eax+00000270h],0
> 30      B7EEDD: je          00B7EEB2
> 31      B7EEE3: call        00A96850
> 32      B7EEE8: jmp         00B7EEB2
> 33      B7EEED: pop         ebp
> 34      B7EEEE: pop         ebx
> 35      B7EEEF: ret         4
> 36      B7EEF2: pop         ebp
> 37      B7EEF3: pop         ebx
> 38      B7EEF4: ret         4
> -------------------
> Unrolling of small cycles by factor 4 or 8 will decrease BBP impact.
> Unrolling this cycle by factor 2 allows making more effective register usage.
> The original loop in lines 9-17:
> loop:   mov     eax,ebx
>         mov     ebx,dword ptr [eax+18h]
>         test    ebx,ebx
>         je      exit
>         mov     ebp,dword ptr fs:[00000014h]
>         cmp     dword ptr [ebp+00000270h],0
>         je      loop
>         call    suspend
>         jmp     loop
>         ....
> exit:                             ; result already in eax.
>         pop ebp
>         pop ebx
>         ret 4
> At this cycle variable x is located at eax register. x.left is loaded into ebx
> register and for each iteration we should move ebx to eax. If this cycle will
> be unroll at least by factor 2 we may change register assignment. For the first
> iteration store x in ebx and load x.left to eax and for second iteration store
> x in eax and load x.left to ebx. After such transformation instruction mov eax,
> ebx at each cycle became useless. 
> Let's move loading TLS address out of cycle and unroll this cycle by 4.
>         mov     ebp,dword ptr fs:[00000014h]
>         ; at this point ebx contains x
> loop:
>         mov     eax,dword ptr [ebx+18h]     ;     eax = x.left
>         test    eax,eax
>         je      exit1
>         mov     ebx,dword ptr [eax+18h]     ;     ebx = x.left.left
>         test    ebx,ebx
>         je      exit
>         mov     eax,dword ptr [ebx+18h]     ;     eax = x.left.left.left
>         test    eax,eax
>         je      exit1
>         mov     ebx,dword ptr [eax+18h]
>         test    ebx,ebx
>         je      exit
>         cmp     dword ptr [ebp+00000270h],0
>         je      loop
>         call    suspend
>         jmp     loop
>         ....
> exit1:
>         mov eax, ebx     ; here we should move result from ebx to eax.
> exit:                               ; here the result already eax.
>         pop ebp
>         pop ebx
>         ret 4
> After such unrolling will have 3.5 instructions per iteration instead of 7
> instructions per iteration. 
> Such unrolling will be very useful for cycles like:
> while (x!=null) {
>         ...
>         x = x.next;
> }

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message