harmony-dev mailing list archives

From Naveen Neelakantam <neela...@uiuc.edu>
Subject Re: [drlvm][jit][opt] HARMONY-3243 It's possible to fill array with constant much faster
Date Thu, 15 Mar 2007 17:06:08 GMT

On Mar 15, 2007, at 11:58 AM, Egor Pasko wrote:

> JIT guys,
>
> ..on the HARMONY-3243 patch I would like an open discussion in dev@
> rather than a limited one in JIRA.
>
> first of all, the latest faf_em64t.patch crashes for me with the
> message:
>
> java: /home/xxx/svn/1/trunk/working_vm/vm/jitrino/src/shared/LoopTree.h:85:
> bool Jitrino::LoopTree::hasLoops() const: Assertion `isValid()' failed.
> SIGABRT in VM code.
> Stack trace:
> addr2line: '[vdso]': No such file
> (hangs here)
>
> next, with all due respect to Nikolay Sidelnikov for his efforts, I
> think this idea needs a more careful analysis and probably a
> reimplementation.
>
> The story is: 2 new optimization passes are introduced, one in the
> High-level Optimizer and one in the Code Generator:
> pass.1 recognizes a simple loop initialization with a simple pattern,
>        substitutes it with a JIT helper call
> pass.2 takes the corresponding call and substitutes it with the
>        Low-Level IR that it considers to be the best code
> this should hypothetically improve one simple code pattern (that is
> probably widely used?), i.e.: for (int i=0;i<I;i++){A[i]=X;}
>
> What I figured out looking at the patch:
>
> * [pass.2] does not seem to throw any ArrayIndexOutOfBoundsException
>            (AIOOBE)
>
> * [pass.2] does not have any useful tuning parameters such as number
>            of unrolls per loop, thus the scheme eats potential benefit
>            from loop unrolling and does not give any tuning back
>
> * [pass.1] detects such a rare pattern that I doubt it would benefit a
>            user (but obviously will benefit a you-know-which-benchmark
>            runner)
>
> * [pass.1] has a lot of new code that introduces potential instability
>            (what if the pattern is detected incorrectly? the code does
>            not read easily), but does not contain a single unit test
>            or the like. Together with the AIOOBE issue, stability
>            becomes a real question.
>
> * back branch polling is not performed (which is probably good for
>   performance, but I would rather have a tuning option)
>
> What I can say more is that a good "ABCD" optimization complemented
> with a "loop versioning" optimization will give more readable, more
> stable code, AND will give a better performance gain (loop unrolling
> still applies too). Setting aside the fact that the overall design
> will be more straightforward (having no interdependent passes, extra
> helpers, etc)

As for the ABCD end of things, you already know that I have completed
your reimplementation of ABCD. We might need to make the pass run
faster (for example, by focusing on hot bounds checks), but otherwise
it's finished.
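
To make the "bounds check elimination plus loop versioning" idea concrete,
here is a rough, hand-written Java-level picture of what the transformation
amounts to for the fill loop in question. It is only a sketch under my own
assumptions: it is not Jitrino code and not taken from the HARMONY-3243
patch, the class and method names are made up for illustration, and a real
JIT would do this on its IR rather than on source. Arrays.fill merely stands
in for a check-free loop body.

```java
import java.util.Arrays;

// Hypothetical illustration only; all names here are invented.
final class FillVersioning {

    // The loop shape under discussion: every store a[i] is bounds-checked.
    static void fillChecked(int[] a, int n, int x) {
        for (int i = 0; i < n; i++) {
            a[i] = x;
        }
    }

    // What loop versioning conceptually produces: one guard in front of
    // the loop selects between a check-free version and the original one.
    static void fillVersioned(int[] a, int n, int x) {
        if (n <= 0) {
            return; // the original loop body would never execute
        }
        if (n <= a.length) {
            // Fast version: the guard proves 0 <= i < a.length for every
            // iteration, so no per-store check is needed and the loop is
            // free to be unrolled.
            Arrays.fill(a, 0, n, x);
        } else {
            // Slow version: the original loop, kept so that the elements
            // in range are still written before the
            // ArrayIndexOutOfBoundsException is thrown.
            for (int i = 0; i < n; i++) {
                a[i] = x;
            }
        }
    }
}
```

The point of keeping the slow version is the AIOOBE concern above: when the
trip count exceeds the array length, the in-range elements are still stored
and the exception is still raised, just as in the unoptimized loop.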

Geir needs to give me the go-ahead and I will upload a patch.

Naveen



