harmony-dev mailing list archives

From Egor Pasko <egor.pa...@gmail.com>
Subject Re: [drlvm][jit][opt] HARMONY-3243 It's possible to fill array with constant much faster
Date Thu, 15 Mar 2007 17:19:22 GMT
On the 0x29A day of Apache Harmony Naveen Neelakantam wrote:
> On Mar 15, 2007, at 11:58 AM, Egor Pasko wrote:
> 
> > JIT guys,
> >
> > ..on the HARMONY-3243 patch I would like an open discussion in dev@
> > rather than a limited one in JIRA.
> >
> > first of all, the latest faf_em64t.patch crashes for me with the
> > message:
> >
> > java: /home/xxx/svn/1/trunk/working_vm/vm/jitrino/src/shared/
> > LoopTree.h:85: bool Jitrino::LoopTree::hasLoops() const: Assertion
> > `isValid()' failed.
> > SIGABRT in VM code.
> > Stack trace:
> > addr2line: '[vdso]': No such file
> > (hangs here)
> >
> > next, with all great respect to Nikolay Sidelnikov for his efforts, I
> > think this idea needs a more careful analysis and probably
> > reimplementation.
> >
> > The story is: 2 new optimization passes are introduced, one in the
> > High-level Optimizer and one in the Code Generator:
> > pass.1 recognizes a simple loop initialization with a simple pattern,
> >        substitutes it with a JIT helper call
> > pass.2 takes the corresponding call and substitutes it with the
> >        Low-Level IR that it considers the best code
> >
> > this should hypothetically improve one simple code pattern (that is
> > probably widely used?), i.e.: for (int i=0;i<I;i++){A[i]=X;}
> >
> > What I figured out looking at the patch:
> >
> > * [pass.2] does not seem to throw any ArrayIndexOutOfBoundsException
> >            (AIOOBE)
> >
> > * [pass.2] does not have any useful tuning parameters such as number
> >            of unrolls per loop, thus the scheme eats potential benefit
> >            from loop unrolling and does not give any tuning back
> >
> > * [pass.1] detects such a rare pattern that I doubt it would benefit a
> >            user (but obviously will benefit a you-know-which-benchmark
> >            runner)
> >
> > * [pass.1] has a lot of new code that introduces potential instability
> >            (the pattern may be detected improperly, and the code does
> >            not read easily), but does not contain a single unit test
> >            or the like. Together with the AIOOBE issue, stability
> >            becomes a real question.
> >
> > * back branch polling is not performed (which is probably good for
> >   performance, but I would rather have a tuning option)
> >
> > What I can say more is that a good "ABCD" optimization complemented
> > with a "loop versioning" optimization will make more readable, more
> > stable code, AND will give a better performance gain (loop unrolling
> > stays active too). Not to mention that the overall design will be
> > more straightforward (having no interdependent passes, extra
> > helpers, etc.)
> 
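
For illustration only (this is not code from the patch or from Jitrino): a
hand-written Java analogue of what "loop versioning" would do with the fill
pattern quoted above. The class and method names are hypothetical, and
Arrays.fill merely stands in for whatever check-free code the JIT would emit.
One guard selects a fast version when every index is provably in range;
otherwise the original checked loop runs, so the AIOOBE and the stores made
before it are preserved.

```java
import java.util.Arrays;

// Hypothetical sketch: names are illustrative, not taken from HARMONY-3243.
class FillVersioning {

    // The original pattern: every store is bounds-checked, so a too-large n
    // throws ArrayIndexOutOfBoundsException after a.length stores are done.
    static void fillChecked(int[] a, int n, int x) {
        for (int i = 0; i < n; i++) {
            a[i] = x;
        }
    }

    // Versioned form: a single guard replaces the per-iteration checks.
    static void fillVersioned(int[] a, int n, int x) {
        if (a != null && n <= a.length) {
            // Fast version: all indices in [0, n) are known to be in range.
            // A JIT could unroll this or drop the checks; here Arrays.fill
            // stands in for that check-free code.
            if (n > 0) {
                Arrays.fill(a, 0, n, x);
            }
        } else {
            // Slow version: keep the original loop so the exception (NPE or
            // AIOOBE) is thrown at the same point with the same side effects.
            fillChecked(a, n, x);
        }
    }

    public static void main(String[] args) {
        int[] a = new int[4];
        fillVersioned(a, 3, 7);
        System.out.println(Arrays.toString(a)); // prints [7, 7, 7, 0]
    }
}
```

The point of the sketch is that the exceptional case needs no special helper:
it simply falls back to the unmodified loop.
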
> As for the ABCD end of things, you already know that I have completed
> your reimplementation of ABCD.  We might need to make the pass run
> faster (for example, by focusing on hot bounds checks), 

this is easy, BTW :)

> but otherwise it's finished.

thanks! And I really love this fact.

-- 
Egor Pasko

