harmony-dev mailing list archives

From Egor Pasko <egor.pa...@gmail.com>
Subject [drlvm][jit][opt] HARMONY-3243 It's possible to fill array with constant much faster
Date Thu, 15 Mar 2007 16:58:54 GMT
JIT guys,

On the HARMONY-3243 patch, I would like an open discussion on dev@
rather than a limited one in JIRA.

First of all, the latest faf_em64t.patch crashes for me with the message:

java: /home/xxx/svn/1/trunk/working_vm/vm/jitrino/src/shared/LoopTree.h:85: bool Jitrino::LoopTree::hasLoops() const: Assertion `isValid()' failed.
SIGABRT in VM code.
Stack trace:
addr2line: '[vdso]': No such file
(hangs here)

Next, with all due respect to Nikolay Sidelnikov for his efforts, I
think this idea needs a more careful analysis and probably a
reimplementation.

The story is: 2 new optimization passes are introduced, one in the
High-level Optimizer and one in the Code Generator:
pass.1 recognizes a simple array initialization loop by a simple
       pattern and substitutes it with a JIT helper call
pass.2 takes the corresponding call and substitutes it with the
       Low-Level IR that it thinks is the best code

This should hypothetically improve one simple code pattern (that is
probably widely used?), i.e.: for (int i=0;i<I;i++){A[i]=X;}
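
For reference, a minimal Java sketch of the loop shape in question, next
to the library call that gives the same result when the range is in
bounds. The class and method names here are made up for illustration and
are not taken from the patch.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names come from the patch.
final class FillPattern {
    // The loop shape discussed above: one loop-invariant value stored
    // into consecutive elements, starting at index 0.
    static void fillWithLoop(int[] a, int n, int x) {
        for (int i = 0; i < n; i++) {
            a[i] = x;
        }
    }

    // Same result for 0 <= n <= a.length. Outside that range the two
    // methods behave differently (Arrays.fill validates the range up
    // front and writes nothing), so they are not interchangeable.
    static void fillWithLibrary(int[] a, int n, int x) {
        Arrays.fill(a, 0, n, x);
    }

    public static void main(String[] args) {
        int[] p = new int[8];
        int[] q = new int[8];
        fillWithLoop(p, 5, 7);
        fillWithLibrary(q, 5, 7);
        System.out.println(Arrays.equals(p, q)); // prints true
    }
}
```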

What I figured out looking at the patch:

* [pass.2] does not seem to throw any ArrayIndexOutOfBoundsException (AIOOBE)

* [pass.2] does not have any useful tuning parameters such as number
           of unrolls per loop, thus the scheme eats potential benefit
           from loop unrolling and does not give any tuning back

* [pass.1] detects such a rare pattern that I doubt it would benefit a
           user (but obviously will benefit a you-know-which-benchmark
           runner)

* [pass.1] has a lot of new code that introduces potential instability
           (the pattern could be detected incorrectly, and the code
           does not read easily), but does not contain a single unit
           test or the like. Together with the AIOOBE issue, stability
           becomes a real question.

* back branch polling is not performed (which is probably good for
  performance, but I would rather have a tuning option)
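
To make the AIOOBE point concrete, here is a small Java sketch of what
the language semantics require when the trip count exceeds the array
length: the in-bounds stores happen first, then the exception is thrown.
Any replacement for the loop has to keep that behavior. The names are
hypothetical and this is not code from the patch.

```java
// Hypothetical illustration only; none of these names come from the patch.
final class PartialFill {
    // Runs the plain fill loop and reports whether it ended with an
    // ArrayIndexOutOfBoundsException. The array keeps whatever was
    // stored before the exception.
    static boolean fillAndReportThrow(int[] a, int n, int x) {
        try {
            for (int i = 0; i < n; i++) {
                a[i] = x; // bounds-checked store
            }
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] a = new int[4];
        boolean thrown = fillAndReportThrow(a, 6, 9);
        // All four elements were written before the store at index 4 failed.
        System.out.println(thrown + " " + a[0] + " " + a[3]); // prints true 9 9
    }
}
```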

What I can say more is that a good "ABCD" optimization complemented
with a "loop versioning" optimization will make for more readable, more
stable code, AND will give a better performance gain (loop unrolling
remains applicable too). Not to mention that the overall design will be
more straightforward (no interdependent passes, extra helpers, etc.)
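
A rough source-level picture of what I mean by loop versioning, written
by hand in Java. A JIT would do this on its IR, not on source; the names
are hypothetical and the sketch assumes a non-null array. One guard is
evaluated before the loop: if it proves every index is in bounds, the
fast copy runs and its per-iteration checks can be dropped (and it can
be unrolled as usual); otherwise the original checked loop runs and
throws at exactly the same point as before.

```java
// Hypothetical illustration only; a real compiler performs this on IR.
final class VersionedFill {
    static void fill(int[] a, int n, int x) {
        // Assumes a != null; a compiler would also have to keep the
        // original null-check behavior for the n <= 0 case.
        if (n <= a.length) {
            // Fast version: every i in [0, n) is known to be in bounds,
            // so bounds checks are provably redundant here.
            for (int i = 0; i < n; i++) {
                a[i] = x;
            }
        } else {
            // Slow version: the original loop with its checks intact,
            // so the stores up to a.length happen and then the
            // ArrayIndexOutOfBoundsException is thrown.
            for (int i = 0; i < n; i++) {
                a[i] = x;
            }
        }
    }

    public static void main(String[] args) {
        int[] a = new int[5];
        fill(a, 5, 3);
        System.out.println(a[0] + " " + a[4]); // prints 3 3
    }
}
```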

So I vote for focusing on ABCD plus "loop versioning" and leaving
specific benchmark-oriented tricks (complicating our design) alone.

An experienced hacker would say that all compiler research is a bunch
of hacks that influence each other in unexpected ways. Well, maybe,
but I do not like this particular hack (with all respect to Nikolay
for his efforts).

-- 
Egor Pasko

