harmony-dev mailing list archives

From Egor Pasko <egor.pa...@gmail.com>
Subject Re: [drlvm][jit][opt] HARMONY-3243 It's possible to fill array with constant much faster
Date Fri, 16 Mar 2007 12:45:44 GMT
On the 0x29A day of Apache Harmony Mikhail Fursov wrote:
> On 15 Mar 2007 19:58:54 +0300, Egor Pasko <egor.pasko@gmail.com> wrote:
> >
> > this should hypothetically improve one simple code pattern (that is
> > probably widely used?), i.e.: for (int i=0;i<I;i++){A[i]=X;}
> >
> > What I figured out looking at the patch:
> >
> > * [pass.2] does not seem to throw any ArrayIndexOutOfBoundsException
> >            (AIOOBE)
> >
> > * [pass.2] does not have any useful tuning parameters such as number
> >            of unrolls per loop, thus the scheme eats potential benefit
> >            from loop unrolling and does not give any tuning back
> AFAIK this optimization is much more efficient than loop unrolling on
> the microtest you mentioned.

I believe loop unrolling can do more.
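
For readers following the comparison, here is a minimal sketch of the fill pattern and a hand-unrolled variant of it. The class name FillSketch, the method names, and the unroll factor of 4 are hypothetical and chosen only for illustration; the patch itself works inside the JIT, not at the Java source level.

```java
import java.util.Arrays;

final class FillSketch {
    // The pattern from the thread: one store (and one bounds check) per iteration.
    static void fillSimple(int[] a, int x) {
        for (int i = 0; i < a.length; i++) {
            a[i] = x;
        }
    }

    // Hand-unrolled by an assumed factor of 4, followed by a remainder loop.
    static void fillUnrolled(int[] a, int x) {
        int i = 0;
        int limit = a.length - 3; // last index at which four stores still fit
        for (; i < limit; i += 4) {
            a[i] = x;
            a[i + 1] = x;
            a[i + 2] = x;
            a[i + 3] = x;
        }
        for (; i < a.length; i++) { // leftover 0..3 elements
            a[i] = x;
        }
    }

    public static void main(String[] args) {
        int[] p = new int[10];
        int[] q = new int[10];
        fillSimple(p, 7);
        fillUnrolled(q, 7);
        // Both variants should produce the same array contents.
        System.out.println(Arrays.equals(p, q));
    }
}
```

Which variant wins on a given machine is exactly what the thread is arguing about, so no timing claim is made here.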

> > * [pass.1] detects such a rare pattern that I doubt it would benefit a
> >            user (but obviously will benefit a you-know-which-benchmark
> >            runner)
> Generic Arrays.fill-like methods could be optimized this way.
> + IMO even a few percent in widely known
> benchmarks is a reason to implement even more complicated optimizations.
> > * [pass.1] has a lot of new code that introduces potential instability
> >            (if the pattern is not detected properly, the code does
> >            not read easily), but does not contain a single unit test
> >            or the like. Together with the AIOOBE issue, stability
> >            becomes a real question.
> All known bugs can be fixed. 

Yes, and the fix might take so much time that implementing
versioning+abcd would be faster. No one knows.

I thought stability had a higher priority than error-prone hacks
that gain 0.5%.

On the other hand, all JIT technology is just an error-prone way of
optimizing an interpreter :)

> If AIOOBE is a real problem here - it looks to be easily fixed
> too. The question is whether the optimization gives any benefit or not.

Well, I agree. So, let's write tests for this first and then let the
patch go in.
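
A minimal sketch of the kind of test meant here, assuming the property to protect is plain Java semantics: when the loop bound exceeds the array length, every valid element must be written and then an ArrayIndexOutOfBoundsException must be thrown. The class and method names are hypothetical, and a real regression test would follow the project's own test harness rather than this standalone form.

```java
final class FillBoundsCheck {
    // Loop under test: the bound n is independent of the array length,
    // so an optimized version must still honour the bounds check.
    static void fill(int[] a, int n, int x) {
        for (int i = 0; i < n; i++) {
            a[i] = x;
        }
    }

    // Returns true only if fill() throws AIOOBE *after* writing every
    // valid element. x should be non-zero so written elements are visible.
    static boolean throwsAfterPartialFill(int length, int n, int x) {
        int[] a = new int[length];
        try {
            fill(a, n, x);
            return false; // no exception at all
        } catch (ArrayIndexOutOfBoundsException expected) {
            for (int v : a) {
                if (v != x) {
                    return false; // exception came too early
                }
            }
            return true;
        }
    }

    public static void main(String[] args) {
        // Bound larger than the array: exception expected after a full fill.
        System.out.println(throwsAfterPartialFill(4, 8, 5));
        // Bound within the array: no exception expected.
        System.out.println(throwsAfterPartialFill(8, 4, 5));
    }
}
```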

> We can move it into a separate HLO pass (and a separate
> file) and drop it from the codebase if it's not needed in the future.

I love this idea

> > * back branch polling is not performed (which is probably good for
> >   performance, but I would better have a tuning option)
> Do you think that the latency of a mem-copy-like opt can be a problem here?

Maybe; it is not obvious.

> > What I can say more is that a good "ABCD" optimization complemented
> > with a "loop versioning" optimization will make more readable, more
> > stable code, AND will give a better performance gain (loop unrolling
> > still applies too). Setting aside the fact that the overall design will be
> > more straightforward (having no interdependent passes, extra helpers, etc)
> > So I vote for focusing on ABCD plus "loop versioning" and leaving
> > specific benchmark-oriented tricks (complicating our design) alone.
> I support focusing on loop
> versioning/ABCD and other general purpose optimizations we do not have today.
> And until these opts give better results for your microtest, we
> can use Nikolay's approach. At least it works better today.
> ?

Okay, we can. But, IMHO, Nikolay could have written it the right way from
the beginning. That did not happen. I am not worried too much,
though. Now we have the patch and with some more testing it should
just work.
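
To make the "loop versioning" alternative discussed above concrete, here is a rough source-level sketch: the range is tested once up front, a check-free version runs when the test passes, and the original loop is kept otherwise so the exception is still thrown at the right iteration. The class name is hypothetical, and Arrays.fill stands in for whatever check-free (and possibly unrolled) loop a JIT would actually emit; this is not how the patch or DRLVM implements it.

```java
import java.util.Arrays;

final class VersioningSketch {
    // Assumes a non-null array; n is the loop bound from the original pattern.
    static void fill(int[] a, int n, int x) {
        if (n >= 0 && n <= a.length) {
            // Fast version: the single test above proves every index in
            // [0, n) is valid, so per-iteration bounds checks are redundant.
            Arrays.fill(a, 0, n, x);
        } else {
            // Slow version: the unmodified loop. For n > a.length it fills
            // the whole array and then throws ArrayIndexOutOfBoundsException;
            // for negative n it does nothing.
            for (int i = 0; i < n; i++) {
                a[i] = x;
            }
        }
    }

    public static void main(String[] args) {
        int[] a = new int[6];
        fill(a, 4, 9);
        System.out.println(Arrays.toString(a));
    }
}
```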

Egor Pasko
