db-derby-dev mailing list archives

From Army <qoz...@gmail.com>
Subject Re: [jira] Commented: (DERBY-2130) Optimizer performance slowdown from 10.1 to 10.2
Date Wed, 13 Dec 2006 16:30:53 GMT
> Bryan Pendleton commented on DERBY-2130:
> ----------------------------------------
> With jumpReset.patch applied, I cannot reproduce the varying optimize times.
> The times are all in a tight range, after nearly 6x as many tests as produced
> the variable optimize times before.

That's good to hear.  Thank you for taking the time to re-run your tests.

I'm curious: what is the optimization time that you consistently see with
jumpReset.patch?  Is it the lesser time (i.e. 170-200 seconds) or the greater
time (500-650 seconds)?  Or something else entirely?

Given that jumpReset.patch appears to solve the variance problem, we are then 
left with the increased optimization time vs 10.1, which as I mentioned earlier 
is indirectly caused by the DERBY-1357 fix.  And as I proposed earlier, I think 
the DERBY-1357 fix is itself correct; the "problem" (I use the term loosely) is 
with the following if-block in OptimizerImpl.java:

     if (permuteState == JUMPING && !joinPosAdvanced && joinPosition >=
         //not feeling well in the middle of jump
         // Note: we have to make sure we reload the best plans
         // as we rewind since they may have been clobbered
         // (as part of the current join order) before we gave
         // up on jumping.
         reloadBestPlan = true;
         rewindJoinOrder(); //fall
         permuteState = NO_JUMP; //give up

While just removing this check reduces the optimization time back to what it was 
for 10.1, that in itself is not a complete solution.  The reason is that simple 
removal of this if-block can in certain cases lead to an infinite loop.  As an 
example, when I removed the if-block and then tried running lang/innerjoin.sql, 
the test never finished; instead it hung due to an infinite "JUMPING" loop.  So 
some additional changes would be required.

That said, I would like to emphasize that the underlying problem here seems to 
be DERBY-1905.  Even if we get 10.2 to run repro.sql as 'quickly' as 10.1, it 
still takes 10.1 way too long to optimize the query (90 seconds on a "fairly 
powerful Windows machine", according to the description for DERBY-2130), esp. 
given that there are no rows in any of the tables.  The reason is that the
cost estimates are too high (infinity), so the timeout does not take effect
until too late.
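
To illustrate why an infinite cost estimate defeats the timeout, here is a
minimal sketch.  It is not the actual OptimizerImpl code: the class and method
names are hypothetical, and it assumes the timeout fires once the elapsed
optimization time (in milliseconds) exceeds the best cost estimate found so far.

```java
public class TimeoutSketch {

    // Hypothetical timeout check (an assumption, not Derby's real code):
    // stop optimizing once we have spent longer than the best plan found
    // so far is estimated to cost.
    static boolean timeExceeded(long elapsedMillis, double bestCostSoFar) {
        return elapsedMillis > bestCostSoFar;
    }

    public static void main(String[] args) {
        long elapsedMillis = 90_000L; // 90 seconds spent optimizing

        // With a finite, reasonable estimate the timeout fires.
        System.out.println(timeExceeded(elapsedMillis, 3_000.0));

        // With an infinite estimate the comparison can never be true,
        // so under this assumption the timeout never triggers.
        System.out.println(timeExceeded(elapsedMillis, Double.POSITIVE_INFINITY));
    }
}
```

Under that assumption, no amount of elapsed time can exceed an estimate of
infinity, so the optimizer keeps evaluating join orders instead of timing out.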

I have been piddling around with DERBY-1905 on and off and early experimentation 
shows that if the cost estimates are more reasonable, the repro.sql script 
attached to DERBY-2130 completes in 3 or 4 seconds.  And that's the case even if 
we leave the above if-block exactly as it is right now (i.e. we do *not* remove 
it).  So this seems to confirm that aside from the variance problem, DERBY-2130 
is in some ways an expression of DERBY-1905.

Note, though, that it *may* be too risky to port changes for DERBY-1905 back to
10.2 (I don't know for sure since I don't know what those changes will
ultimately be).  Thus it might still be worth investigating the aforementioned
if-block angle to address the performance regression seen between 10.1 and
10.2, i.e. to specifically address the 10.2 slowdown filed as DERBY-2130.

