db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Army <qoz...@gmail.com>
Subject Re: [jira] Commented: (DERBY-1357) Short-circuit logic in optimizer appears to be incorrect...
Date Fri, 14 Jul 2006 19:23:38 GMT
Thank you for reviewing this patch, Bryan.  My comments below.

> One question, though: what is the actual symptom of this bug? 
> If I am understanding it correctly, the symptom is that the 
> optimizer may pointlessly continue to investigate a possible 
> query plan which it should already be able to reject as being
> too expensive.

Yes, this is correct.

> Does that mean that the only symptom here is that the optimizer
> may waste some resources?

Directly, yes.  But indirectly, this could affect the the query plan that 
optimizer chooses because of optimizer "timeout".  The more time the optimizer 
wastes optimizing join orders that it should already be able to reject, the 
fewer join orders it's going to see before timeout occurs.  The patch for 
DERBY-1357 avoids the wasted time, thereby allowing the optimizer to 
(potentially) see more join orders before timeout occurs, and thus potentially 
allowing the optimizer to find a better plan.

That said, you may be wondering why I didn't add any new tests with this patch. 
  The answer is that the only time this patch will show a difference in query 
plan is if timeout is enabled (as explained above), but as I've learned from 
other tests already in the harness (esp. wisconsin, predicatePushdown, 
subquery), a test that enables timeout and then prints query plans is bound to 
fail (usually intermittently) on one system or another (because timeout is 
actually based on milliseconds and the amount of work that the optimizer can do 
in xxx milliseconds depends a lot on what machine is being used).  So adding a 
test case with timeout disabled won't show the symptom, and adding it with 
timeout enabled will likely lead to intermittent failures on different machines. 
  Hence no new tests.

The diff that occurs for predicatePushdown is an admittedly accidental way to 
see that the patch is in fact short-circuiting plans (see above comment on the 
master update), but other than that I'm not sure how else to measure the symptom.

> I guess the thing I'm still struggling with is the bit of the 
> change which adds "reloadBestPlan = true" at about line 530. 
> Does that part of the patch have any actual user-visible impact 
> on which query plan would be chosen, 

Great question.  Timeout effects aside, the plan that the optimizer *chooses* 
will be the same with or without this "reloadBestPlan" change.  However, the 
plan that the optimizer *generates* could change with this patch.  Let's say 
that the optimizer chooses some access plan for a subquery in some specific join 
order and calls it the "best so far", then it starts processing a new join order 
in which the plan for the subquery changes, and then short-circuit happens 
(either because of the DERBY-1357 changes or because of timeout).  Without the 
"reloadBestPlan" change the optimizer would have *chosen* the "best so far" plan 
as the overall best one--but since it didn't reload the best plan for the 
subquery, it would actually end up *generating* the latest plan (i.e. the plan 
that was found as part of the short-circuited join order).  By setting 
reloadBestPlan to true, we make sure that the optimizer will reload the "best so 
far" plan as its best, and thus that's the plan that will ultimately get generated.

> such that we need to have a release note here?

Hmm, another good question.  I guess I didn't think this warranted a release 
note because it's fixing an existing problem in the optimizer (or actually, two: 
incorrect short-circuit logic and failure to reload best plans when 
short-circuit occurs) and now the optimizer is generating the correct plan based 
on short-circuiting.  But since, as you say, this could cause existing query 
plans to change, I guess this might merit a release note after all...?

If that's the case, then the same will be true of DERBY-781 (and was also true 
of DERBY-1007 and DERBY-805 changes...), and in all cases the release note would 
have to be something generic like "optimizer fixes/enhancements could cause the 
optimizer to choose different query plans, so performance of existing queries 
may change".  Theoretically the performance change should always be a *good* 
thing, and it seems a bit odd to include a release note saying "queries might 
run more quickly now!" :)  However, I guess the practical side of it is that 
some queries could potentially end up with worse query plans, so perhaps a 
release note is a good idea...

If that's the consensus, I'll try to post a suggested release note to this issue 
(and maybe to DERBY-781, as well).

Hopefully that answers your questions--but if not, please feel free to ask more.

And thanks again for the review; I appreciate it!

View raw message