db-derby-dev mailing list archives

From Army <qoz...@gmail.com>
Subject Re: Advice in debugging plan selection
Date Fri, 13 Jun 2008 19:02:40 GMT
Kathey Marsden wrote:
> The test sets derby.optimizer.noTimeout so it should come up with the 
> lower cost  plan each time.

This is probably silly, but have you confirmed that the property is in 
fact being set correctly?  If you print the value of the "noTimeout" 
field from within the OptimizerImpl constructor, is it actually "true"?

> My question is where in the code should I look for possible
> problems with plan selection.

Unfortunately there's no single place where these kinds of problems 
occur.  The underlying cause can be in any number of places and may not 
be easily discernible...

> I can't reproduce in the debugger so will need to just use println's to 
> diagnose the problem.

If you're in a situation where you have to use printlns, you may want to 
start by printing out the contents of the bestJoinOrder array in 
OptimizerImpl in two places: 1) at the end of rememberBestCost(), and 2) 
at the end of modifyAccessPaths().  The former tells you every time the 
OptimizerImpl thinks it has found a new "best join order" thus far in 
processing; the latter tells you what the OptimizerImpl's _final_ choice 
of "best join order" was.  If you print out the cost of the join orders
(bestCost) as well, and then compare the results from a "fail" plan and 
a "pass" plan, that might give you an indication of where to start.  But 
as I said, there's no silver bullet in tracking that kind of thing down...
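
For what it's worth, the diagnostic I have in mind is just a few lines 
at the end of each of those two methods.  This is only a sketch: it 
assumes bestJoinOrder is the int[] field in OptimizerImpl and that 
bestCost exposes getEstimatedCost() and rowCount() per the CostEstimate 
interface--adjust to match whatever the code actually has:

    // Dump the current "best" join order and its estimated cost so
    // the output from a "pass" run and a "fail" run can be diffed.
    // Written against JDK 1.4-era APIs, hence the StringBuffer.
    StringBuffer sb = new StringBuffer("bestJoinOrder:");
    for (int i = 0; i < bestJoinOrder.length; i++)
    {
        sb.append(' ').append(bestJoinOrder[i]);
    }
    sb.append(", cost = ").append(bestCost.getEstimatedCost());
    sb.append(", rows = ").append(bestCost.rowCount());
    System.out.println(sb.toString());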

From a very quick glance at the two plans I noticed two things: 1) the 
cost estimates for the "fail" plan are pretty outrageous--perhaps 
DERBY-1905 is at play here?--which means the values themselves are not
really useful for comparison; and 2) the "pass" plan uses a Hash Join 
while the "fail" plan uses a Nested Loop join.  With respect to #2, the 
optimizer will deliberately skip a hash join if it does not think the 
JVM has enough memory to handle all of the rows.  So is it possible that 
in the runs where the query fails the JVM has less available memory?
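
If you want to rule that out, it's easy to print the JVM's view of 
memory in both runs and compare--plain java.lang.Runtime calls are 
enough for that:

    // Rough check of how much memory the JVM thinks it has.
    // maxMemory() is the -Xmx ceiling; totalMemory()/freeMemory()
    // describe the current heap.  Compare a "pass" run to a "fail" run.
    Runtime rt = Runtime.getRuntime();
    System.out.println("maxMemory   = " + rt.maxMemory());
    System.out.println("totalMemory = " + rt.totalMemory());
    System.out.println("freeMemory  = " + rt.freeMemory());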

On a related note, I think the comments before the query in question say:

         // In this query the optimizer will consider pushing, but
         // will find that it's cheaper to do a hash join and thus
         // will _not_ push. So we see hash join with table scan on T3.

The "fail" query plan shows that the predicate _is_ actually being 
pushed to T1 and T3 (it ends up as a qualifier on T1 and as an index key 
for T3).  Given the ridiculously high cost estimates of doing so, it 
seems like the optimizer would only have chosen to push the predicates 
as a last resort--i.e. if the hash join was not possible due to memory 
constraints.  At least that's what I hope based on the plans you 
provided.  It's also possible that for whatever reason the optimizer's 
estimates for the "fail" run are somehow skewed so that the nested loop 
join is actually viewed as "cheaper".  If that's the case then you're 
probably looking at DERBY-1905...

To see if memory is an issue for the hash join, maybe search for 
"memoryUsageOK" in OptimizerImpl and print out the results of those 
calls...?
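
Something along these lines at each call site, perhaps (again just a 
sketch--the surrounding variable names here are made up, and the actual 
signature of memoryUsageOK() may differ; the point is just to log the 
boolean together with the inputs that produced it):

    // At each memoryUsageOK() call site in OptimizerImpl: capture the
    // result instead of using it inline, and log it with the row-count
    // estimate that was passed in.
    boolean memOK = memoryUsageOK(estimatedRowCount, maxMemoryPerTable);
    System.out.println("memoryUsageOK(" + estimatedRowCount + ", "
        + maxMemoryPerTable + ") = " + memOK);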

Army

