db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "A B (JIRA)" <derby-...@db.apache.org>
Subject [jira] Updated: (DERBY-1007) Optimizer can return incorrect "best cost" estimates with nested subqueries, which leads to generation of sub-optimal plans.
Date Wed, 01 Mar 2006 02:26:40 GMT
     [ http://issues.apache.org/jira/browse/DERBY-1007?page=all ]

A B updated DERBY-1007:
-----------------------

    Attachment: d1007_v1.patch
                d1007_v1.stat

Attaching a patch, d1007_v1.patch, to resolve this issue.  The changes are summarized as follows:

1. Added the "prepForNextRound()" method that was part of OptimizerImpl to the Optimizer interface
since that seems like an appropriate place for it.

2. Added a single line to OptimizerImpl.prepForNextRound() to reset the "bestCost" for the
current round of optimization.  Note that I do _not_ reset the "foundABestPlan" variable nor
the "bestJoinOrder" array.  This is because it's possible that a "best join order" may not
exist for the current round, in which case the optimizer must know whether or not it found
a best join order in a previous round (foundABestPlan) and if so what the corresponding join
order was (bestJoinOrder).  That information is required so that a valid query plan can be
generated after optimization is complete.

3. After making the above changes, I noticed that the optimizer cost estimates were not always
showing up when logQueryPlan was set to true--they were sometimes being printed as question
marks to represent "Infinity".  The reason for this was that most of the code in the "modifyAccessPaths"
phase of query compilation uses the estimates as they sit in the ResultSetNode.costEstimate
field--which, for nodes above subqueries, will hold the "bestCost" estimates for the most
recent plan chosen by the OptimizerImpl for the subquery.  Since I am now (with DERBY-1007)
resetting the "bestCost" variable at the start of every round, it's possible that "bestCost"
will hold an estimate that has been "reset" to Double.MAX_VALUE.  This can happen if it was
reset (in prepForNextRound()) and then no valid join order was found for the current round
(ex. if no valid join order exists or if there was an optimizer "timeout").  That in turn
meant that the "costEstimate" field for nodes above the OptimizerImpl would have been "reset"
as well, and hence the "infinity" value (i.e. question mark) was showing up in the logged
query plan.   So I had to find all nodes that use "costEstimate" during modifyAccessPaths()
and update them to use the final, best cost estimate for that node (instead of just using
the most recent value of "costEstimate").  This touched several of ResultSetNode's subclasses,
but the diff in most cases is just a few lines.  The exceptions are FromTable, SelectNode,
UnionNode, IntersectOrExceptNode, and JoinNode, where I added new "getFinalCostEstimate" methods
to correctly figure out the final cost estimate based on the final estimates for child nodes,
as appropriate.

4. The current optimizer "timeout" mechanism is based on the value in OptimizerImpl.bestCost.
 Since I'm now resetting that value at the start of every round, the timeout mechanism had
to be changed in order to preserve its current functionality while removing the dependency
on bestCost.  So I've added a new variable, timeLimit, to OptimizerImpl that plays the same
role w.r.t optimizer "timeout" that the old bestCost value did.

5. Updated master/derived.out to match new behavior.  There's one query in derived.sql that
is affected by this issue.  Before these changes the optimizer thought one join order B was
cheaper than another join order A and so it chose to generate join order B.  With these changes,
though, it now (correctly) sees that join order A and join order B are equivalent, so it just
goes with join order A.  This difference manifests itself in the ordering of two rows in the
result set for that query--so I've updated the masters accordingly.

6. Added a new, simple test case specific to DERBY-1007 to lang/subquery.sql, and updated
the master file accordingly.  The test case is the same one mentioned in the description for
this issue.

I ran derbyall on Red Hat Linux using IBM 1.4.2 with my changes and saw no failures.  I would
very much appreciate any comments/feedback from any available reviewers/ committers out there...

Note to committers: The changes in d1007_v1.patch depend on code changes made as part of d805_phase1_v3.patch
for DERBY-805.  Since that patch was only recently committed, you'll have to be sure to sync
up before trying to apply d1007_v1.patch....

> Optimizer can return incorrect "best cost" estimates with nested subqueries, which leads
to generation of sub-optimal plans.
> ----------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-1007
>          URL: http://issues.apache.org/jira/browse/DERBY-1007
>      Project: Derby
>         Type: Bug
>   Components: Performance
>     Versions: 10.2.0.0
>     Reporter: A B
>     Assignee: A B
>     Priority: Minor
>  Attachments: d1007_v1.patch, d1007_v1.stat
>
> When optimizing a query that has nested subqueries in it, it's possible that the optimizer
for the subqueries will return cost estimates that are lower than what they were actually
calculated to be.  The result is that the outer query can pick an access plan that is sub-optimal.
> Filing this jira issue based on the thread "[OPTIMIZER] OptimizerImpl "best plans" for
subqueries?" from derby-dev.  Description that follows is pasted from that email:
> http://article.gmane.org/gmane.comp.apache.db.derby.devel/14836
> Following example of what I saw when tracing through the code demonstrates the problem.
> select x1.j, x2.b from
>   (select distinct i,j from t1) x1,
>   (select distinct a,b from t3) x2
> where x1.i = x2.a;
> During optimization of this query we will create three instancesof OptimizerImpl:
>    OI_0: For "select x1.j, x2.b from x1, x2 where x1.i = x2.a"
>    OI_1: For "select distinct i,j from t1"
>    OI_2: For "select distinct a,b from t3"
> Query ran against a clean codeline when T1 had 1 row and T3 had 50,000.
>    -- Top-level call is made to the optimize() method of the
>      outermost SelectNode, which creates OI_0.
>    -- OI_0: picks join order {X1, X2} and calls X1.optimizeIt()
>    -- X1: *creates* OI_1 and makes calls to optimize it.
>    -- OI_1: picks join order {T1} and calls T1.optimizeIt()
>    -- T1: returns a cost of 20.
>    -- OI_1: saves 20 as new best cost and tells T1 to save it.
>    -- X1: calls OI_1.getOptimizedCost(), which returns 20.  X1
>      then returns 20 to OI_0.
>    -- OI_0: calls X2.optimizeIt()
>    -- X2: *creates* OI_2 and makes calls to optimize it.
>    -- OI_2: picks join order {T3} and calls T3.optimizeIt()
>    -- T3: returns a cost of 64700.
>    -- OI_2: saves 64700 as new best cost and tells T3 to save it.
>    -- X2: calls OI_2.getOptimizedCost(), which returns 64700. X2
>      then returns 64700 to OI_0.
>    -- OI_0: saves 20 + 64700 = 64720 as new best cost and tells
>      X1 to save 20 and X2 to save 64700.
>    -- OI_0: picks join order {X2, X1} and calls X2.optimizeIt()
>    -- X2: *fetches* OI_2 and makes calls to optimize it.
>    -- OI_2: picks join order {T3} and calls T3.optimizeIt()
>    -- T3: returns a cost of 10783.
>    -- OI_2: saves 10783 as new best cost and tells T3 to save it.
>    -- X2: calls OI_2.getOptimizedCost(), which returns 10783.  X2
>      then returns 10783 to OI_0.
>    -- OI_0: calls X1.optimizeIt()
>    -- X1: *fetches* OI_1 and makes calls to optimize it.
>    -- OI_1: picks join order {T1} and calls T1.optimizeIt()
>    -- T1: returns a cost of *1 MILLION!*.
>    -- OI_1: rejects new cost (1 mil > 20) and does nothing.
>    -- X1: calls OI_1.getOptimizedCost(), which returns *20*.  X1
>      then returns 20 to OI_0...this seems WRONG!
>    -- OI_0: saves 10783 + 20 = 10803 as new best cost and tells
>      X2 to save 10783 and X1 to save 20.
> So in the end, the outer-most OptimizerImpl chooses join order {X2, X1} because it thought
the cost of this join order was only 10783, which is better than  64720.  However, the _actual_
cost of the join order was really estimated at 1 million--so the outer OptimizerImpl chose
(and will generate) a plan that, according to the estimates, was (hugely) sub-optimal.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


Mime
View raw message