db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "A B (JIRA)" <j...@apache.org>
Subject [jira] Updated: (DERBY-3214) Optimizer can see negative cost estimates when pulling Optimizables from the join order.
Date Wed, 12 Dec 2007 16:19:43 GMT

     [ https://issues.apache.org/jira/browse/DERBY-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

A B updated DERBY-3214:
-----------------------

       Derby Info:   (was: [Patch Available])
    Fix Version/s: 10.4.0.0

Thank you for the review, Thomas.  I committed the patch to trunk with svn # 603659:

  http://svn.apache.org/viewvc?rev=603659&view=rev

> Optimizer can see negative cost estimates when pulling Optimizables from the join order.
> ----------------------------------------------------------------------------------------
>
>                 Key: DERBY-3214
>                 URL: https://issues.apache.org/jira/browse/DERBY-3214
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.2.1.6, 10.2.2.0, 10.3.1.4
>            Reporter: A B
>            Assignee: A B
>            Priority: Minor
>             Fix For: 10.4.0.0
>
>         Attachments: d3214_v1.patch, doubleAdd.java
>
>
> When iterating through join order permutations for a query the optimizer places "Optimizables"
(FromTables) into a join order, estimates the cost, then "pulls" Optimizables out of the join
order and re-places them in different positions.  For details see:
>   http://wiki.apache.org/db-derby/JoinOrderPermutations
> As optimizables are added to the join order the optimizer keeps track of the accumulated
cost estimate for the join order.  Then when an Optimizable is removed ("pulled") from the
join order, the optimizer substracts that Optimizable's cost from the total accumulated cost.
> In certain cases (esp. with very large queries) it's possible that the cost for some
Optimizable OPT_A is so large that adding it to the accumulated cost of the join order leads
to loss of the previous sum.  This happens due to normal Java addition of double values, see
"doubleAdd.java" attached.
> As an example, assume our current join order is:
>   { OPT_0, OPT_1, -- }
> and that the estimated costs for OPT_0 and OPT_1 are 700 and 800, respectively.  The
accumulated cost for OPT_0 and OPT_1 is then 700 + 800 = 1500.  Then assume we place OPT_A
into the final position in the join order:
>   { OPT_0, OPT_1, OPT_A }
> If the cost of OPT_A is something that is orders of magnitude larger than 1500, then
by adding it to 1500 we will effectively "lose" the 1500.  Let's say the cost of OPT_A is
estimated to be 3.14E50 (which is actually possible, esp. as a result of DERBY-1905).  The
size of OPT_A's cost makes the cost of OPT_0 + OPT_1 insignificant when using Java doubles
(see attached doubleAdd.java):
>   1500 + 3.14E50 = 3.14E50
> So the total accumulated cost for the join order is now 3.14E50.  Later, when we go to
pull OPT_A from the join order, we'll subtract its cost from the accumulated cost, yielding:
>   3.14E50 - 3.14E50 = 0
> Notice how the accumulated cost, which is supposed to represent the cost of OPT_0 plus
the cost of OPT_1, is now ZERO.  And our join order goes back to:
>   { OPT_0, OPT_1, -- }
> Next we pull OPT_1 from the join order, which means we have to subtract it's cost from
the accumulated cost:
>   0 - 800 = -800
> So we end up with a negative accumulated cost, which is WRONG. (Actually, the ZERO accumulated
cost in the previous step was wrong; this is just a side effect).
> As it turns out, there is code in OptimizerImpl that tries to account for negative costs
when the negative value comes from normal imprecise arithmetic.  In particular we see the
following in the code that pulls Optimizables:
>     double newCost = currentCost.getEstimatedCost();
>     if (pullCostEstimate != null)
>     {
>         pullCost = pullCostEstimate.getEstimatedCost();
>         newCost -= pullCost;
>         /*
>          ** It's possible for newCost to go negative here due to
>          ** loss of precision.
>          */
>          if (newCost < 0.0)
>              newCost = 0.0;
>         ...
>     }
> This code hides the error mentioned above because when it sees "-800" it assumes that
the negative stems from normal loss of precision.  So the cost for the plan is incorrectly
set to "0", which makes it cheaper than any other plan thus far (and probably cheaper than
anything to come), and therefore the optimizer will probably choose the wrong plan.
> I think the check for negative newCost is only valid if the join position that we're
pulling is 0--i.e. if we just pulled the first optimizable in the join order.  In that case
the accumulated cost *should* be zero, so checking for a negative value and setting it to
zero is fine--we're just accounting for the loss of precision that is mentioned in the current
code comments.
> Note that the same issue also exists for "sort avoidance costs", but in that code there
is (currently) no check for negative costs.  So if a situation as described above occurs in
the current code when the cost of OPT_A is for a sort avoidance plan, the code will throw
an ASSERTION failure because the cost should be non-negative.
> I noticed this behavior somewhat accidentally while testing out a fix for DERBY-3023:
with my attempted fix applied, the query "new-style-sql.txt" was failing with an ASSERTION
failure due to the negative cost estimate.  Hence this jira.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message