db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "A B (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-2130) Optimizer performance slowdown from 10.1 to 10.2
Date Thu, 30 Nov 2006 20:12:21 GMT
    [ http://issues.apache.org/jira/browse/DERBY-2130?page=comments#action_12454745 ] 
A B commented on DERBY-2130:

> Some items might be:

> 1) update the store costs to reflect current machines
> o update the store costs to reflect "actual" machine it is running on
> o figure out why there is such variance in DERBY-2130
> o investigate why optimizer costs are so out of line - is it just
>  wrong logic, is it lack of some sort of cost estimate so one bad
>  guess on top of another.
> o DERBY-47

> I am interested in doing 1, but I don't think it is going to help
> with infinite costs.

> Is anyone else interested in looking at this?

I am interested both in this particular issue (DERBY-2130) and also in #3: why the optimizer
cost estimates are so out of line.  I.e. DERBY-1905.  I haven't started just yet but I would
like to look more into DERBY-1905 to see if I can pull anything up.  I hope to start looking
at this very soon...

For what it's worth, I ran the repro attached to this issue with various sets of Derby insane
jars, and the timings I saw were as follows.  I ran the repro 3 times for each set of jars
on Red Hat Linux using ibm142.  Unit of measurement is milliseconds (returned via the "elapsedtime
on" ij command).

10.1 Pre-805:

  - 141570
  - 142463
  - 140686

10.2 Pre-805:

  - 143794
  - 139213
  - 138892

10.1 latest branch:

  - 43494
  - 37229
  - 37532

10.2 Post-805, Pre-781 (, svn 423199)

  - NPE (caused by DERBY-1777). 

10.2 latest branch:

  - 470067
  - 438084
  - 433038

Things to notice:

 * It looks like the slow-down is *not* caused by the DERBY-805 changes--at least not directly.
 On the contrary, the query actually runs quite a bit more quickly with the DERBY-805 changes
(and all subsequent patches, esp. DERBY-1777) than it does without.  Or put another way, the
query runs faster with the latest 10.1 codeline than it does with  Of course, it's
possible that changes for DERBY-805 are mingling with more recent changes to cause the slowdown,
but we at least know that the DERBY-805 changes alone are not the problem.

 * The run at revision 423199 fails with an NPE, which corresponds to Bryan's results when
he ran with  In both cases the DERBY-805 changes have been committed but the subsequent
regression fixes (including DERBY-1777) have not, hence the NPE.

 * I did not see the non-deterministic behavior that Bryan did with 10.2 trunk--instead, I
consistently see 7+ minutes.  However, I only ran the query 3 times, which is probably not
enough to have seen what Bryan saw.  Also note that I only ran the DDL one time and then I
re-ran the query the second two times.  So given Mike's previous comment, this may be the
reason I didn't see the variance that Bryan did.

Note sure how useful this information is, but that's what I've found so far.  Still scratching
my head and poking around, though, so maybe something will come up. (My inclination is toward
DERBY-781, but I don't really know how just yet...)

> Optimizer performance slowdown from 10.1 to 10.2
> ------------------------------------------------
>                 Key: DERBY-2130
>                 URL: http://issues.apache.org/jira/browse/DERBY-2130
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL, Performance
>    Affects Versions:,,
>            Reporter: Bryan Pendleton
>         Attachments: repro.sql
> Attached is 'repro.sql', an IJ script which demonstrates what I
> believe to be a serious performance issue in the Optimizer.
> I have run this script in a number of configurations:
>  - the script runs successfully. The 'prepare' statement
>    takes about 90 seconds, on a fairly powerful Windows machine
>  - the script produces a NPE. I believe this is DERBY-1777
>  - the script runs successfully. The 'prepare' statement
>    often takes about 220 seconds, on the same Windows machine
>    Intermittently, on 10.2 and on the trunk, the prepare statement takes
>    15+ minutes. I cannot reliably reproduce this; I run the same script
>    several times in a row and I cannot predict whether it will take 220
>    seconds or whether it will take 15+ minutes.
> I am quite motivated to work on this problem, as this is blocking me from
> using Derby for a project that I'm quite keen on, but I need some
> suggestions and ideas about how to attack it. From my perspective
> there are 3 primary topics:
> 1) Why did optimizer performance for this query degrade so significantly
> from to 10.2? The optimizer seems to be at least 2.5 times slower,
> for this particular query at least, in 10.2. Sometimes it is 10x slower.
> 2) What is the source of the non-determinism? Why does the optimizer
> often take 4 minutes to optimize this query on the trunk, but sometimes
> take 15+ minutes? I don't believe that I'm changing anything from
> run to run.
> 3) Can we improve the optimizer performance even beyond what it was
> for 10.1.2? I realize that this is an ugly query, but I was hoping to
> see an optimization time of 5-10 seconds, not 90 seconds (and certainly
> not 220 seconds).
> I have attempted to start answering some of these questions, with
> limited success. Here is some of what I think I've discovered so far:
>  - the optimizer changes in 10.2 seem to have given the optimizer many
>    more choices of possible query plans to consider. I think this means
>    that, if the optimizer does not time out, it will spend substantially
>    more time optimizing because there are more choices to evaluate. Does
>    this by itself mean that the optimizer will take 2.5 times longer in
>    10.2 than it did in 10.1?
>  - something about this query seems to make the costing mechanism go
>    haywire, and produce extreme costs. While stepping through the
>    optimization of this query in the debugger I have seen it compute
>    costs like 1e63 and 1e200. This might be very closely related to
>    DERBY-1905, although I don't think I'm doing any subqueries here.
>    But maybe I'm misunderstanding the term "subquery" in DERBY-1905.
>    At any rate, due to the enormous estimated costs, timeout does not
>    occur.
>  - the WHERE clause in this query is converted during compilation to 
>    an equivalent IN clause, I believe, which then causes me to run into
>    a number of the problems described in DERBY-47 and DERBY-713.
>    Specifically, rather than constructing a plan which involves 4
>    index probes for the 4 WHERE clause values, the optimizer decides
>    that an index scan must be performed and that it will have to process
>    the entire index (because the query uses parameter markers, not
>    literal values). So perhaps solving DERBY-47 would help me
>  - the optimizer in fact comes up with a "decent" query plan quite quickly.
>    I have experimented with placing a hard limit into the optimizer
>    timeout code, so that I can force optimization to stop after an
>    arbitrary fixed period of time. Then I have been able to set that
>    value to as low as 1 second, and the optimizer has produced plans
>    that then execute in a few milliseconds. Of course, I have only tried
>    this with a trivial amount of data in my database, so it's possible
>    that the plan produced by the optimizer after just a second of
>    optimizing is in fact poor, and I'm just not noticing it because my
>    data sizes are so small.
> At this point, what would be really helpful to me would be some suggestions
> about some general approaches or techniques to try to start breaking down
> and analyzing this problem.

This message is automatically generated by JIRA.
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira


View raw message