db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bryan Pendleton (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-2130) Optimizer performance slowdown from 10.1 to 10.2
Date Sat, 09 Dec 2006 16:14:21 GMT
    [ http://issues.apache.org/jira/browse/DERBY-2130?page=comments#action_12457083 ] 
            
Bryan Pendleton commented on DERBY-2130:
----------------------------------------

> My guess is that if you created 10.1 database A, and ran 10.3 against
> it, and then created 10.1 database B, and ran 10.3 against it, you
> might see dramatic differences even in just those two runs.

I was able to confirm this.

I created 10 separate databases using 10.1.2.1, and ran the DDL portion of repro.sql
against each of those 10 databases using 10.1.2.1.

Then I ran just the PREPARE portion of repro.sql against each of those 10 databases,
using the 10.3 trunk, and setting derby.database.allowPreReleaseUpgrade=true so
that 10.3 would be willing to access the 10.1 database.

Often, I observed 170-180 seconds for the prepare.

But I also observed 500-650 seconds for the prepare.

Furthermore, it wasn't on the same database each time!

That is, once I saw 500 seconds for the prepare on database 6, but I reran DB 6
and saw 168 seconds.

And, the first time I did the prepare on DB 8, it was 170 seconds, but the second
time I did the prepare on DB 8, it was 607 seconds.

So my observations are that:
 - 10.3 is non-deterministic regarding the amount of time that the PREPARE takes
 - it happens whether the DB and schema are created using 10.1, or 10.3
 - the same DB, with the same DDL, can sometimes prepare the query in time X,
   and sometimes in time Y, and as far as I can tell nothing is changing
   between runs. I just run IJ with the "prepare" script twice in a row, and
   get two very different elapsed times.

> I'm wondering if there is something going on, correlated to the DDL
> statements executed by the repro script, which sometimes results in
> tables that return "reasonable" estimates, and sometimes results in
> tables that return "infinite" estimates.

I now believe this is *not* an accurate theory, as the non-determinism does not
seem to be correlated to a particular run of the DDL statements.

So at this point I don't think that the non-determinism depends on whether the
database and its tables were created by 10.1, or by 10.3, and I don't think that
the non-determinism is "tied" to a particular database and set of tables; the
same database and tables can sometimes lead to one prepare time, and sometimes
to a different prepare time.


> Optimizer performance slowdown from 10.1 to 10.2
> ------------------------------------------------
>
>                 Key: DERBY-2130
>                 URL: http://issues.apache.org/jira/browse/DERBY-2130
>             Project: Derby
>          Issue Type: Bug
>          Components: Performance, SQL
>    Affects Versions: 10.2.1.6, 10.3.0.0, 10.1.3.1
>            Reporter: Bryan Pendleton
>         Attachments: plan10_1_2_1.txt, plan10_2.txt, plans.diff, repro.sql
>
>
> Attached is 'repro.sql', an IJ script which demonstrates what I
> believe to be a serious performance issue in the Optimizer.
> I have run this script in a number of configurations:
>  - 10.1.2.1: the script runs successfully. The 'prepare' statement
>    takes about 90 seconds, on a fairly powerful Windows machine
>  - 10.1.3.1: the script produces a NPE. I believe this is DERBY-1777
>  - 10.2.1.8/trunk: the script runs successfully. The 'prepare' statement
>    often takes about 220 seconds, on the same Windows machine
>    Intermittently, on 10.2 and on the trunk, the prepare statement takes
>    15+ minutes. I cannot reliably reproduce this; I run the same script
>    several times in a row and I cannot predict whether it will take 220
>    seconds or whether it will take 15+ minutes.
> I am quite motivated to work on this problem, as this is blocking me from
> using Derby for a project that I'm quite keen on, but I need some
> suggestions and ideas about how to attack it. From my perspective
> there are 3 primary topics:
> 1) Why did optimizer performance for this query degrade so significantly
> from 10.1.2.1 to 10.2? The optimizer seems to be at least 2.5 times slower,
> for this particular query at least, in 10.2. Sometimes it is 10x slower.
> 2) What is the source of the non-determinism? Why does the optimizer
> often take 4 minutes to optimize this query on the trunk, but sometimes
> take 15+ minutes? I don't believe that I'm changing anything from
> run to run.
> 3) Can we improve the optimizer performance even beyond what it was
> for 10.1.2? I realize that this is an ugly query, but I was hoping to
> see an optimization time of 5-10 seconds, not 90 seconds (and certainly
> not 220 seconds).
> I have attempted to start answering some of these questions, with
> limited success. Here is some of what I think I've discovered so far:
>  - the optimizer changes in 10.2 seem to have given the optimizer many
>    more choices of possible query plans to consider. I think this means
>    that, if the optimizer does not time out, it will spend substantially
>    more time optimizing because there are more choices to evaluate. Does
>    this by itself mean that the optimizer will take 2.5 times longer in
>    10.2 than it did in 10.1?
>  - something about this query seems to make the costing mechanism go
>    haywire, and produce extreme costs. While stepping through the
>    optimization of this query in the debugger I have seen it compute
>    costs like 1e63 and 1e200. This might be very closely related to
>    DERBY-1905, although I don't think I'm doing any subqueries here.
>    But maybe I'm misunderstanding the term "subquery" in DERBY-1905.
>    At any rate, due to the enormous estimated costs, timeout does not
>    occur.
>  - the WHERE clause in this query is converted during compilation to 
>    an equivalent IN clause, I believe, which then causes me to run into
>    a number of the problems described in DERBY-47 and DERBY-713.
>    Specifically, rather than constructing a plan which involves 4
>    index probes for the 4 WHERE clause values, the optimizer decides
>    that an index scan must be performed and that it will have to process
>    the entire index (because the query uses parameter markers, not
>    literal values). So perhaps solving DERBY-47 would help me
>  - the optimizer in fact comes up with a "decent" query plan quite quickly.
>    I have experimented with placing a hard limit into the optimizer
>    timeout code, so that I can force optimization to stop after an
>    arbitrary fixed period of time. Then I have been able to set that
>    value to as low as 1 second, and the optimizer has produced plans
>    that then execute in a few milliseconds. Of course, I have only tried
>    this with a trivial amount of data in my database, so it's possible
>    that the plan produced by the optimizer after just a second of
>    optimizing is in fact poor, and I'm just not noticing it because my
>    data sizes are so small.
> At this point, what would be really helpful to me would be some suggestions
> about some general approaches or techniques to try to start breaking down
> and analyzing this problem.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message