db-derby-dev mailing list archives

From "Knut Anders Hatlen (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DERBY-4620) Query optimizer causes OOM error on a complex query
Date Wed, 29 Feb 2012 17:53:58 GMT

    [ https://issues.apache.org/jira/browse/DERBY-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219361#comment-13219361 ]

Knut Anders Hatlen commented on DERBY-4620:
-------------------------------------------

I took a look at this issue again to see if it could be moved forward. (I cannot find my copy
of the repro anymore, though, and the link to the repro test case appears to be dead.)

The current patch makes the optimizer's memory estimates more accurate. Since the existing
code underestimates the memory usage, derby.language.maxMemoryPerTable effectively has a larger
value than specified. So when we fix the estimates, it might be reasonable to increase derby.language.maxMemoryPerTable
to compensate, so that the chosen plans stay the same for most existing applications.
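
For reference, the property is specified in kilobytes (1024 is the current default), and it can
be set as a JVM system property or in derby.properties. A minimal sketch that sets it
programmatically before the embedded engine boots (the database name is just an example, and
2048 is only the experimental value discussed below, not a recommendation):

    import java.sql.Connection;
    import java.sql.DriverManager;

    public class MaxMemoryPerTableExample {
        public static void main(String[] args) throws Exception {
            // Value is in KB; the default is 1024. Setting it before the embedded
            // driver is loaded makes sure the engine sees the override.
            System.setProperty("derby.language.maxMemoryPerTable", "2048");

            Class.forName("org.apache.derby.jdbc.EmbeddedDriver");
            Connection conn = DriverManager.getConnection("jdbc:derby:testdb;create=true");
            conn.close();
        }
    }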

I ran the tests that failed because of changed plans, with different values for derby.language.maxMemoryPerTable.
lang/subquery.sql (now converted to JUnit and called SubqueryTest) ran fine when I doubled
maxMemoryPerTable to 2048 on a 32-bit JVM. However, on a 64-bit JVM it still failed with that
setting. Increasing it further up to 4096 made the test pass both on 32-bit and on 64-bit.
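
For a single test run the property can also be passed on the JVM command line instead of
editing derby.properties. The class path placeholder and the fully qualified test name below
are only a sketch and may not match the current test layout exactly:

    java -cp <derby.jar:derbyTesting.jar:junit.jar> \
         -Dderby.language.maxMemoryPerTable=4096 \
         junit.textui.TestRunner \
         org.apache.derbyTesting.functionTests.tests.lang.SubqueryTest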

lang/wisconsin.java was trickier. Increasing maxMemoryPerTable to 2048 fixed the queries that
previously failed because nested loop join was chosen instead of the expected hash join, but
it additionally made some queries that were expected to use a nested loop join pick a hash
join instead. And there were differences in plans between 32-bit and 64-bit JVMs in this test
too.

So it looks as if it's difficult to find a setting that makes the optimizer pick the same
plans for all queries after fixing the estimates. Perhaps it would be OK to only fix the estimates
in this issue, and update the tests to expect the new plans? A release note could point users
to the derby.language.maxMemoryPerTable property in case they find that poorer plans are picked
for some of their queries. Another issue could be filed to investigate what would
be a reasonable default value for derby.language.maxMemoryPerTable.

Alternatively, we could pick a higher maxMemoryPerTable in this issue, even if it changes
some of the plans in lang/wisconsin.java. Since those changes would be from nested loop joins
to hash joins, the new plans should be faster than the old plans. There's always a risk that
increasing maxMemoryPerTable will cause higher memory usage for some applications, though,
and possibly OOME, even if the new and more accurate estimates compensate for the increased
value in many cases.
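
Whichever way we go, affected users can check which join strategy the optimizer ends up picking
for a given query by using Derby's runtime statistics. A minimal JDBC sketch (the table names
and the query are placeholders, and the URL assumes the embedded driver):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class CheckJoinStrategy {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.derby.jdbc.EmbeddedDriver");
            Connection conn = DriverManager.getConnection("jdbc:derby:testdb");
            Statement s = conn.createStatement();

            // Collect runtime statistics for statements on this connection.
            s.execute("CALL SYSCS_UTIL.SYSCS_SET_RUNTIMESTATISTICS(1)");

            // Run the query of interest (placeholder query); the result set
            // must be drained so that the full plan is recorded.
            ResultSet rs = s.executeQuery("SELECT * FROM T1 JOIN T2 ON T1.ID = T2.ID");
            while (rs.next()) { }
            rs.close();

            // The statistics text for the previous statement names the join
            // strategy that was chosen (hash join vs. nested loop join).
            rs = s.executeQuery("VALUES SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS()");
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
            rs.close();
            s.close();
            conn.close();
        }
    }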
                
> Query optimizer causes OOM error on a complex query
> ---------------------------------------------------
>
>                 Key: DERBY-4620
>                 URL: https://issues.apache.org/jira/browse/DERBY-4620
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.5.3.0
>         Environment: java version "1.6.0_17"
> Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
> Java HotSpot(TM) Client VM (build 14.3-b01, mixed mode, sharing)
> Linux rocio.xxx 2.6.24-27-generic #1 SMP Fri Mar 12 01:10:31 UTC 2010 i686 GNU/Linux
>            Reporter: Chris Wilson
>              Labels: derby_triage10_8
>         Attachments: estimate-sizes.diff, query-plan.txt
>
>
> I have a query that generates about 2,000 rows in a summary table. It runs fast enough
(~30 seconds) on MySQL. The same query on Derby runs for about 10 minutes and then fails with
an OutOfMemoryError.
> I have created a test case to reproduce the problem. It's not minimal because it relies
on a rather large dataset to reproduce it, and it's not trivial, but I don't mind doing a
bit of work trimming it if someone can point me in the necessary direction.
> You can check out the test case from our Subversion server here:
>   http://rita.wfplogistics.org/svn/trackrita/rita/doc/derby-oom-slow-query
> which includes a pre-built Derby database in "testdb.derby". If this database is deleted,
"test.sh" will recreate it, but that takes about 10-15 minutes.
> Just modify the script "test.sh" to point to your Derby libraries, and run it (or just
execute the commands in "movement_complete.sql") to demonstrate the problem. You can view
the source of that file online here:
>   http://rita.wfplogistics.org/trac/browser/rita/conf/movement_complete.sql
> The first "insert into movement_complete" (starting around line 4) takes about 15 seconds
to complete and inserts 5890 rows. The second, starting around line 54, does not complete
in reasonable time on Derby. On MySQL, it runs in 28 seconds and inserts 2038 rows. On Derby,
after 10 minutes I get:
> JAVA ERROR: java.lang.OutOfMemoryError: Java heap space
> ij> ERROR X0Y67: Cannot issue rollback in a nested connection when there 
> is a pending operation in the parent connection.
> (process exits)
> It does not output the query plan in this case.
> Following the suggestion of Bryan Pendleton, I tried increasing the JVM memory limit
from the default to 1024m, and this allows the query to finish executing quite quickly. I
guess that means that the optimiser is just taking a lot of memory to 
> optimise the query, and it spends forever in GC before finally hitting OOM and giving
up when using the default settings.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
