db-derby-dev mailing list archives

From "Knut Anders Hatlen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DERBY-4620) Query optimizer causes OOM error on a complex query
Date Tue, 05 Jul 2011 13:28:17 GMT

    [ https://issues.apache.org/jira/browse/DERBY-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059896#comment-13059896 ]

Knut Anders Hatlen commented on DERBY-4620:
-------------------------------------------

The data structures are somewhat wasteful, but I'm not sure there's much we can do about
it. An SQLInteger object takes 16 bytes of memory, which sounds large compared to a primitive
int, which takes 4 bytes. But it is no bigger than a java.lang.Integer object, which also
weighs in at 16 bytes. As long as we're using java.util.HashMap to store the hash table
in memory, we cannot use primitives as keys, so I can't see that we can do any better than
java.lang.Integer/iapi.types.SQLInteger for single-column keys.
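
To illustrate the constraint (a minimal sketch, not Derby code): java.util.HashMap only
accepts object keys, so even a plain int is autoboxed to a java.lang.Integer on insertion,
and on a typical HotSpot JVM with compressed object pointers that Integer occupies the same
16 bytes (a 12-byte object header plus the 4-byte value):

    import java.util.HashMap;
    import java.util.Map;

    public class BoxedKeyDemo {
        public static void main(String[] args) {
            Map<Integer, String> hashTable = new HashMap<>();
            int key = 42;                 // 4-byte primitive
            hashTable.put(key, "row");    // autoboxed to a ~16-byte Integer
            System.out.println(hashTable.get(42));   // prints "row"
        }
    }

Avoiding the boxed key would mean abandoning java.util.HashMap for a primitive-keyed
structure, which is the trade-off described above.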

For multi-column keys consisting of simple values there might be room to do better, since
for example an int[] array is more compact than an Object[] array. But we could only take
advantage of that if all the columns in the key are of the same type (an int[] array can
only store ints). And we'd also need special logic for the cases where we're joining
columns of different types (for example, INT columns in one table joined with SMALLINT
columns in another table), cases where the SQLInteger and SQLSmallint classes would have
handled the comparison correctly with no extra logic. So I have the feeling that reducing
the memory usage significantly would be difficult without adding a lot of complexity.
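
To make the int[] idea concrete, here is a hypothetical wrapper (the class name and layout
are illustrative, not anything in the Derby codebase). Arrays compare by identity in Java,
so a HashMap key built on an int[] would still need value-based equals() and hashCode(),
and it would only work when every key column fits in an int:

    import java.util.Arrays;

    // Hypothetical compact multi-column hash key; illustrative only.
    final class IntArrayKey {
        private final int[] cols;   // one slot per key column

        IntArrayKey(int... cols) {
            this.cols = cols;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof IntArrayKey
                    && Arrays.equals(cols, ((IntArrayKey) o).cols);
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(cols);
        }
    }

Even in the mixed INT/SMALLINT join mentioned above, the smaller type could be widened to
int when the key is built, but that is precisely the extra per-type logic that SQLInteger
and SQLSmallint currently make unnecessary.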

Quick update on the testing of the patch: there was one failure in the JUnit regression
tests, but it looked unrelated (in InterruptResilienceTest); I'll file a bug for that.
Derbyall had failures in lang/subquery.sql and lang/wisconsin.java because of changes in
some of the query plans (they expected hash scans, but found index scans and nested loop
joins). The repro for the bug finally completed successfully after 10 hours when run with
-Xmx62M.

> Query optimizer causes OOM error on a complex query
> ---------------------------------------------------
>
>                 Key: DERBY-4620
>                 URL: https://issues.apache.org/jira/browse/DERBY-4620
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.5.3.0
>         Environment: java version "1.6.0_17"
> Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
> Java HotSpot(TM) Client VM (build 14.3-b01, mixed mode, sharing)
> Linux rocio.xxx 2.6.24-27-generic #1 SMP Fri Mar 12 01:10:31 UTC 2010 i686 GNU/Linux
>            Reporter: Chris Wilson
>              Labels: derby_triage10_8
>         Attachments: estimate-sizes.diff, query-plan.txt
>
>
> I have a query that generates about 2,000 rows in a summary table. It runs fast enough
> (~30 seconds) on MySQL. The same query on Derby runs for about 10 minutes and then fails
> with an OutOfMemoryError.
> I have created a test case to reproduce the problem. It's not minimal, because it relies
> on a rather large dataset, and it's not trivial, but I don't mind doing a bit of work
> trimming it if someone can point me in the right direction.
> You can check out the test case from our Subversion server here:
>   http://rita.wfplogistics.org/svn/trackrita/rita/doc/derby-oom-slow-query
> which includes a pre-built Derby database in "testdb.derby". If this database is deleted,
> "test.sh" will recreate it, but that takes about 10-15 minutes.
> Just modify the script "test.sh" to point to your Derby libraries and run it (or just
> execute the commands in "movement_complete.sql") to demonstrate the problem. You can view
> the source of that file online here:
>   http://rita.wfplogistics.org/trac/browser/rita/conf/movement_complete.sql
> The first "insert into movement_complete" (starting around line 4) takes about 15 seconds
> to complete and inserts 5890 rows. The second, starting around line 54, does not complete
> in reasonable time on Derby. On MySQL, it runs in 28 seconds and inserts 2038 rows. On
> Derby, after 10 minutes I get:
> JAVA ERROR: java.lang.OutOfMemoryError: Java heap space
> ij> ERROR X0Y67: Cannot issue rollback in a nested connection when there
> is a pending operation in the parent connection.
> (process exits)
> It does not output the query plan in this case.
> Following the suggestion of Bryan Pendleton, I tried increasing the JVM memory limit from
> the default to 1024m, and this allows the query to finish executing quite quickly. I guess
> that means the optimiser is just taking a lot of memory to optimise the query, and under
> the default settings it spends forever in GC before finally hitting OOM and giving up.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
