From: "Bryan Pendleton (JIRA)"
To: derby-dev@db.apache.org
Date: Tue, 20 Apr 2010 05:39:49 -0400 (EDT)
Subject: [jira] Commented: (DERBY-4620) Query optimizer causes OOM error on a complex query
In-Reply-To: <25672450.32181271596825647.JavaMail.jira@thor>

    [ https://issues.apache.org/jira/browse/DERBY-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12858745#action_12858745 ]

Bryan Pendleton commented on DERBY-4620:
----------------------------------------

Chris, thanks for posting the
detailed heap information. Your problem seems in some ways similar to DERBY-106, which was supposed to have been fixed a long time ago.

My rough understanding of how hash joins are supposed to use memory is:

1) The hash join code is supposed to ask the JVM how much memory is
   available (Runtime.getRuntime().freeMemory()?) and then use only a
   small portion of it.

2) The execution of a hash join is supposed to use a
   BackingStoreHashTable, which should gracefully overflow to disk when
   there is not enough memory. Here's an ancient message from Jack about
   how this is supposed to work:
   http://www.mail-archive.com/derby-dev@db.apache.org/msg01914.html
   and a related message from Mike about possible problems with it:
   http://www.mail-archive.com/derby-dev@db.apache.org/msg01360.html

3) There is an undocumented derby.language.maxMemoryPerTable setting
   that is supposed to influence this processing (see DERBY-1397).

It seems clear that, in your case, the hash join strategy failed in two ways:

1) To run efficiently, it needed much more memory than you had initially
   given the JVM; since that memory wasn't available, the optimizer
   shouldn't have chosen the hash joins.

2) Having chosen the hash joins, the execution code shouldn't have run
   out of memory, because it's supposed to use only a portion of the
   available memory and then start overflowing to disk.

Here's an interesting message from Army from a few years back discussing some of these issues in a different scenario:
http://www.mailinglistarchive.com/html/derby-user@db.apache.org/2008-07/msg00897.html

I have a vague memory that, some time ago, we looked at a problem where BackingStoreHashTable didn't seem to be behaving correctly, but I don't recall the outcome of that conversation and I wasn't able to find it by searching.
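As a minimal sketch of the kind of memory check described in point 1 above: this is not Derby's actual optimizer code, and the class name, method names, and the 5% budget fraction are all invented for illustration. Only the Runtime.getRuntime().freeMemory()/totalMemory()/maxMemory() calls are real JDK API.

```java
// Hypothetical sketch of a hash-join memory budget; Derby's real logic
// is more involved and lives in its optimizer and store layers.
public class HashJoinBudget {

    // Invented fraction of currently-available heap one join may claim.
    private static final double BUDGET_FRACTION = 0.05;

    /** Estimate how many bytes an in-memory hash build may use. */
    public static long availableBudget() {
        Runtime rt = Runtime.getRuntime();
        // Heap still obtainable: free space in the current heap plus the
        // room the heap is allowed to grow into.
        long available = rt.freeMemory() + (rt.maxMemory() - rt.totalMemory());
        return (long) (available * BUDGET_FRACTION);
    }

    /** Choose a hash join only if the build side fits the budget. */
    public static boolean chooseHashJoin(long estimatedBuildSideBytes) {
        return estimatedBuildSideBytes <= availableBudget();
    }

    public static void main(String[] args) {
        System.out.println("hash join budget (bytes): " + availableBudget());
        System.out.println("1 MB build side fits: " + chooseHashJoin(1L << 20));
    }
}
```

The point of such a check is that the decision depends on the heap actually granted to the JVM, which is consistent with the observation below that raising -Xmx changes the behavior.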
This thread seems to discuss the issues in great detail, but I don't entirely know what came of it, as it was rather before my time with Derby:
http://mail-archives.apache.org/mod_mbox/db-derby-dev/200501.mbox/%3C41F6A8F9.7030806@Mutagen.Net%3E

I'm not quite sure where to take this information next, but I wanted to get it into the JIRA entry while I was thinking about it.

> Query optimizer causes OOM error on a complex query
> ---------------------------------------------------
>
>                 Key: DERBY-4620
>                 URL: https://issues.apache.org/jira/browse/DERBY-4620
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.5.3.0
>        Environment: java version "1.6.0_17"
> Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
> Java HotSpot(TM) Client VM (build 14.3-b01, mixed mode, sharing)
> Linux rocio.xxx 2.6.24-27-generic #1 SMP Fri Mar 12 01:10:31 UTC 2010 i686 GNU/Linux
>            Reporter: Chris Wilson
>        Attachments: query-plan.txt
>
>
> I have a query that generates about 2,000 rows in a summary table. It
> runs fast enough (~30 seconds) on MySQL. The same query on Derby runs
> for about 10 minutes and then fails with an OutOfMemoryError.
>
> I have created a test case to reproduce the problem. It's not minimal,
> because it relies on a rather large dataset, and it's not trivial, but
> I don't mind doing a bit of work trimming it down if someone can point
> me in the necessary direction.
>
> You can check out the test case from our Subversion server here:
> http://rita.wfplogistics.org/svn/trackrita/rita/doc/derby-oom-slow-query
> which includes a pre-built Derby database in "testdb.derby". If this
> database is deleted, "test.sh" will recreate it, but that takes about
> 10-15 minutes.
>
> Just modify the script "test.sh" to point to your Derby libraries and
> run it (or just execute the commands in "movement_complete.sql") to
> demonstrate the problem.
> You can view the source of that file online here:
> http://rita.wfplogistics.org/trac/browser/rita/conf/movement_complete.sql
>
> The first "insert into movement_complete" (starting around line 4 of
> that file) takes about 15 seconds to complete and inserts 5890 rows.
> The second, starting around line 54, does not complete in reasonable
> time on Derby. On MySQL, it runs in 28 seconds and inserts 2038 rows.
> On Derby, after 10 minutes I get:
>
> JAVA ERROR: java.lang.OutOfMemoryError: Java heap space
> ij> ERROR X0Y67: Cannot issue rollback in a nested connection when
> there is a pending operation in the parent connection.
> (process exits)
>
> It does not output the query plan in this case.
>
> Following Bryan Pendleton's suggestion, I tried increasing the JVM
> memory limit from the default to 1024m, which allows the query to
> finish executing quite quickly. I guess that means the optimiser is
> simply taking a lot of memory to optimise the query, and with the
> default settings it spends forever in GC before finally hitting OOM
> and giving up.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.