Return-Path: Delivered-To: apmail-db-derby-dev-archive@www.apache.org Received: (qmail 89660 invoked from network); 29 Jun 2009 23:46:00 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) by minotaur.apache.org with SMTP; 29 Jun 2009 23:46:00 -0000 Received: (qmail 54509 invoked by uid 500); 29 Jun 2009 23:46:10 -0000 Delivered-To: apmail-db-derby-dev-archive@db.apache.org Received: (qmail 54451 invoked by uid 500); 29 Jun 2009 23:46:10 -0000 Mailing-List: contact derby-dev-help@db.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: Delivered-To: mailing list derby-dev@db.apache.org Received: (qmail 54442 invoked by uid 99); 29 Jun 2009 23:46:10 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 29 Jun 2009 23:46:10 +0000 X-ASF-Spam-Status: No, hits=-2000.0 required=10.0 tests=ALL_TRUSTED X-Spam-Check-By: apache.org Received: from [140.211.11.140] (HELO brutus.apache.org) (140.211.11.140) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 29 Jun 2009 23:46:08 +0000 Received: from brutus (localhost [127.0.0.1]) by brutus.apache.org (Postfix) with ESMTP id CDB0C234C04C for ; Mon, 29 Jun 2009 16:45:47 -0700 (PDT) Message-ID: <189878265.1246319147841.JavaMail.jira@brutus> Date: Mon, 29 Jun 2009 16:45:47 -0700 (PDT) From: "Dag H. Wanvik (JIRA)" To: derby-dev@db.apache.org Subject: [jira] Updated: (DERBY-1259) Optimizer plan consideration doesn't account for infinite cost estimates and can therefore choose plans requiring excessive memory. In-Reply-To: <52597527.1146117002986.JavaMail.root@brutus> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 X-Virus-Checked: Checked by ClamAV on apache.org [ https://issues.apache.org/jira/browse/DERBY-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dag H. Wanvik updated DERBY-1259: --------------------------------- Component/s: SQL > Optimizer plan consideration doesn't account for infinite cost estimates and can therefore choose plans requiring excessive memory. > ----------------------------------------------------------------------------------------------------------------------------------- > > Key: DERBY-1259 > URL: https://issues.apache.org/jira/browse/DERBY-1259 > Project: Derby > Issue Type: Bug > Components: SQL > Affects Versions: 10.1.3.1, 10.2.1.6 > Environment: Windows 2000, ibm142 > Reporter: A B > Priority: Minor > > When deciding whether or not to choose a particular access path as the "best one so far", the optimizer tries to guess what the memory usage for the path will be and, if it's prohibitive, it (the optimizer) will reject the plan. Currently this only applies to hash joins in Derby. > The call that starts the check for excessive memory exists in two places in OptimizerImpl.java: costBasedCostOptimizable() and considerCost(). There we have the following: > /* > ** Skip this access path if it takes too much memory. > ** > ** NOTE: The default assumption here is that the number of rows in > ** a single scan is the total number of rows divided by the number > ** of outer rows. The optimizable may over-ride this assumption. > */ > if( ! optimizable.memoryUsageOK(estimatedCost.rowCount() / outerCost.rowCount(), maxMemoryPerTable)) > { > if (optimizerTrace) > { > trace(SKIPPING_DUE_TO_EXCESS_MEMORY, 0, 0, 0.0, null); > } > return; > } > However, if the outerCost has an estimated row count of Double.POSITIVE_INFINITY, which can happen if the query is very deeply nested and/or has a lot of FROM tables/expressions, the division of estimatedCost.rowCount() by outerCost.rowCount() will result in a "NaN" value. If that value is then passed to FromTable (which is the base implementation of an Optimizable), the memoryUsageOK method looks like this: > public boolean memoryUsageOK( double rowCount, int maxMemoryPerTable) > throws StandardException > { > /* > ** Don't enforce maximum memory usage for a user-specified join > ** strategy. > */ > if( userSpecifiedJoinStrategy != null) > return true; > int intRowCount = (rowCount > Integer.MAX_VALUE) ? Integer.MAX_VALUE : (int) rowCount; > return intRowCount <= maxCapacity( getCurrentAccessPath().getJoinStrategy(), maxMemoryPerTable); > } > If rowCount is "NaN", the comparison to see if it's greater than MAX_VALUE will return false, which means that intRowCount gets set to "(int)rowCount". But when we cast rowCount, which is "NaN" represented by a double, to an int the result is 0. The final check then becomes "0 <= maxCapacity(...)", which will always return true. Thus regardless of what the estimated cost for the optimizable is, the "memoryUsageOK" check will always return true if the outer cost is infinity, and thus the optimizer could very well decide to choose a path that it should have rejected because of excessive memory requirements (where "should" means based on the estimates; the accuracy of the estimates in this case is another issue altogether). > That said, I went in and made a small change to the above code to cause the Optimizer to reject a plan if it's cost was infinity, and the result was that some queries--esp. those seen in DERBY-1205--actually ended up running more slowly. The reason is that "infinity" is obviously not an accurate cost estimate for the hash joins, and in the case of DERBY-1205 the hash joins, while expensive, still end up being cheaper than nested loop joins. So the result of "fixing" the logic with a small change ended up making the queries run more slowly. Thus more investigation is required regarding to how to best approach this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.