Message-ID: <440C8E65.7020505@sbcglobal.net>
Date: Mon, 06 Mar 2006 11:32:53 -0800
From: Army
To: derby-dev@db.apache.org
Subject: Re: [OPTIMIZER] Optimizer "timeout" for subqueries?
In-Reply-To: <4408B7D7.3070801@Sourcery.Org>

Satheesh Bandaram wrote:
> Just remembered, during discussions with Jeffl, he mentioned another one...
> Improve unnesting of subqueries with more than one table. More work. :-)

One other thing to add to this list was mentioned a while ago on derby-dev here:

http://mail-archives.apache.org/mod_mbox/db-derby-dev/200412.mbox/%3c41B9F482.4020900@sbcglobal.net%3e

In that email Jack talked about his BackingStoreHashtable changes and the need to update Optimizer costing to account for those changes (which were committed as part of 10.1). Specifically, Jack says:

> I would like to work on this, changing BackingStoreHashtable to spill to
> disk when the hash table gets large, and changing the optimizer to
> understand that large hash tables are allowed, but more costly.

The changes to spill to disk were implemented for 10.1, but I don't think the optimizer was ever updated accordingly.
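For reference, the rough idea behind the spill behavior is something like the sketch below. This is illustration only -- the names (SpillingHashTable, maxInMemoryRows, diskOverflow) are made up and this is not the actual BackingStoreHashtable code -- but it shows the "allowed, but more costly" behavior Jack describes: rows go into memory until a threshold is crossed, and everything after that goes to a slower disk-backed store instead of being rejected.

    import java.util.HashMap;
    import java.util.Map;

    // Illustrative sketch only; not the real BackingStoreHashtable.
    class SpillingHashTable<K, V>
    {
        private final Map<K, V> inMemory = new HashMap<K, V>();
        private final Map<K, V> diskOverflow;  // stands in for a disk-backed store
        private final int maxInMemoryRows;

        SpillingHashTable(int maxInMemoryRows, Map<K, V> diskOverflow)
        {
            this.maxInMemoryRows = maxInMemoryRows;
            this.diskOverflow = diskOverflow;
        }

        void put(K key, V row)
        {
            // Keep rows in memory until the threshold is crossed, then send
            // further rows to the (slower) overflow store instead of failing.
            if (inMemory.size() < maxInMemoryRows)
                inMemory.put(key, row);
            else
                diskOverflow.put(key, row);
        }

        V get(K key)
        {
            V row = inMemory.get(key);
            return (row != null) ? row : diskOverflow.get(key);
        }
    }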
In particular, the Optimizer still checks the memory requirements for the hash join and, if they are too high, it skips that access path entirely. I think the code that does this is in the OptimizerImpl.considerCost() method:

    /*
    ** Skip this access path if it takes too much memory.
    **
    ** NOTE: The default assumption here is that the number of rows in
    ** a single scan is the total number of rows divided by the number
    ** of outer rows.  The optimizable may over-ride this assumption.
    **
    ** NOTE: This is probably not necessary here, because we should
    ** get here only for nested loop joins, which don't use memory.
    */
    if (!optimizable.memoryUsageOK(
            estimatedCost.rowCount() / outerCost.rowCount(),
            maxMemoryPerTable))
    {
        if (optimizerTrace)
        {
            trace(SKIPPING_DUE_TO_EXCESS_MEMORY, 0, 0, 0.0, null);
        }
        return;
    }

Note that the comment regarding "this is probably not necessary" appears to be out of date--there are situations where we can and do get to this code when considering the cost of hash joins.

So as another "To-Do", I think we need to somehow update the optimizer to account for the fact that if the hash join requires a lot of memory and thus will (probably?) spill over, the cost should be adjusted accordingly--instead of just skipping the plan altogether.

Army
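P.S. To make that last point a bit more concrete, below is roughly the kind of adjustment I'm picturing. This is only a sketch -- the method name spillAdjustedCost, the diskAccessMultiplier parameter, and the linear formula are all made up, and a real change would have to be expressed through whatever the CostEstimate/Optimizable interfaces actually allow -- but the idea is to penalize an over-budget hash join instead of refusing it outright.

    // Sketch only; not a patch against OptimizerImpl.
    class SpillCostSketch
    {
        /**
         * inMemoryCost:         cost estimate assuming the hash table fits in memory
         * perScanRows:          estimated rows per scan
         *                       (estimatedCost.rowCount() / outerCost.rowCount())
         * maxRowsInMemory:      rows that fit within the per-table memory budget
         * diskAccessMultiplier: how much more a probe of the spilled portion costs
         */
        static double spillAdjustedCost(double inMemoryCost,
                                        double perScanRows,
                                        double maxRowsInMemory,
                                        double diskAccessMultiplier)
        {
            if (perScanRows <= maxRowsInMemory)
            {
                return inMemoryCost;   // everything fits; no penalty
            }

            // Fraction of the hash table that would overflow to disk.
            double spilledFraction = (perScanRows - maxRowsInMemory) / perScanRows;

            // Charge the spilled fraction at a multiple of the in-memory cost,
            // since probing the disk-backed portion is slower.
            return inMemoryCost * (1.0 + spilledFraction * diskAccessMultiplier);
        }

        public static void main(String[] args)
        {
            // Example: 100,000 rows per scan, 10,000 fit in memory, disk ~5x slower.
            // 90% spilled at a 5x penalty => cost goes from 200.0 to 1100.0.
            System.out.println(spillAdjustedCost(200.0, 100000, 10000, 5.0));
        }
    }

Whether the penalty should really be linear in the spilled fraction (or should account for the build side vs. the probe side of the hash join separately) is exactly the kind of thing that would need investigation as part of this work.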