Return-Path: Delivered-To: apmail-db-derby-dev-archive@www.apache.org Received: (qmail 47710 invoked from network); 17 Feb 2006 01:10:49 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (209.237.227.199) by minotaur.apache.org with SMTP; 17 Feb 2006 01:10:49 -0000 Received: (qmail 72746 invoked by uid 500); 17 Feb 2006 01:10:48 -0000 Delivered-To: apmail-db-derby-dev-archive@db.apache.org Received: (qmail 72722 invoked by uid 500); 17 Feb 2006 01:10:48 -0000 Mailing-List: contact derby-dev-help@db.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: Delivered-To: mailing list derby-dev@db.apache.org Received: (qmail 72713 invoked by uid 99); 17 Feb 2006 01:10:48 -0000 X-ASF-Spam-Status: No, hits=1.3 required=10.0 tests=SPF_FAIL X-Spam-Check-By: apache.org Received: from [192.87.106.226] (HELO ajax.apache.org) (192.87.106.226) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 16 Feb 2006 17:10:47 -0800 Received: from ajax.apache.org (ajax.apache.org [127.0.0.1]) by ajax.apache.org (Postfix) with ESMTP id B2A0EDE for ; Fri, 17 Feb 2006 02:10:26 +0100 (CET) Message-ID: <1524991184.1140138626729.JavaMail.jira@ajax.apache.org> Date: Fri, 17 Feb 2006 02:10:26 +0100 (CET) From: "A B (JIRA)" To: derby-dev@db.apache.org Subject: [jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly. In-Reply-To: <688763290.1136837782912.JavaMail.jira@ajax.apache.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Virus-Checked: Checked by ClamAV on apache.org X-Spam-Rating: minotaur.apache.org 1.6.2 0/1000/N [ http://issues.apache.org/jira/browse/DERBY-805?page=comments#action_12366731 ] A B commented on DERBY-805: --------------------------- Based on the 6 steps that I've outlined in DERBY-805.html, I plan to post four separate patches for this issue, in the following order: -- Phase 1 -- The first patch will implement the changes as described in "Step 5" of the document, namely: 5 - Ensure that the best access path for a UnionNode that pushes predicates is correctly saved during optimization and correctly retrieved when it comes time to finalize the query's overall access path. The phase 1 patch shouldn't have any functional effect on the codeline; it's just an incremental step toward the complete fix. -- Phase 2 -- The second patch will add code for "Step 1", which is: 1 - Add the ability to take a predicate and scope it to a target result set so that it can be pushed to that result set. The phase 2 patch will add code to the codeline that won't actually get executed until phase 3. Thus, like the phase 1 patch, the phase 2 patch should have no functional effects on the codeline. -- Phase 3 -- The third patch will be the one that does the core of the predicate pushing/pulling. This patch will cover Steps 2, 3, 4, and 6 of the steps outlined in the document. Once this patch is committed, the changes for phase 1 and phase 2 will take effect--so phase 3 is also where I'll add tests to make sure predicates are being pushed correctly. Note that after phase 3, any join predicates which are targeted for UNIONS will _always_ be pushed down into the union (assuming it's possible to do so). This means that, as discussed in the document, there will be cases where Derby originally chose to do a Hash join using the predicate but will now (after phase 3) push the predicate and do a nested loop join. This problem will then be addressed in phase 4. -- Phase 4 -- The phase 4 patch will address any unresolved issues from the first three phases, which right now means that it will include the necessary changes to allow the optimizer to make a cost-based decision about whether or not it should push the predicates, instead of always pushing them. The vast majority of the changes will go into the code as described in DERBY-805.html, with potential variations due to review comments and/or improvements/alterations that I find to be necessary as I work. So anyone who might have the opportunity to review any of the phase 1-4 patches will hopefully find it useful to read the respective parts of the document... > Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly. > -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Key: DERBY-805 > URL: http://issues.apache.org/jira/browse/DERBY-805 > Project: Derby > Type: Sub-task > Components: SQL > Versions: 10.1.2.0, 10.2.0.0 > Environment: generic > Reporter: Satheesh Bandaram > Assignee: A B > Fix For: 10.2.0.0 > Attachments: DERBY-805.html > > Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible. > For example, > create view V1 as select i, j from T1 union all select i,j from T2; > create view V2 as select a,b from T3 union all select a,b from T4; > insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); > For a query like > select * from V1, V2 where V1.j = V2.b and V1.i =1; > If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second. > Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified) > If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order. > For example: > create view v as select * from t1 union all select * from t2; > select * from v, t3 where v.c1 = t3.c2; > In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union. > If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed. > There's a comment in UnionNode.optimizeIt() saying: > /* RESOLVE - don't try to push predicated through for now */ > This is where I'd expect to see something for pushing predicates into the union during optimization. > BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira