db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "A B (JIRA)" <derby-...@db.apache.org>
Subject [jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
Date Thu, 23 Feb 2006 02:26:15 GMT
     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

A B updated DERBY-805:
----------------------

    Attachment: d805_phase1_v3.patch
                d805_phase1_v3.stat

Attaching an updated patch d805_phase1_v3.patch to address the issue in my last comment. 
The differences between d805_phase1_v3.patch and d805_phase1_v2.patch are summarized here:

1) There was a slight off-by-one error in the optimizer JUMP logic that was causing it to
pull Optimizables prematurely.  This error is what led me to realize that phase1_v2 was incomplete--however,
the JUMP error _is_ still an error, so phas1_v3 fixes it (one-line change).

2) phase1_v3 adds logic to only reload best plans when required, instead of doing it every
single time we pull (which is what phase1_v2 did).  Since plan "reload" can be relatively
expensive--especially for deeply-nested subqueries--it's better to only do it when it's required.

3) phase1_v3 adds logic to remember nested OptimizerImpl "best join orders" in addition to
nested Optimizable "best access paths".  The join orders have to saved with respect to outer
queries, just like the access paths, but phase1_v2 did not do that.  phase1_v3 does.

I ran derbyall on Red Hat Linux with IBM 1.4 and saw one diff in predicatesIntoViews that
matches the diff from DERBY-407.  In that issue the diff was occuring with slower machines
but not with faster machines.  With my phase1_v3.patch the diff occurs consistently regardless
of the speed of my CPU, and I believe the diff is correct. So I've included a master update
for that test as part of d805_phase1_v3.patch.

If anyone has time to review, I'd be grateful...

**NOTE to commiters: There are now two patches pending for this issue: d805_phase1_v3.patch
and d805_phase2_v1.patch.  The patches are independent of each other and can be committed
in either order.

> Push join predicates into union and other set operations. DERBY-649 implemented scalar
(single table) predicate pushdown. Adding join predicate push down could improve performance
significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, DERBY-805_v3.html, d805_phase1_v1.patch,
d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat, d805_phase1_v3.patch, d805_phase1_v3.stat,
d805_phase2_v1.patch, d805_phase2_v1.stat
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs.
While this improves performance for one set of queries, ability to push join-predicates further
improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix
for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index.
By pushing join predicate, Derby would be able to use the index and improve performance. Some
of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to
about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying
parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins
into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into
the union and optimized there, but if t3 is the inner table the qualification can't be pushed
into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible
to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during
optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to
understand and debug, so maybe it's best to only handle the simple cases and do it during
preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


Mime
View raw message