db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dag H. Wanvik (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-4712) Complex nested joins problems
Date Wed, 08 Sep 2010 21:17:33 GMT

    [ https://issues.apache.org/jira/browse/DERBY-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12907434#action_12907434

Dag H. Wanvik commented on DERBY-4712:

NPE preliminary analysis:

Referring to the drawing in the uploading document "drawing.txt", note
the following:

In the simpler query that works, after optimization, there is a PRN
sitting above the JoinNode (labeled "JN@60") which holds the predicate
X0=X2. In the bad query, there is no such PRN, the JoinNode (labeled
JN@45) sits directly underneath the Left Outer Join Node (labeled

This seemed puzzling, since the simpler query is the exact left side of
the bad query.

Tracing the creation of PRNs, I found that the PRN in the simpler
query gets created in TableOperatorNode (superclass of JoinNode) *iff*
"isFlattenableJoinNode()" returns false for the JoinNode,
cf. TableOperatorNode#preprocess.

Now, by tracing this in the simple, working example, I see that
isFlattenableJoinNode does in deed return false. However, in the bad
query, it returns true! Why should that be? 

It turns out the for the working query, FromList#preprocess calls
transformOuterJoins for each table in the from list. If the top node
in the from list is a LOJ, which it is in the simplified, working
query, the method called will be
HalfOuterJoinNode#transformOuterJoins. Since there is no WHERE
predicate, the argument "predicateTree" will be null, and
transformOuterJoins will return "this", but only after calling


on its child result sets. This accounts for the fact that in the good
query, the PRN gets inserted.

Now, in the bad query, again FromList#preprocess calls
transformOuterJoins, but this time on the top plain JoinNode. Also
here, we see the method return "this" is the predicateTree is null (no
WHERE clause), but it makes no attempt to propagate this information
down to any inferior outer join nodes. This accounts for the fact that
"LOJ@33" in the drawing never gets to call notFlattenableJoin on
"JN@45", and hence it never gets a PRN above it.

By adding the following lines to JoinNode#transformOuterJoins, the NPE
goes away for all the sample queries in DERBY4712 (I have not yet
verified their correctness though):
if (predicateTree == null) {
   // Make sure any nested outer joins know we are non flattenable, too,
   // since they inform their left and right sides. If result sets are
   // not outer joins, these are no-ops. Cf. DERBY-4712.
   ((FromTable) leftResultSet).transformOuterJoins(null, numTables);
   ((FromTable) rightResultSet).transformOuterJoins(null, numTables);
   return this;

Possibly, we need to add these two extra lines to the same place in HalfOuterJoinNode, too.
I'll see.

So, in summary, I think that the NPE results from an query tree
inconsistency in that the missing PRN indicates that the JoinNode is
flattenable whereas it is not really, and the join predicate gets
pushed down to far.

> Complex nested joins problems
> -----------------------------
>                 Key: DERBY-4712
>                 URL: https://issues.apache.org/jira/browse/DERBY-4712
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions:
>            Reporter: Thomas Mueller
>            Assignee: Dag H. Wanvik
>            Priority: Minor
>         Attachments: assert-bind-opt-trees.log, assert-bind-opt-trees.txt, derby.log.analyzed,
derby.log.simpler, drawing.txt
> I ran a randomized test for nested joins against PostgreSQL, Derby, and H2, and found
some problems with Derby. The queries below throw NullPointerExceptions; the last query an
assertion. The test case is here: http://code.google.com/p/h2database/source/browse/trunk/h2/src/test/org/h2/test/db/TestNestedJoins.java
. There are probably shorter queries that are problematic, but I can't test it because Derby
closes the connection after the assertion.
> create table t0(x int);
> create table t1(x int);
> create table t2(x int);
> create table t3(x int);
> create table t4(x int);
> insert into t4 values(0);
> insert into t4 values(1);
> insert into t4 values(2);
> insert into t4 values(3);
> create table t5(x int);
> insert into t5 values(0);
> insert into t5 values(1);
> insert into t5 values(2);
> insert into t5 values(3);
> insert into t5 values(4);
> create table t6(x int);
> insert into t6 values(0);
> insert into t6 values(1);
> insert into t6 values(2);
> insert into t6 values(3);
> insert into t6 values(4);
> insert into t6 values(5);
> create table t7(x int);
> insert into t7 values(0);
> insert into t7 values(1);
> insert into t7 values(2);
> insert into t7 values(3);
> insert into t7 values(4);
> insert into t7 values(5);
> insert into t7 values(6);
> create table t8(x int);
> insert into t8 values(0);
> insert into t8 values(1);
> insert into t8 values(2);
> insert into t8 values(3);
> insert into t8 values(4);
> insert into t8 values(5);
> insert into t8 values(6);
> insert into t8 values(7);
> create table t9(x int);
> insert into t9 values(0);
> insert into t9 values(1);
> insert into t9 values(2);
> insert into t9 values(3);
> insert into t9 values(4);
> insert into t9 values(5);
> insert into t9 values(6);
> insert into t9 values(7);
> insert into t9 values(8);
> insert into t0 values(1);
> insert into t1 values(2);
> insert into t0 values(3);
> insert into t1 values(3);
> insert into t2 values(4);
> insert into t0 values(5);
> insert into t2 values(5);
> insert into t1 values(6);
> insert into t2 values(6);
> insert into t0 values(7);
> insert into t1 values(7);
> insert into t2 values(7);
> insert into t3 values(8);
> insert into t0 values(9);
> insert into t3 values(9);
> insert into t1 values(10);
> insert into t3 values(10);
> insert into t0 values(11);
> insert into t1 values(11);
> insert into t3 values(11);
> insert into t2 values(12);
> insert into t3 values(12);
> insert into t0 values(13);
> insert into t2 values(13);
> insert into t3 values(13);
> insert into t1 values(14);
> insert into t2 values(14);
> insert into t3 values(14);
> insert into t0 values(15);
> insert into t1 values(15);
> insert into t2 values(15);
> insert into t3 values(15);
> select t0.x , t1.x , t2.x , t3.x , t4.x , t5.x , t6.x , t7.x , t8.x from (((t0 inner
join ((t1 right outer join (t2 inner join t3 on t2.x = t3.x ) on t1.x = t2.x ) left outer
join (t4 inner join t5 on t4.x = t5.x ) on t1.x = t4.x ) on t0.x = t2.x ) left outer join
(t6 inner join t7 on t6.x = t7.x ) on t1.x = t6.x ) inner join t8 on t5.x = t8.x );
> select t0.x , t1.x , t2.x , t3.x , t4.x , t5.x , t6.x , t7.x from ((t0 right outer join
t1 on t0.x = t1.x ) inner join (((t2 inner join (t3 left outer join t4 on t3.x = t4.x ) on
t2.x = t3.x ) right outer join t5 on t2.x = t5.x ) left outer join (t6 inner join t7 on t6.x
= t7.x ) on t4.x = t6.x ) on t0.x = t5.x );
> select t0.x , t1.x , t2.x , t3.x , t4.x , t5.x , t6.x , t7.x from ((((t0 left outer join
t1 on t0.x = t1.x ) right outer join t2 on t0.x = t2.x ) right outer join t3 on t0.x = t3.x
) inner join ((t4 inner join t5 on t4.x = t5.x ) right outer join (t6 right outer join t7
on t6.x = t7.x ) on t4.x = t6.x ) on t1.x = t4.x );
> select t0.x , t1.x , t2.x , t3.x , t4.x , t5.x from (((t0 inner join t1 on t0.x = t1.x
) right outer join (t2 right outer join t3 on t2.x = t3.x ) on t0.x = t2.x ) inner join (t4
left outer join t5 on t4.x = t5.x ) on t1.x = t4.x );
> select t0.x , t1.x , t2.x , t3.x , t4.x , t5.x , t6.x from ((t0 right outer join (t1
right outer join (t2 left outer join (t3 left outer join t4 on t3.x = t4.x ) on t2.x = t3.x
) on t1.x = t3.x ) on t0.x = t1.x ) left outer join (t5 inner join t6 on t5.x = t6.x ) on
t2.x = t5.x );

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message