db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dag H. Wanvik (JIRA)" <j...@apache.org>
Subject [jira] Updated: (DERBY-4695) Internal assignment of tablenumer, columnnumber looks wrong in query tree, although no ill effects are seen.
Date Mon, 28 Jun 2010 20:17:50 GMT

     [ https://issues.apache.org/jira/browse/DERBY-4695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Dag H. Wanvik updated DERBY-4695:
---------------------------------

    Attachment: derby-4695-fixcolumnno-1a.diff
                derby-4695-fixcolumnno-1a.stat

Uploading a patch, derby-4679-fixcolumnno-1a, which fixes the wrong column numbers, by assigning
them at the same time as the tableNumber is bound, i.e. in the fromTable classes, rather than
in ColumnReference#bindExpression. The fact that JoinNode replaces the RC, will then not affect
the actually bound columnNumber like before.

Cf the enclosed trace patch which I have used to show what happens, trace-remapping.diff.
This can be applied both with and without the patch proper.

I have ported the original DERBY-2526 query over to JoinTest#testDerby_4695 and added one
more query, so there are two queries, let's call them a and b. 

The query tree after the bind phase before and after are attached as query-a-before.log and
query-a-after.log. There have been normalized, so they can be diffed to see the net difference
in binding, which is that the column reference BVW.C1 was 1,6 before the patch and 1,2 after
the patch. 

For query a, we see the following trace of remappings done as a result of join flattening
before the patch:

remapping CR: B3.C1 before: 0,1 after 5,1 null
remapping CR: B3.C9 before: 0,2 after 5,2 null
remapping CR: B3.C5 before: 0,3 after 5,3 null
remapping CR: B3.C6 before: 0,4 after 5,4 null
remapping CR: B4.C4 before: 6,2 after 6,2 null
remapping CR: B3.C1 before: 0,1 after 0,1 null
remapping CR: B3.C9 before: 0,2 after 0,2 null
remapping CR: B3.C5 before: 0,3 after 0,3 null
remapping CR: B3.C6 before: 0,4 after 0,4 null
remapping CR: BVW.C1 before: 1,6 after 1,2 B2   <-------- wrong input, but ends up correct



and the following after the patch:

remapping CR: B3.C1 before: 0,1 after 5,1 null
remapping CR: B3.C9 before: 0,2 after 5,2 null
remapping CR: B3.C5 before: 0,3 after 5,3 null
remapping CR: B3.C6 before: 0,4 after 5,4 null
remapping CR: B4.C4 before: 6,2 after 6,2 null
remapping CR: B3.C1 before: 0,1 after 0,1 null
remapping CR: B3.C9 before: 0,2 after 0,2 null
remapping CR: B3.C5 before: 0,3 after 0,3 null
remapping CR: B3.C6 before: 0,4 after 0,4 null
remapping CR: BVW.C1 before: 1,2 after 1,2 B2  <--- correct both before and after

For query b, which is a more complicated join, I also upload similar log files, query-b-before.log
and query-b-after.log, which also only differ in their column reference BVW.C1 (several this
time, since the column in used in more places). The query looks like this:

select b3.*, bvw.c1 from b3 inner join bvw on (b3.c1 = bvw.c5) 
                            inner join b4  on (bvw.c1 = b4.c7) 
                            inner join b  on (bvw.c1 = b.c1)
                            inner join b bcorr on bvw.c1 = bcorr.c1
    where b4.c4 = 42"

which is a bit meaningless, but exercises several flattenings involving bvw.c1. This query
also gave a correct answer, but as we can see, during the intermediate steps, the binding
of RC ends up wrong (also one a la DERBY-4679, due to falling back on column name based matching:
since the (tablenumber, column number) pair was wrong at the outset, and so made the new lookup
method introduced for DERBY-4679 fail to do its job. I believe, although I haven't yet been
able to prove it, that the wrong numbering could lead to bugs:

      - wrong or missing transitive closure computation since this is based         on (tn,
cn)

      - lead to yet other wrong intermediate CRs (shown to happen thrice in the example below,
although it doesn't lead to query error here) which could then in turn confuse transitive
closure computation, as seen in DERBY-4679 (CR bound to wrong column instance in a JOIN RCL).
Note that this represents another mode of "correct t-no, wrong col-no" happening: the primary
issue here is the wrong binding in the bind phase, this second mode happens during rebinding
in the optimizer's preprocess phase. I.e. error mode one leads to error mode 2, which has
been shown to cause bugs.

Now, let us look at the differences for query b before the patch:

remapping CR: B3.C1 before: 0,1 after 9,1 null
remapping CR: B3.C9 before: 0,2 after 9,2 null
remapping CR: B3.C5 before: 0,3 after 9,3 null
remapping CR: B3.C6 before: 0,4 after 9,4 null
remapping CR: BVW.C1 before: 1,2 after 9,6 B2
remapping CR: B4.C4 before: 6,2 after 9,11 null
remapping CR: B3.C1 before: 0,1 after 7,1 null
remapping CR: B3.C9 before: 0,2 after 7,2 null
remapping CR: B3.C5 before: 0,3 after 7,3 null
remapping CR: B3.C6 before: 0,4 after 7,4 null
remapping CR: BVW.C1 before: 1,2 after 7,6 B2
remapping CR: B4.C4 before: 6,2 after 7,11 null
remapping CR: BVW.C1 before: 1,6 after 7,1 null <--- wrong!
remapping CR: B3.C1 before: 0,1 after 5,1 null
remapping CR: B3.C9 before: 0,2 after 5,2 null
remapping CR: B3.C5 before: 0,3 after 5,3 null
remapping CR: B3.C6 before: 0,4 after 5,4 null
remapping CR: BVW.C1 before: 1,2 after 5,6 B2
remapping CR: B4.C4 before: 6,2 after 6,2 null
remapping CR: BVW.C1 before: 1,6 after 5,1 null <--- wrong
remapping CR: BVW.C1 before: 1,6 after 5,1 null <--- wrong
remapping CR: B3.C1 before: 0,1 after 0,1 null
remapping CR: B3.C9 before: 0,2 after 0,2 null
remapping CR: B3.C5 before: 0,3 after 0,3 null
remapping CR: B3.C6 before: 0,4 after 0,4 null
remapping CR: BVW.C1 before: 1,2 after 1,2 B2
remapping CR: BVW.C1 before: 1,6 after 1,2 B2 <--- correct with wrong input (rescued ultimately!)
remapping CR: BVW.C1 before: 1,6 after 1,2 B2 <--- ditto
remapping CR: BVW.C1 before: 1,6 after 1,2 B2 <--- ditto

vs after the patch:

remapping CR: B3.C1 before: 0,1 after 9,1 null
remapping CR: B3.C9 before: 0,2 after 9,2 null
remapping CR: B3.C5 before: 0,3 after 9,3 null
remapping CR: B3.C6 before: 0,4 after 9,4 null
remapping CR: BVW.C1 before: 1,2 after 9,6 B2
remapping CR: B4.C4 before: 6,2 after 9,11 null
remapping CR: B3.C1 before: 0,1 after 7,1 null
remapping CR: B3.C9 before: 0,2 after 7,2 null
remapping CR: B3.C5 before: 0,3 after 7,3 null
remapping CR: B3.C6 before: 0,4 after 7,4 null
remapping CR: BVW.C1 before: 1,2 after 7,6 B2
remapping CR: B4.C4 before: 6,2 after 7,11 null
remapping CR: BVW.C1 before: 1,2 after 7,6 B2 <--- correct
remapping CR: B3.C1 before: 0,1 after 5,1 null
remapping CR: B3.C9 before: 0,2 after 5,2 null
remapping CR: B3.C5 before: 0,3 after 5,3 null
remapping CR: B3.C6 before: 0,4 after 5,4 null
remapping CR: BVW.C1 before: 1,2 after 5,6 B2
remapping CR: B4.C4 before: 6,2 after 6,2 null
remapping CR: BVW.C1 before: 1,2 after 5,6 B2 <--- correct
remapping CR: BVW.C1 before: 1,2 after 5,6 B2 <--- correct
remapping CR: B3.C1 before: 0,1 after 0,1 null
remapping CR: B3.C9 before: 0,2 after 0,2 null
remapping CR: B3.C5 before: 0,3 after 0,3 null
remapping CR: B3.C6 before: 0,4 after 0,4 null
remapping CR: BVW.C1 before: 1,2 after 1,2 B2
remapping CR: BVW.C1 before: 1,2 after 1,2 B2 <--- correct
remapping CR: BVW.C1 before: 1,2 after 1,2 B2 <--- correct
remapping CR: BVW.C1 before: 1,2 after 1,2 B2 <--- correct

As can be seen, the fact that BVW.C1 on the join condition is now correctly bound, makes the
fix for DERBY-4679 kick in in yet another instance, and return (7,6) rather than (7,1) after
the second flattening.

[ To see that (7,6) is correct, if we look inside the binding log file for query b, we see
that table 7 corresponds to the join node with join clause BVW.C1 = B4.C7. This join has a
RCL which is 12 columns long. We can see that RC (7,1) corresponds to B3.C1, which is wrong.
After the patch, we see the binding is done to (9,6), which is correct, since that points
to the correct RC in the RCL of table 7, that is a RC corresponding to RC in the join below,
which ultimately stems from the subquery column BVW.C1. ]

So, we see from the trace that for each flattening, the binding for BVW.C1 stays consistent
during the flattening phases *with* the patch, whereas before it was wrong in several places,
and also led to binding to the wrong RC in three cases, although the final flattening saved
the day since the CR is then remapped against a RCL that had no duplicates in it - i.e. the
subquery representing the view - so (1,6) finally became (1,2).

The patch ran regressions ok. While I haven't yet been able to prove that this issue could
lead to query errors, I do think it could, for the reasons cited above. I therefore recommend
we commit this patch as a defensive measure. If people think I should first be able to construct
a wrong query, I can always try again.


> Internal assignment of tablenumer, columnnumber looks wrong in query tree, although no
ill effects are seen.
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4695
>                 URL: https://issues.apache.org/jira/browse/DERBY-4695
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.7.0.0
>            Reporter: Dag H. Wanvik
>            Assignee: Dag H. Wanvik
>            Priority: Minor
>         Attachments: derby-4695-fixcolumnno-1a.diff, derby-4695-fixcolumnno-1a.stat,
derby.log, why-ij-can-do-without-2nd-bind.txt, why-loj-needs-2nd-bind.txt
>
>
> While looking into derby-4679, I also looked at the query in
> derby-2526 to validate that my changes also worked for that variant of
> the problem.
> During inspection of the query trees after the bind phase, I found one
> instance in which the pair (tablenumber, columnnumber) of a column
> reference was wrong. Although it did not seem to impact the query
> result, I note it here as as something we should probably investiate
> as it may be a symptom an underlying problem, or a potential for
> future problems.
> The query looks like this:
> select b3.* from b3 join bvw on (b3.c8 = bvw.c5) join b4 on (bvw.c1 = b4.c7) where b4.c4
= 42"
> and the underlying DDL is this:
> create table b2 (c1 int, c2 int, c3 char(1), c4 int, c5 int, c6 int);
> create table b4 (c7 int, c4 int, c6 int);
> create table b3 (c8 int, c9 int, c5 int, c6 int);
> create table b (c1 int, c2 int, c3 char(1), c4 int, c5 int, c6 int);
> create view bvw (c5, c1 ,c2 ,c3 ,c4) as
> select c5, c1 ,c2 ,c3 ,c4 from b2 union
> select c5, c1 ,c2 ,c3 ,c4 from b;
> create view bvw2 (c1 ,c2 ,c3 ,c4 ,c5) as
> After the bind phase, the join clause "bvw.c1 = b4.c7" has the
> following entry for the column reference bvw.C1:
>           
>           tableNumber: 1
>           columnNumber: 6
> The problem is that the node with tablenumber 1 is bvw, which is the
> view with the subquery for the union, which has only 5 resulting
> columns, so 6 must be wrong. Although both the view participant tables
> (b, b2) both have six column, the view does not. In any case, C1 is
> column 2 in the view and column 2 in the two union selects from both b
> and b2.
> C1 is however, column 6 of the join node resulting from "select b3.*
> from b3 join bvw on (b3.c8 = bvw.c5)", but the correct table number for
> that would be 5, not 1.
> So, it would seem the table number has been bound to the bvw view's
> result set, but the column number has been bound to the innermost join
> node's result set. This looks worrying to me.
> See derby.log attached for the full dump of the query tree after the
> bind phase.
> sourceResultSet:                                    
>     org.apache.derby.impl.sql.compile.FromSubquery@12789d2
>     correlation Name: BVW
>     null
>     tableNumber 1 <------------------------------------------- Note!
>     level 0
>     resultSetNumber: 0
>     referencedTableMap: null
>     statementResultSet: false
>     resultColumns:                                      
>         org.apache.derby.impl.sql.compile.ResultColumnList@c943d1
>         indexRow: false
>         orderBySelect: 0
>         [0]:                                        
>         org.apache.derby.impl.sql.compile.ResultColumn@d3c6a3
>         ***truncated***
>         [1]:                                        
>         org.apache.derby.impl.sql.compile.ResultColumn@18352d8
>         exposedName: C1
>         name: C1
>         tableName: null
>         isDefaultColumn: false
>         wasDefaultColumn: false
>         isNameGenerated: false
>         sourceTableName: B2
>         type: INTEGER
>         columnDescriptor: null
>         isGenerated: false
>         isGeneratedForUnmatchedColumnInInsert: false
>         isGroupingColumn: false
>         isReferenced: true
>         isRedundant: false
>         virtualColumnId: 2
>         resultSetNumber: -1
>         dataTypeServices: INTEGER
>         expression:                                         
>             org.apache.derby.impl.sql.compile.VirtualColumnNode@b40ec4
>             dataTypeServices: null
>             sourceColumn:                                               
>                 org.apache.derby.impl.sql.compile.ResultColumn@1d95da8
>                 ***truncated***
>             sourceResultSet:                                                
>                 org.apache.derby.impl.sql.compile.UnionNode@14d7745
>                 ***truncated***
>         [2]:                                        
>         org.apache.derby.impl.sql.compile.ResultColumn@13576a2
>         exposedName: C2
>         name: C2
>         tableName: null
>         isDefaultColumn: false
>         wasDefaultColumn: false
>         isNameGenerated: false
>         sourceTableName: B2
>         type: INTEGER
>         columnDescriptor: null
>         isGenerated: false
>         isGeneratedForUnmatchedColumnInInsert: false
>         isGroupingColumn: false
>         isReferenced: true
>         isRedundant: false
>         virtualColumnId: 3
>         resultSetNumber: -1
>         dataTypeServices: INTEGER
>         expression:                                         
>             org.apache.derby.impl.sql.compile.VirtualColumnNode@ff8c74
>             dataTypeServices: null
>             sourceColumn:                                               
>                 org.apache.derby.impl.sql.compile.ResultColumn@61736e
>                 ***truncated***
>             sourceResultSet:                                                
>                 org.apache.derby.impl.sql.compile.UnionNode@14d7745
>                 ***truncated***
>         [3]:                                        
>         org.apache.derby.impl.sql.compile.ResultColumn@15e2ccd
>         exposedName: C3
>         name: C3
>         tableName: null
>         isDefaultColumn: false
>         wasDefaultColumn: false
>         isNameGenerated: false
>         sourceTableName: B2
>         type: CHAR(1)
>         columnDescriptor: null
>         isGenerated: false
>         isGeneratedForUnmatchedColumnInInsert: false
>         isGroupingColumn: false
>         isReferenced: true
>         isRedundant: false
>         virtualColumnId: 4
>         resultSetNumber: -1
>         dataTypeServices: CHAR(1)
>         expression:                                         
>             org.apache.derby.impl.sql.compile.VirtualColumnNode@1cf7491
>             dataTypeServices: null
>             sourceColumn:                                               
>                 org.apache.derby.impl.sql.compile.ResultColumn@11946c2
>                 ***truncated***
>             sourceResultSet:                                                
>                 org.apache.derby.impl.sql.compile.UnionNode@14d7745
>                 ***truncated***
>         [4]:    <----------------------------------------- highest column number is
5 (index is zero-based)
>         org.apache.derby.impl.sql.compile.ResultColumn@edf730
>         exposedName: C4
>         name: C4
>         tableName: null
>         isDefaultColumn: false
>         wasDefaultColumn: false
>         isNameGenerated: false
>         sourceTableName: B2
>         type: INTEGER
>         columnDescriptor: null
>         isGenerated: false
>         isGeneratedForUnmatchedColumnInInsert: false
>         isGroupingColumn: false
>         isReferenced: true
>         isRedundant: false
>         virtualColumnId: 5
>         resultSetNumber: -1
>         dataTypeServices: INTEGER
>         expression:                                         
>             org.apache.derby.impl.sql.compile.VirtualColumnNode@ff94b1
>             dataTypeServices: null
>             sourceColumn:                                               
>                 org.apache.derby.impl.sql.compile.ResultColumn@17a4989
>                 ***truncated***
>             sourceResultSet:                                                
>                 org.apache.derby.impl.sql.compile.UnionNode@14d7745
>                 ***truncated***
>     subquery:                                       
>         org.apache.derby.impl.sql.compile.UnionNode@14d7745
>         ***truncated***

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message