db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dag H. Wanvik (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (DERBY-4679) Several left outer joins causes unstable query with incorrect results
Date Wed, 23 Jun 2010 18:23:51 GMT

    [ https://issues.apache.org/jira/browse/DERBY-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879934#action_12879934
] 

Dag H. Wanvik edited comment on DERBY-4679 at 6/23/10 2:23 PM:
---------------------------------------------------------------

I did some more research to understand the limitations of the approach
taken by the fix in this issue, i.e. locating the result column based
on the (tablenumber, columnnumber) pair. It turns out it fails to work
also when the column reference is to a subquery, which can be
flattened before we get the the join flattening, so the table number
of the CR is no longer to be found, and we fall back on the "buggy" approach of
looking for the column name described above.

A slight reformulation of the original problem query makes the bug
reappear:

    "select distinct t1.ITEM_ID, t1.state_id, t1.JZ_DISCRIMINATOR from " +
    "((((((select * from ABSTRACT_INSTANCE z1 where z1.JZ_DISCRIMINATOR = 238) t1 " +
    "      left outer join LAB_RESOURCE_OPERATINGSYSTEM j1 on (t1.ITEM_ID = j1.JZ_PARENT_ID))
" +
    "     left outer join ABSTRACT_INSTANCE t2 on (j1.ITEM_ID = t2.ITEM_ID)) " +
    "    left outer join OPERATING_SYSTEM_SOFTWARE_INSTALL j2 on (t2.ITEM_ID = j2.JZ_PARENT_ID))"
+
    "   left outer join (select * from ABSTRACT_INSTANCE) tCorr on (j2.ITEM_ID = tCorr.ITEM_ID)
" +
    "  inner join FAMILY t5 on (t2.FAMILY_ITEM_ID = t5.ITEM_ID)) " +
    " inner join FAMILY t7 on (t1.FAMILY_ITEM_ID = t7.ITEM_ID)) " +
    "where (tCorr.FAMILY_ITEM_ID IN('_5VetVWTeEd-Q8aOqWJPEIQ') and " +
    "      (t5.ROOT_ITEM_ID = '_5ZDlwWTeEd-Q8aOqWJPEIQ') and " +
    "      (t7.ROOT_ITEM_ID ='_5nN9mmTeEd-Q8aOqWJPEIQ') and " +
    "      (t1.VISIBILITY = 0))");

Note that now, instead of t3, we now have a subquery, tCorr.

However, during my research on DERBY-4695, I reported that skipping
the 2nd bind block in JoinNode (unless we have an OUTER JOIN), made
the observed numbering inconsistency go away.

Well, it turns out that skipping the second bind for inner joins makes
the above query variant work, too, as well as the original problem
query in this issue, *without* the fix I have already committed for
this issue!

It could be that this is the root problem. Since it also solves the
above variant, and removes the wrong numbering seen, it seems a better
fix anyway, if regressions tests still hold up. We'll see that soon.

[Edit: 2010-06-24:
It is not the root cause, because the second bind is needed, cf
conclusions in DERBY-4695. The approach is broken, though, for two
reasons:

        a) it sometimes leads to wrong column number (see DERBY-4695).

        b) we can't rebind correctly when subqueries flattened
        (cf. the above query variant) because the node (and the table
        number) of the column reference representing the subquery
        column is gone, so the approach taken here (i.e. to match the
        result column using table number and column number of the CR)
        fails to locate a matching result column, forcing us back to
        the old approach, which fails due to column names not being
        unique.
]


      was (Author: dagw):
    I did some more research to understand the limitations of the approach
taken by the fix in this issue, i.e. locating the result column based
on the (tablenumber, columnnumber) pair. It turns out it fails to work
also when the column reference is to a subquery, which can be
flattened before we get the the join flattening, so the table number
of the CR is no longer to be found, and we fall back on the "buggy" approach of
looking for the column name described above.

A slight reformulation of the original problem query makes the bug
reappear:

    "select distinct t1.ITEM_ID, t1.state_id, t1.JZ_DISCRIMINATOR from " +
    "((((((select * from ABSTRACT_INSTANCE z1 where z1.JZ_DISCRIMINATOR = 238) t1 " +
    "      left outer join LAB_RESOURCE_OPERATINGSYSTEM j1 on (t1.ITEM_ID = j1.JZ_PARENT_ID))
" +
    "     left outer join ABSTRACT_INSTANCE t2 on (j1.ITEM_ID = t2.ITEM_ID)) " +
    "    left outer join OPERATING_SYSTEM_SOFTWARE_INSTALL j2 on (t2.ITEM_ID = j2.JZ_PARENT_ID))"
+
    "   left outer join (select * from ABSTRACT_INSTANCE) tCorr on (j2.ITEM_ID = tCorr.ITEM_ID)
" +
    "  inner join FAMILY t5 on (t2.FAMILY_ITEM_ID = t5.ITEM_ID)) " +
    " inner join FAMILY t7 on (t1.FAMILY_ITEM_ID = t7.ITEM_ID)) " +
    "where (tCorr.FAMILY_ITEM_ID IN('_5VetVWTeEd-Q8aOqWJPEIQ') and " +
    "      (t5.ROOT_ITEM_ID = '_5ZDlwWTeEd-Q8aOqWJPEIQ') and " +
    "      (t7.ROOT_ITEM_ID ='_5nN9mmTeEd-Q8aOqWJPEIQ') and " +
    "      (t1.VISIBILITY = 0))");

Note that now, instead of t3, we now have a subquery, tCorr.

However, during my research on DERBY-4695, I reported that skipping
the 2nd bind block in JoinNode (unless we have an OUTER JOIN), made
the observed numbering inconsistency go away.

Well, it turns out that skipping the second bind for inner joins makes
the above query variant work, too, as well as the original problem
query in this issue, *without* the fix I have already committed for
this issue!

It could be that this is the root problem. Since it also solves the
above variant, and removes the wrong numbering seen, it seems a better
fix anyway, if regressions tests still hold up. We'll see that soon.

  
> Several left outer joins causes unstable query with incorrect results
> ---------------------------------------------------------------------
>
>                 Key: DERBY-4679
>                 URL: https://issues.apache.org/jira/browse/DERBY-4679
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.0.2.1, 10.1.1.0, 10.1.2.1, 10.1.3.1, 10.2.1.6, 10.2.2.0, 10.3.1.4,
10.3.2.1, 10.3.3.0, 10.4.1.3, 10.4.2.0, 10.5.1.1, 10.5.2.0, 10.5.3.0, 10.6.1.0
>         Environment: ------------------ Java Information ------------------
> Java Version:    1.5.0
> Java Vendor:     IBM Corporation
> Java home:       C:\jazz-rtc-2.0.0.2\client\eclipse\jdk\jre
> Java classpath:  C:\Progra~1\Derby\Derby10.5.3/lib/derby.jar;C:\Progra~1\Derby\Derby10.5.3/lib/derbynet.jar;C:\Progra~1\Derby\Derby10.5.3/lib/derbyclient.jar;C:\Progra~1\Derby\Derby10.5.3/lib/derbytools.jar
> OS name:         Windows XP
> OS architecture: x86
> OS version:      5.1 build 2600 Service Pack 3
> Java user name:  dirichar
> Java user home:  C:\Documents and Settings\dirichar
> Java user dir:   C:\Program Files\Derby\Derby10.5.3\bin
> java.specification.name: Java Platform API Specification
> java.specification.version: 1.5
> --------- Derby Information --------
> JRE - JDBC: J2SE 5.0 - JDBC 3.0
> [C:\Program Files\Derby\Derby10.5.3\lib\derby.jar] 10.5.3.0 - (802917)
> [C:\Program Files\Derby\Derby10.5.3\lib\derbytools.jar] 10.5.3.0 - (802917)
> [C:\Program Files\Derby\Derby10.5.3\lib\derbynet.jar] 10.5.3.0 - (802917)
> [C:\Program Files\Derby\Derby10.5.3\lib\derbyclient.jar] 10.5.3.0 - (802917)
> ------------------------------------------------------
> ----------------- Locale Information -----------------
> Current Locale :  [English/United States [en_US]]
> Found support for locale: [cs]
> 	 version: 10.5.3.0 - (802917)
> Found support for locale: [de_DE]
> 	 version: 10.5.3.0 - (802917)
> Found support for locale: [es]
> 	 version: 10.5.3.0 - (802917)
> Found support for locale: [fr]
> 	 version: 10.5.3.0 - (802917)
> Found support for locale: [hu]
> 	 version: 10.5.3.0 - (802917)
> Found support for locale: [it]
> 	 version: 10.5.3.0 - (802917)
> Found support for locale: [ja_JP]
> 	 version: 10.5.3.0 - (802917)
> Found support for locale: [ko_KR]
> 	 version: 10.5.3.0 - (802917)
> Found support for locale: [pl]
> 	 version: 10.5.3.0 - (802917)
> Found support for locale: [pt_BR]
> 	 version: 10.5.3.0 - (802917)
> Found support for locale: [ru]
> 	 version: 10.5.3.0 - (802917)
> Found support for locale: [zh_CN]
> 	 version: 10.5.3.0 - (802917)
> Found support for locale: [zh_TW]
> 	 version: 10.5.3.0 - (802917)
> ------------------------------------------------------
>            Reporter: David Richards
>            Assignee: Dag H. Wanvik
>         Attachments: db.sql, derby-4679-followup.diff, derby-4679-followup.stat, derby-4679a.diff,
derby-4679a.stat, derby-4679b.diff, derby-4679b.stat, drawing.txt, drawing.txt, equal-after-preprocess.log,
Foo.java, in-after-preprocess.log, ins-after-preprocess.log, sample.sql, show.diff, trace-equal.log,
trace-exec-plan.log, trace-in.log
>
>
> select distinct 
>     t1.ITEM_ID, 
>     t1.STATE_ID, 
>     t1.JZ_DISCRIMINATOR 
> from (
>     select * 
>     from 
>         LM.ABSTRACT_INSTANCE z1 
>     where 
>         z1.JZ_DISCRIMINATOR = 238
> ) t1 left outer join 
>     LM.LAB_RESOURCE_OPERATINGSYSTEM j1 on (t1.ITEM_ID = j1.JZ_PARENT_ID) left outer join

>          LM.ABSTRACT_INSTANCE t2 on (j1.ITEM_ID = t2.ITEM_ID) left outer join 
>              LM.OPERATING_SYSTEM_SOFTWARE_INSTALL j2 on (t2.ITEM_ID = j2.JZ_PARENT_ID)
left outer join 
>                  LM.ABSTRACT_INSTANCE t3 on (j2.ITEM_ID = t3.ITEM_ID) left outer join

>                      LM.FAMILY t5 on (t2.FAMILY_ITEM_ID = t5.ITEM_ID) left outer join

>                          LM.FAMILY t7 on (t1.FAMILY_ITEM_ID = t7.ITEM_ID) 
> where (
>     t3.FAMILY_ITEM_ID  in('_5VetVWTeEd-Q8aOqWJPEIQ') and 
>     (t5.ROOT_ITEM_ID = '_5ZDlwWTeEd-Q8aOqWJPEIQ') and 
>     (t7.ROOT_ITEM_ID = '_5nN9mmTeEd-Q8aOqWJPEIQ') and 
>     (t1.VISIBILITY = 0)
> )
> The above query returns no results despite the fact that the database contains results
that match the query.  
> Slight modifications to the query that shouldn't change the outcome cause it to return
the expected results.  For example: changing 
> "t3.FAMILY_ITEM_ID  in('_5VetVWTeEd-Q8aOqWJPEIQ')" 
> to 
> "t3.FAMILY_ITEM_ID  in('_5VetVWTeEd-Q8aOqWJPEIQ', 'blah')" 
> or 
> "t3.FAMILY_ITEM_ID  = '_5VetVWTeEd-Q8aOqWJPEIQ'"
> or removing  
> "(t5.ROOT_ITEM_ID = '_5ZDlwWTeEd-Q8aOqWJPEIQ') and "
> despite the fact that the results show that t5.ROOT_ITEM_ID is equal to the string above.
> Note that there is no error message associated with the incorrect query, it just returns
zero rows.
> The query below doesn't use join statements and works as expected.
> select distinct 
>    t1.item_id, 
>    t1.state_id, 
>    t1.jz_discriminator
> from (
>         select * 
>         from 
>             lm.abstract_instance z1 
>         where 
>             z1.jz_discriminator = 238
>     ) t1,
>     lm.lab_resource_operatingsystem j1,
>     lm.abstract_instance t3,
>     lm.operating_system_software_install j2,
>     lm.abstract_instance t2,
>     lm.family t5,
>     lm.family t7
> where
>     t1.item_id = j1.jz_parent_id and
>     j1.item_id = t2.item_id and
>     t2.item_id = j2.jz_parent_id and
>     j2.item_id = t3.item_id and
>     t2.family_item_id = t5.item_id and
>     t1.family_item_id = t7.item_id and
>     t3.family_item_id in ('_5VetVWTeEd-Q8aOqWJPEIQ') and
>     t5.root_item_id = '_5ZDlwWTeEd-Q8aOqWJPEIQ' and
>     t7.root_item_id = '_5nN9mmTeEd-Q8aOqWJPEIQ' and
>     t1.visibility = 0;

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message