db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "A B (JIRA)" <j...@apache.org>
Subject [jira] Updated: (DERBY-1852) Wrong results: duplicate rows returned for nested UNIONs when they should be eliminated.
Date Mon, 26 Mar 2007 21:41:32 GMT

     [ https://issues.apache.org/jira/browse/DERBY-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

A B updated DERBY-1852:

    Attachment: d1852_v1.stat

Attaching d1852_engine_v1.patch, which has the same code changes as the original "quick" patch
except that I've added explanatory comments to the relevant files and I've run both derbyall
and suites.All with the changes (Red Hat Linux, ibm142) with no failures.

I'm also attaching d1852_tests_v1.patch, which contains two things:

  1. Some new test cases added to lang/union.sql, along with the corresponding master update.

  2. Master update for lang/predicatePushdown.out.  This turned out to be a
     pretty large diff (over 2000 lines) because the fix for this issue leads
     to a difference in the query plans generated for most of the queries.
     The difference is that we are now generating a SortResultSet where we
     didn't used to.  The reason is that Derby inserts an implicit DISTINCT
     node above all UNIONs with no "ALL", but prior to the fix for this issue
     the DistinctNode was effectively being ignored. Now that we correctly
     enforce the Distinct we are seeing a sort result set in the query
     plans, which is to be expected (we have to sort in order to eliminate

The changes are pretty small and my hope is that the code comments give ample explanation.
As I said, derbyall and suites.All both ran cleanly with these changes, so I think these changes
are ready for commit.  Review comments would be appreciated.

> Wrong results: duplicate rows returned for nested UNIONs when they should be eliminated.
> ----------------------------------------------------------------------------------------
>                 Key: DERBY-1852
>                 URL: https://issues.apache.org/jira/browse/DERBY-1852
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions:,,,,
>            Reporter: A B
>         Assigned To: A B
>         Attachments: d1852_engine_v1.patch, d1852_quick_doNOTcommit.diff, d1852_tests_v1.patch,
> Regarding UNIONs, the Derby documentation says that if the "ALL" keyword is not specified,
the default behavior is to remove duplicate rows from the result.   And for simple cases of
UNIONs, that is indeed the case.  However, there are certain queries with nested UNIONs where
Derby is supposed to remove duplicates from the result set, but does not.
> This is *NOT* a regression.  I have been able to reproduce the behavior in early versions
of 10.1.2 and also back to, which is the oldest client I have.
> To show the problem:
> create table t1 (i int, j int);
> create table t2 (i int, j int);
> insert into t1 values (1, 2), (2, 4), (3, 6), (4, 8), (5, 10);
> insert into t2 values (1, 2), (2, -4), (3, 6), (4, -8), (5, 10);
> insert into t2 values (3, 6), (4, 8), (3, -6), (4, -8);
> -- Following query correctly returns 13 rows; the first union
> -- does not specify "ALL" and should therefore remove
> -- duplicates.
> select * from t1 union select * from t2 union all select * from t1;
> -- But if we do a join with the same query, we end up with
> -- 19 rows instead of 13.  This is because the duplicate entries
> -- are not correctly removed.
> select * from t1 left outer join
>   (select * from t1 union select * from t2 union all select * from t1) x2
> on t1.i = x2.i;
> Just for sanity, I also ran these statements against DB2 and the result was of the last
query was indeed 13 rows, not 19.  So I think this is a case where Derby is returning incorrect

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message