db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "A B (JIRA)" <derby-...@db.apache.org>
Subject [jira] Updated: (DERBY-1852) Wrong results: duplicate rows returned for nested UNIONs when they should be eliminated.
Date Thu, 14 Sep 2006 19:23:22 GMT
     [ http://issues.apache.org/jira/browse/DERBY-1852?page=all ]

A B updated DERBY-1852:

    Attachment: d1852_quick_doNOTcommit.diff

Attaching a quick patch to fix this issue.  I stumbled across this problem while investigating
DERBY-1777 and so I was able to tell what the issue is and how to fix it.  The attached patch
fixes the problem and should be correct code-wise, but I haven't added any comments nor explanations,
nor have I run derbyall yet.  So this is NOT to be committed.

But since I'm not sure how high this is on my priority list, I figured I'd post the patch
and if anyone else wants to finalize it by adding comments/explanations and a test case (could
just take the one from the description), feel free!

> Wrong results: duplicate rows returned for nested UNIONs when they should be eliminated.
> ----------------------------------------------------------------------------------------
>                 Key: DERBY-1852
>                 URL: http://issues.apache.org/jira/browse/DERBY-1852
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions:,,,,,
>            Reporter: A B
>         Attachments: d1852_quick_doNOTcommit.diff
> Regarding UNIONs, the Derby documentation says that if the "ALL" keyword is not specified,
the default behavior is to remove duplicate rows from the result.   And for simple cases of
UNIONs, that is indeed the case.  However, there are certain queries with nested UNIONs where
Derby is supposed to remove duplicates from the result set, but does not.
> This is *NOT* a regression.  I have been able to reproduce the behavior in early versions
of 10.1.2 and also back to, which is the oldest client I have.
> To show the problem:
> create table t1 (i int, j int);
> create table t2 (i int, j int);
> insert into t1 values (1, 2), (2, 4), (3, 6), (4, 8), (5, 10);
> insert into t2 values (1, 2), (2, -4), (3, 6), (4, -8), (5, 10);
> insert into t2 values (3, 6), (4, 8), (3, -6), (4, -8);
> -- Following query correctly returns 13 rows; the first union
> -- does not specify "ALL" and should therefore remove
> -- duplicates.
> select * from t1 union select * from t2 union all select * from t1;
> -- But if we do a join with the same query, we end up with
> -- 19 rows instead of 13.  This is because the duplicate entries
> -- are not correctly removed.
> select * from t1 left outer join
>   (select * from t1 union select * from t2 union all select * from t1) x2
> on t1.i = x2.i;
> Just for sanity, I also ran these statements against DB2 and the result was of the last
query was indeed 13 rows, not 19.  So I think this is a case where Derby is returning incorrect

This message is automatically generated by JIRA.
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira


View raw message