db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Knut Anders Hatlen (JIRA)" <derby-...@db.apache.org>
Subject [jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Date Fri, 26 Aug 2005 11:40:41 GMT
     [ http://issues.apache.org/jira/browse/DERBY-504?page=all ]

Knut Anders Hatlen updated DERBY-504:
-------------------------------------

    Attachment: DERBY-504.diff

I have attached a patch which fixes the optimization bug by checking that a subquery has the
same columns as the top-level query before pushing the duplicate elimination into the subquery.
This patch supersedes the previously submitted patch.

This patch does not remove the optimization of SELECT DISTINCT in other cases than those that
were incorrectly optimized in the original Derby code.

I have run derbyall successfully with the exception of two tests:
   - lang/groupBy.sql fails because of DERBY-519
   - store/encryptionKey.sql fails, but I have seen that others have had trouble with this
test too, so I believe it is not related to my patch

I will submit tests for this bug later.

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>  Attachments: DERBY-504.diff
>
> When one performs a select distinct on a table generated by a subselect, there sometimes
are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from
the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


Mime
View raw message