db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mamta A. Satoor (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-4331) Join returns results in wrong order
Date Thu, 06 Aug 2009 16:52:15 GMT

    [ https://issues.apache.org/jira/browse/DERBY-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740149#action_12740149
] 

Mamta A. Satoor commented on DERBY-4331:
----------------------------------------

The bug behind the query below is as follows
SELECT CS.ID 
FROM --DERBY-PROPERTIES joinOrder=FIXED 
    FILES F --DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH 
    , FILECHANGES FC --DERBY-PROPERTIES constraint=FILECHANGES_FILE_CHANGESET 
    , CHANGESETS CS --DERBY-PROPERTIES constraint=CHANGESETS_PRIMARY_ID 
WHERE 
 CS.ID = FC.CHANGESET AND 
 F.ID = FC.FILE 
 ORDER BY CS.ID DESC; 

Optimizer while considering any join order collects the sorting provided by that join order.
For instance, for the query above, optimizer will first start with [F, -1, -1] as the join
order and for this join order,  it knows that it can guarantee row ordering on FILES.REPOSITORY
AND FILE.PATH. Next in the join order will be [F, FC, -1] and now available sorting is F.REPOSITORY,
F.PATH, FC.FILE and FC.CHANGESET. The important thing to notice here is that the sorting available
at any point is in the order collected by optimizer and this piece of information was not
being considered by the code removed by Mike from FromBaseTable. This behavior can be seen
when optimizer is considering the join order [F, FC, CS]. While considering this join order,
the removed code from FromBaseTable kicks in for the predicate CS.ID=FC.CHANGESET. The removed
code looks at equijoin CS.ID=FC.CHANGESET and it looks at the available ordering so far and
sees FC.CHANGESET in there and incorrectly concludes that required sorting is available on
CS.ID indirectly through FC.CHANGSET. It fails to take into account that FC.CHANGSET is in
the 4th position in the ordering. The rows returned from [F, FC, -1] are ordered F.REPOSITORY,
F.PATH, FC.FILE and FC.CHANGSET and not just FC.CHANGESET. Because the code incorrectly decides
that required sorting is available, it decides to avoid the sorting for the given query. 
The result of the query above without Mike's changes are as follows
ID 
----------- 
1 
2 
3 
2 
2 
3 
As we can see from above that FC.CHANGSET is sorted for every row selected from the outermost
table F. 
a)The first row in the result above is for the first row qualified from F
b)The next 2 rows in the result above are for the second qualified row from F
c) the 4th row in the result above is for the 3rd qualified row from F 
d)and last 2 rows in the result above are for the 4th qualified row from F. 

If the outermost optimizable F was only one row resultset, we could safely assume that the
required ordering is provided by indexes being considered for [F, FC, -1] but that is not
the case. 

In future, if we ever decide to enhance sort avoidance code to qualify more queries for sort
avoidance, we should consider the position of the sorted columns in available sorting list
of columns and then decide if the query is pre-ordered on required ordering.

> Join returns results in wrong order
> -----------------------------------
>
>                 Key: DERBY-4331
>                 URL: https://issues.apache.org/jira/browse/DERBY-4331
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.5.2.0, 10.6.0.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Mike Matrigali
>         Attachments: createDBsimpler.txt, createDBsimplerVer2.txt, DERBY4331_additionalTests_diff.txt,
derby4331_do_not_commit.diff, derby_4331_patch_2.diff, notorderby4331.zip, orderby4331.zip,
repro.sql, repro2.sql, repro2_qryplan.txt, wisconsin.diff
>
>
> In Derby 10.5.2.0, the query below started returning results in wrong order (note the
ORDER BY clause). Derby 10.5.1.1 gives the expected ordering.
> ij> SELECT CS.ID
> FROM
>     CHANGESETS CS, FILECHANGES FC, REPOSITORIES R, FILES F, AUTHORS A
> WHERE
>     R.PATH = '/var/tmp/source5923202038296723704opengrok/mercurial' AND F.REPOSITORY
= R.ID AND A.REPOSITORY = R.ID AND
>     CS.REPOSITORY = R.ID AND CS.ID = FC.CHANGESET AND F.ID = FC.FILE AND
>     A.ID = CS.AUTHOR AND
>     EXISTS (
>         SELECT 1
>         FROM FILES F2
>         WHERE
>             F2.ID = FC.FILE AND F2.REPOSITORY = R.ID AND
>             F2.PATH LIKE '/%' ESCAPE '#')
> ORDER BY CS.ID DESC;
> ID         
> -----------
> 1          
> 2          
> 3          
> 2          
> 2          
> 3          
> 6 rows selected

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message