db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Knut Anders Hatlen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-4406) Wrong order when using ORDER BY on non-deterministic function
Date Sun, 28 Mar 2010 19:00:27 GMT

    [ https://issues.apache.org/jira/browse/DERBY-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850718#action_12850718
] 

Knut Anders Hatlen commented on DERBY-4406:
-------------------------------------------

My initial analysis of the case "select random() as r from t order by
random()" wasn't quite accurate. Here are some extra details I've
found when studying the SQL standard more closely:

"select random() as r from t order by random()" is not standard
SQL. When I quoted syntax rule 18 above, I omitted these sub-rules:

> 18) If an <order by clause> is specified, then:
>   a) Let OBC be the <order by clause>. Let NSK be the number of <sort
>      specification>s in OBC. For each i between 1 (one) and NSK, let Ki be
>      the <sort key> contained in the i-th <sort specification> in OBC.
>   b) Each Ki shall contain a <column reference> and shall not contain a
>      <subquery> or a <set function specification>.

Since "random()" does not contain a <column reference>, this ORDER BY
clause is not valid syntax. Adding a column reference to the sort key
makes it valid, so I think these statements use valid syntax:

i)  SELECT RANDOM()*A AS R FROM MYTABLE ORDER BY RANDOM()*A
ii) SELECT RANDOM()*A AS R, A FROM MYTABLE ORDER BY RANDOM()*A

For statement (i), the sort key contains a column reference to a
column that is not a column of T (column A), so the above quoted
syntax rule 18)d)i) comes into play. The results of that statement
should therefore be sorted on R.

For statement (ii), the opposite is true. Since column A is included
in the select list, the sort key does not contain a column reference
to a column that is not a column of T. Rule 18)d)i) does therefore not
apply, and the sort key should not be replaced by R. This, I think,
means the results of this statement should not be sorted on R.

At least, that's how I interpret the wording in the standard. I have
no idea why it has this apparent asymmetry. Others may have a
different interpretation, though.

> Wrong order when using ORDER BY on non-deterministic function
> -------------------------------------------------------------
>
>                 Key: DERBY-4406
>                 URL: https://issues.apache.org/jira/browse/DERBY-4406
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.5.3.0, 10.6.0.0
>            Reporter: Knut Anders Hatlen
>            Assignee: C.S. Nirmal J. Fernando
>
> If I read the SQL standard correctly, a statement such as "select random() as r from
t order by random()" should be treated as "select random() as r from t order by r". Derby
does however generate a second, hidden random() column by which the rows are ordered.
> ij> create table t(x int);
> 0 rows inserted/updated/deleted
> ij> insert into t values 1,2,3,4,5;
> 5 rows inserted/updated/deleted
> ij> -- wrong result, not ordered by r
> ij> select random() as r from t order by random();
> R                     
> ----------------------
> 0.1285512465366495    
> 0.5116860880915798    
> 0.21060042130229073   
> 0.2506706923680875    
> 0.6378857329935494    
> 5 rows selected
> ij> -- correct result, ordered by r
> ij> select random() as r from t order by r;
> R                     
> ----------------------
> 0.0749025910679918    
> 0.07694931688380491   
> 0.1724114605785414    
> 0.2268758969382877    
> 0.31900450349277965   
> 5 rows selected

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message