db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bryan Pendleton (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-4406) Wrong order when using ORDER BY on non-deterministic function
Date Fri, 02 Apr 2010 23:51:27 GMT

    [ https://issues.apache.org/jira/browse/DERBY-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853009#action_12853009

Bryan Pendleton commented on DERBY-4406:

Hi Nirmal,

I think you will find that the handling of random() is fairly tricky,
because I think it is a Derby-specific behavior in which the SQL
statement gets compiled into something which is able to call
java.lang.Math.random() at run time, via some special Derby-Java
glue code.

I think that the object in question may be a "UserType", which is
a class that knows how to invoke user-defined Java code that
has been defined as an extension data type, which is what random() is.

I suspect it may be easier to get an understanding of the data
structures if you first have a close look at simpler situations,
as you have been doing so far.

In particular, I think you could probably work up to the problem like this:

 1) select x from t order by x
 2) select x, x*2 from t order by x*2
 3) select random() from t order by random()

The goal in all 3 cases, I believe, is to enhance Derby so that it
can recognize that the OrderByColumn is referencing an expression
which is equivalent to a ResultColumn that is already present in
the SELECT, and so we don't need to duplicate the processing
for the OrderByColumn at runtime, but can rather arrange to have
the OrderByColumn directly reference the ResultColumn.

Once we have arranged for the OrderByColumn to directly reference
the ResultColumn, the original script (select random() order by random())
will be solved, because at runtime there won't be two random()s,
only one).

However, note that sometimes we *can't* have the OrderByColumn
directly reference the ResultColumn, because we will see legitimate
statements like:

  select name from employee order by id

In such a case, the two columns are distinct and have to be processed separately.

> Wrong order when using ORDER BY on non-deterministic function
> -------------------------------------------------------------
>                 Key: DERBY-4406
>                 URL: https://issues.apache.org/jira/browse/DERBY-4406
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions:,
>            Reporter: Knut Anders Hatlen
>            Assignee: C.S. Nirmal J. Fernando
> If I read the SQL standard correctly, a statement such as "select random() as r from
t order by random()" should be treated as "select random() as r from t order by r". Derby
does however generate a second, hidden random() column by which the rows are ordered.
> ij> create table t(x int);
> 0 rows inserted/updated/deleted
> ij> insert into t values 1,2,3,4,5;
> 5 rows inserted/updated/deleted
> ij> -- wrong result, not ordered by r
> ij> select random() as r from t order by random();
> R                     
> ----------------------
> 0.1285512465366495    
> 0.5116860880915798    
> 0.21060042130229073   
> 0.2506706923680875    
> 0.6378857329935494    
> 5 rows selected
> ij> -- correct result, ordered by r
> ij> select random() as r from t order by r;
> R                     
> ----------------------
> 0.0749025910679918    
> 0.07694931688380491   
> 0.1724114605785414    
> 0.2268758969382877    
> 0.31900450349277965   
> 5 rows selected

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message