db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jack Klebanoff <klebanoff-de...@sbcglobal.net>
Subject Re: About improvement of DERBY-134
Date Sat, 12 Mar 2005 00:34:45 GMT
Jeremy Boynes wrote:

> Jack Klebanoff wrote:
>> 1. sqlgrammar.jj. I think that creating a new method,
>> isNonReservedKeyword() to determine whether a token is a non-reserved
>> keyword or not, is a maintenance problem. Whenever we add a new
>> non-reserved keyword we must add it to the list of tokens, to
>> nonReservedKeyword(), and now to isNonReservedKeyword(). Having to add
>> it in two places is difficult enough to discover or remember. If we need
>> isNonReservedKeyword then we should find a way of combining
>> nonReservedKeyword and isNonReservedKeyword so that only one of them
>> keeps the list of non-reserved key words.
>> It is not necessary for the parser to recognize 3 cases of ORDER BY sort
>> key type. A column name is just one kind of <expression>. If the parser
>> treats it as an expression we should still get the right ordering. I
>> think that it would better if the parser did so. The OrderByColumn class
>> can special case a simple column reference expression, as an
>> optimization. This considerably simplifies parsing sort keys.
>> The only sort key type that has to be handled specially is that of an
>> integer constant. That specifies one of the select list columns as the
>> sort key. This case can be recognized in the parser, as is done in the
>> patch, or it can be recognized in OrderByColumn. In this alternative the
>> parser always creates OrderByColumn nodes with the sort key given by an
>> expression (a ValueNode). At bind time OrderByColumn can determine
>> whether the sort key expression is an integer constant, and if so treat
>> it as a column position.
>> The two alternatives differ in the way that they treat constant integer
>> expressions like "ORDER BY 2-1". The patch orders the rows by the
>> constant 1, which is not usefull. With the patch "ORDER BY 2-1 ASC" and
>> "ORDER BY 2-1 DESC" produce the same ordering. If OrderByColumn treated
>> an integer constant sort key expression as a result column position then
>> "ORDER BY 2-1" would cause the rows to be ordered on the first result
>> column, which I think is more usefull.
> From the SQL spec, the <sort-key> is simply a value expression 
> evaluated for each row. Hence "ORDER BY 2-1" means that all rows are 
> peers and the ordering is implementation-dependent.
> I think allowing the value-expressions to evaluate to column position 
> is potentially confusing. For example, suppose you have "ORDER BY ?" - 
> do we order by the supplied value (which would be the same for all 
> rows making the actual ordering implementation-dependent) or do we 
> order by the column position (with the potential need to re-select 
> indexes used to assist in ordering).
> I believe it is simple and SQL-compliant to have very simple rules here:
> 1) if all <sort-key>s are integer literals, then sort by the column
>    position they imply. This is SQL-compliant as the comparison criteria
>    for each row will be the same, making all rows peers and the actual
>    order implementation dependent (we just define ours as using the
>    appropriate column value).
> 2) if any <sort-key> is not an integer literal then we treat all of them
>    as <value expression> and compare rows accordingly.
> -- 
> Jeremy
I think that most people would be surprised to find that "ORDER BY 1 + 
1" behaves differently than "ORDER BY 2". I also think that it is 
simpler for us to treat all integer constants the same. It is hard to 
get the parser to distinguish between a literal constant and a complex 
expression that starts with an integer literal. That is why Tomohito 
Nakayama needed to use complex lookahead in his parser change.

I think that it is better to distinguish between integer constants and 
other expressions in the OrderByColumn class. ValueNodes have 
isConstantExpression(), getTypeID(), and getConstantValueAsObject() 
methods that OrderByColumn can use to see if the expression is an 
integer constant and, if so, what the value is. OrderByColumn can test 
whether the order by expression is an instanceOf ColumnReference or 
ResultColumn to special case a simple column reference. This simplifies 
the parser significantly and causes us to treat "ORDER BY 1 + 1" the 
same as "ORDER BY 2".

I don't feel strongly enough about this to block a patch that is well 
tested and otherwise works correctly.

Jack Klebanoff

View raw message