db-derby-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Apache Wiki <wikidi...@apache.org>
Subject [Db-derby Wiki] Update of "OLAPNullOrdering" by BryanPendleton
Date Thu, 19 Jul 2007 20:26:44 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Db-derby Wiki" for change notification.

The following page has been changed by BryanPendleton:

New page:
As part of the ["OLAPOperations"] effort, we'd like to add support for null ordering.

== Rationale ==

In earlier versions of the SQL standard, the ordering of NULL values was implementation-defined.
In Derby, NULL values sort high, and always appear after all non-NULL values in ascending
order and before all non-NULL values in descending order.

The SQL 2003 standard allows the user to explicitly specify how to sort NULL values, by adding
the new NULLS FIRST or NULLS LAST specification in the ORDER BY clause.

== Syntax ==

In the SQL 2003 standard, the {{{<sort specification list>}}} has the following grammar:

<sort specification list> ::=
    <sort specification> [ { <comma> <sort specification> }... ]

<sort specification> ::=
    <sort key> [ <ordering specification> ] [ <null ordering> ]

<sort key> ::= <value expression>

<ordering specification> ::=
  | DESC

<null ordering> ::=

The new element here is the <null ordering> specification, which is optional.

If the <null ordering> is not specified, then an implementation-defined <null ordering>
is used. For Derby, this will continue to be NULLS LAST.

== Current Implementation ==

Currently, Derby always sorts NULL values as greater than non-NULL values.

This logic appears to be located in the various subclasses of {{{org.apache.derby.iapi.types.DataType}}},
such as {{{NumberDataType}}}, {{{SQLDate}}}, and {{{SQLChar}}}.

These classes all implement the {{{compare(DataValueDescriptor other)}}} method, and each
class contains logic similar to this snippet from SQLDate:

 * thisNull otherNull   return
 *  T       T           0   (this == other)
 *  F       T           -1  (this < other)
 *  T       F           1   (this > other)
if (thisNull || otherNull)
    if (!thisNull)      // otherNull must be true
        return -1;
    if (!otherNull)     // thisNull must be true
        return 1;
    return 0;

(As an aside, I wonder if there once was a time when NULL values sorted low in Derby, rather
than sorting HIGH. The reason I think this might have been true is that the JavaDoc for {{{DataValueDescriptor.compare()}}}
is incorrect, and says that null will be treated as '''less''' than all other values, and
also the code comment in NumberDataType.java is backward.)

== Proposed Changes ==

There are several parts to implementing the <null ordering> feature for Derby:

 * The new syntax must be added to the SQL grammar
 * The null ordering specification must be passed around through the compiler data structures
so that it eventually gets passed to the sorter
 * The sorter must be changed so that in addition to having a columnOrderingMap and a columnOrderingAscendingMap,
it also has a columnOrderingNullOrderingMap, where it records the NULLS FIRST or NULLS LAST
specification for each column in the sort.
 * The sorter must pass the user's choice to the DataValueDescriptor.compare() method call.
 * The DataValueDescriptor.compare(DataValueDescriptor other) method must be changed to allow
the null ordering choice to be passed in
 * All of the types which implement the compare method must be changed to use the specified
null ordering.
 * A variety of new test cases must be added to the test suites, and the existing test cases
should all be run to verify that they still pass.

View raw message