db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "A B (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-2998) Add support for ROW_NUMBER() window function
Date Thu, 28 Feb 2008 23:43:52 GMT

    [ https://issues.apache.org/jira/browse/DERBY-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573520#action_12573520
] 

A B commented on DERBY-2998:
----------------------------

> the two last join queries both fail due to failure to classify some of the PRNs as noops.

The code to check for "no-op" ProjectRestrictNodes depends, among other things, on object
equality between two ResultColumn nodes.  See ResultColumnList.nopProjection().  I think the
manipulation that is done in SelectNode.genProjectRestrict() re-arranges the tree and creates
new nodes such that the object equality fails.  And even if the object equality wasn't an
issue, I think re-arragement of the tree would cause the method to return false anyway.  Which
corresponds with your findings on "redundant" result column lists.

I did some tracing for the queries that are still failing with patch 18.  The failure occurs
during code generation, when we try to generate the predicate "x2.c4 = t1.i" as a qualifier
on table T1.  When it comes time to generate the left operand, which points to a table that
is BENEATH the WindowNode that was generated for the ROW_NUMBER(), we can't find the target
result set number for the operand "X2.C4".  In looking at the predicate contents at the point
of failure, I could see that the target result set number was buried down inside of a ResultColumn-to-VirtualColumn
chain rooted at X2.C4.  Again, this matches your findings regarding "redundant" result columns.

I then replaced ROW_NUMBER() with a simple expression ("i+j") and stopped at the same point;
in this case the target result set number is readily available (i.e. it's not buried) and
the query passes.

On a whim, I then removed the following code from SelectNode:

  Code fragment "A":

   /*
    * The this.resultColumns object is shared with any PRNs above this
    * SelectNode in case this is a subquery. The following RCL
    * modifications cause problems in the column mapping of the upper
    * PRN unless we start off by making a copy for ourselves.
    */
    setResultColumns(resultColumns.copyListAndObjects());

When I took that out, the target result set number became readily available for the join queries
and the queries executed without error, which was good.  Of course, the queries returned the
wrong results, and several other ROW_NUMBER() queries which used to work started failing again.
 So I tried to figure out why the code was added in the first place, as the comments didn't
quite make it clear for me.

As far as I can tell, the scenario that prompted the addition of the above code is something
like:

  select * from
    (select row_number() over(), 'HMM' from t1) x(a,c)
  where a > 2

which leads to a (simplified) tree that looks like:

  ... PRN0(a,c) -> SELECT(row_number(), 'HMM') -> ...

In this tree, PRN0's RCL is a different object than SELECT's RCL, and the columns in PRN0's
RCL have VirtualColumnNodes that do not appear in SELECT's RCL. But underneath it all, both
RCLs ultimately point to the same underlying ResultColumn objects.  Thus any changes we make
to the underlying ResultColumns in SELECT's RCL will be "shared" by PRN0, meaning that the
ResultColumns beneath PRN0's RCL will see the changes, as well.

>From what I can gather, the changes that we are talking about here deal with virtual column
ids.  During SelectNode.genProjectRestrict() we remove the WindowFunctionColumns from SELECT's
ResultColumnList and then adjust the virtual column id's accordingly.  So we get:

  ... PRN0(a,c) -> SELECT('HMM') -> ...

In the absence of code fragment "A", the result column for "HMM" now has a virtual column
id of "1" (instead of "2").  The fact that we changed the virtual column id from "2" to "1"
means that the result column beneath "C" in PRN0's ResultColumnList *also* now has a virtual
column id of "1".  That indicates that "C" should pull its value from the 1st column in its
child's RCL. This is where things start to go wrong...

When we finish with SelectNode.genProjectRestrict(), our simplified tree looks like:

  PRN0(a,c)
    -> PRNDUMMY(row_number(), 'HMM')
      -> WindowNode(row_number(), 'HMM')
        -> PRN1('HMM') // this now has virtual column id of "1"
          -> ...

Note that PRNDUMMY will be a "no-op" project restrict and thus will not actually be generated.
 So PRN0's child will end up being WindowNode, which has TWO columns.  But PRN0.C still has
a virtual column id of "1", meaning that it should pull its value from the first column of
PRN0's child.  PRN0's child is WindowNode, and the first column in WindowNode is row_number()--not
'HMM'.  So "C" is now pointing to the wrong place.

At code generation time, when we try to generate the projection columns for PRN0, we'll take
a look at it's virtual column ids and generate the project "mapping" array from those.  In
ProjectRestrictNode we find the following within "generateMinion()":

        // Map the result columns to the source columns
        int[] mapArray = resultColumns.mapSourceColumns();

The method "mapSourceColumns()" simply walks through the RCL and plugs in the virtual column
ids for each of the columns.  For PRN0, since "A" and "C" both have a virtual column id of
"1", we'll get back an array whose contents are "[1, 1]".

That array is then passed to the execution-time ProjectRestrictResultSet and is used for extracting
columns from the underlying result set, which will be a WindowResultSet in this case.  So
we'll get rows from WindowResultSet that have the values (1, 'HMM'), (2, 'HMM'), etc. That
part is fine.  But then the ProjectRestrictResultSet for PRN0 will use the "mapping" array
to pull columns out of each row from WindowResultSet.  Since _both_ elements in the array
are "1", we extract out the 1st column and put it into _both_ places.  So the final rows will
end up as (1, 1), (2, 2), etc. which is wrong.

All of that said, code fragment "A" resolved this problem by cloning SELECT's result column
list and making all modifications on the clone.  So when the virtual column id for "HMM" changes
from "2" to "1", PRN0 doesn't see the change and therefore code generation and execution run
fine.

But as I mentioned at the start of this comment, the cloning leads to other problems--such
as the queries that are still failing with patch 18.  To make a long story short, I tried
various "tweaks" (or perhaps "hacks" is a better word) to get the virtual column ids for PRN0
to remain correct without having to clone SELECT's columns. In the end I couldn't find a clean
way to do so.

But I did find that if I made the following changes, things started working:

1) REMOVE code fragment "A" from SelectNode, as mentioned above.
2) REMOVE the following from SelectNode.java:

    for (int index = 0; index < size; index++) {
        ResultColumn rc = (ResultColumn) originalRCL.elementAt(index);
        /*
         * NOTE: We only care about window function columns that appear
         * here, and not about references into subquerys or similar.
         */
         if (!rc.expressionIsWindowFunction()) {
             windowFunctionRCL.setElementAt(
                prnRSN.getResultColumns().elementAt(srcindex), index);
             srcindex++;
         }
    }

3) REMOVE the following from ResultColumnList:

    /*
     * All expressions are not columns if this RCL contains
     *  - A windowfunction column
     *  - A virtual column node pointing to window function column
     * This check is done as a separate loop to avoid incorrectly start
     * marking any/parts of the base table columns as correlated columns.
     * Marking a VCN pointing into a WindowResultSet as correlated cause
     * resultset reference errors during code generation.
     */
    for (int index = 0; index < size; index++) {
       ResultColumn rc = (ResultColumn) elementAt(index);
       if (rc.isWindowFunction()){
           return false;
        }
    }

4) Change ProjectRestrictNode.java as follows:

 // Map the result columns to the source columns
- int[] mapArray = resultColumns.mapSourceColumns();
+ int[] mapArray =
    resultColumns.containsWindowFunctionResultColumn()
        ? childResult.getResultColumns().mapSourceColumns()
        : resultColumns.mapSourceColumns();

I *think* this change has the effect of saying "ignore PRN0's RCL if it points to a WindowNode,
because PRN0's RCL may not be correct--esp. it may have incorrect virtual column ids".  Thus
instead of using "[1,1]" for the projection mapping, PRN0 will create its mapping from PRNDUMMY,
which has correct virtual column ids of "[1,2]", so everything ends up okay.  At the same
time, we did *not* have to clone the SELECT's ResultColumnList objects, and that appears to
make the join queries run correctly.

With these changes, OLAP tests runs without error and the two join queries I posted previously
also execute without error.  And from the brief looking that I did it seems like the results
are correct, though it wouldn't hurt to double-check that.

All in all, though, I'd have to say this seems pretty hackish, so it may not be a proper solution.
 But it does cause all of the queries posted so far to pass, and it simplifies the changes
by removing the chunks of code mentioned above.  So I figured I'd post my findings anyways.
 I'll leave it up to you to determine if any of it is useful or worth pursing further...

> Add support for ROW_NUMBER() window function
> --------------------------------------------
>
>                 Key: DERBY-2998
>                 URL: https://issues.apache.org/jira/browse/DERBY-2998
>             Project: Derby
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Thomas Nielsen
>            Assignee: Thomas Nielsen
>            Priority: Minor
>         Attachments: d2998-10.diff, d2998-10.stat, d2998-11.diff, d2998-12.diff, d2998-12.stat,
d2998-13.diff, d2998-13.stat, d2998-14.diff, d2998-14.stat, d2998-15.diff, d2998-15.stat,
d2998-16.diff, d2998-16.stat, d2998-17.diff, d2998-17.stat, d2998-18.diff, d2998-18.stat,
d2998-4.diff, d2998-4.stat, d2998-5.diff, d2998-5.stat, d2998-6.diff, d2998-6.stat, d2998-7.diff,
d2998-7.stat, d2998-8.diff, d2998-8.stat, d2998-9-derby.log, d2998-9.diff, d2998-9.stat, d2998-doc-1.diff,
d2998-doc-1.stat, d2998-test.diff, d2998-test.stat, d2998-test2.diff, d2998-test2.stat, d2998-test3.diff,
d2998-test3.stat, d2998-test4.diff, d2998-test4.stat, d2998-test6.diff, d2998-test7.diff,
d2998-test8.diff
>
>
> As part of implementing the overall OLAP Operations features of SQL (DERBY-581), implement
the ROW_NUMBER() window function.
> More information about this feature is available at http://wiki.apache.org/db-derby/OLAPRowNumber

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message